Excellent! Tested on several docs and working perfectly - will also be very helpful in other situations. Really appreciate your help.

One question: is there a reasonably easy way to extract the date from the "Document Reference:" line rather than the document number?

For example, your code currently extracts "75680398987" (note: the number of digits in this number will vary) from the line below. Could it be modified to extract "2004-06-15" instead?

Document Reference: 75680398987 2004-06-15 12:27:39
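
One possible approach, sketched in Python purely for illustration: it assumes the line always has the form "Document Reference:", then a run of digits of any length, then a date in YYYY-MM-DD form. The function name `extract_reference_date` and the pattern are hypothetical and are not taken from the code discussed above.

```python
import re

# Assumed layout: label, a digit run of any length, then a YYYY-MM-DD date.
# The date is captured; the document number and the time are ignored.
DATE_IN_REFERENCE = re.compile(
    r"Document Reference:\s*\d+\s+(\d{4}-\d{2}-\d{2})"
)


def extract_reference_date(text):
    """Return the date from the first 'Document Reference:' line, or None."""
    match = DATE_IN_REFERENCE.search(text)
    return match.group(1) if match else None


line = "Document Reference: 75680398987 2004-06-15 12:27:39"
print(extract_reference_date(line))  # prints 2004-06-15
```

Because the document number is matched with `\d+`, the varying number of digits should not matter as long as whitespace separates it from the date.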