Solved: Determine if the PDF is a Scanned Image or Native PDF



Onetrack
12-17-2012, 08:55 AM
I'm trying to determine whether the webpage loaded into Internet Explorer is a scanned image or a native PDF.

A native PDF can be parsed; a scanned one cannot.

The URL I am referencing is not a PDF file per se, but the page returned to IE contains an embedded object with type="application/pdf".


I'd like to skip pages of this type. However, if the URL points to an actual .pdf file, then I'd like to not skip that file.

------
ie.navigate URL

' Wait until the page has finished loading
Do
    DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
ie.Visible = IE_visible

Set ieDoc = ie.document
For Each element In ieDoc.all ' Code fails here for scanned pdfs
...

thanks...

Onetrack
12-17-2012, 10:05 AM
I think the required logic is:

If TypeName(ieDoc) = "HTMLDocument" then...

For the embedded PDF files, TypeName(ieDoc) returns "AcroPDF" instead, so the test fails and those pages can be skipped.
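
As a rough sketch of how that check could be wired into the loop from the first post: the helper name IsHtmlPage and the sub ParseIfHtml are made up for illustration, and the "AcroPDF" behaviour is only what was observed above with the Adobe plug-in, so other PDF viewers may report a different type name.

```vba
' Hypothetical helper (not from the original post): returns True only when
' IE is currently showing a regular HTML page that can be walked with .all.
Function IsHtmlPage(ByVal ie As Object) As Boolean
    Dim doc As Object
    Set doc = ie.document
    ' TypeName is a built-in VBA function. Per the post above it gives
    ' "HTMLDocument" for web pages and "AcroPDF" for embedded PDFs.
    IsHtmlPage = (TypeName(doc) = "HTMLDocument")
End Function

' Hypothetical caller: assumes ie has already finished loading the page.
Sub ParseIfHtml(ByVal ie As Object)
    Dim element As Object
    If IsHtmlPage(ie) Then
        For Each element In ie.document.all
            ' Parsing logic goes here (elided in the original post).
        Next element
    Else
        ' Embedded PDF (or some other non-HTML document): skip it.
        Debug.Print "Skipped, document type: " & TypeName(ie.document)
    End If
End Sub
```

Comparing against "HTMLDocument" rather than "AcroPDF" means anything that is not plain HTML gets skipped, which seems the safer default when the goal is only to avoid the failing For Each loop.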