Thread: Acrobat 8 - Export to Text

  1. #1
    VBAX Master stanl's Avatar
    Joined
    Jan 2005
    Posts
    1,141
    Location

    We are experimenting with eFax, where workorders are faxed and arrive as PDFs rather than being mailed and then scanned to PDF.

    Often several workorders are faxed in a batch. Each can be 2-3 pages, which include the workorder, the signed contract, and perhaps an additional warranty.

    We set up an 800 number which routes to an email address, and I have no problem opening the emails, extracting the headers/body into fields in an Access table, then using an ADO Stream to place the PDF as a binary object in another field for later export/viewing.
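
    For readers following along, the export step mentioned above can be sketched roughly as follows. This is only an illustration under assumptions: the table name tblFax, the fields PdfBlob and FaxID, and the procedure name ExportPdfBlob are hypothetical and not from the original post, and the code assumes it runs inside Access (it uses CurrentProject.Connection).

    [vba]' Sketch only: tblFax, PdfBlob and FaxID are hypothetical names
    Const adTypeBinary = 1          ' ADODB.StreamTypeEnum
    Const adSaveCreateOverWrite = 2 ' ADODB.SaveOptionsEnum

    Sub ExportPdfBlob(ByVal FaxID As Long, ByVal OutPath As String)
        Dim rs As Object  ' ADODB.Recordset (late bound)
        Dim stm As Object ' ADODB.Stream (late bound)

        Set rs = CreateObject("ADODB.Recordset")
        rs.Open "SELECT PdfBlob FROM tblFax WHERE FaxID = " & FaxID, _
                CurrentProject.Connection

        If Not rs.EOF Then
            Set stm = CreateObject("ADODB.Stream")
            stm.Type = adTypeBinary
            stm.Open
            stm.Write rs.Fields("PdfBlob").Value ' byte array from the binary field
            stm.SaveToFile OutPath, adSaveCreateOverWrite
            stm.Close
        End If

        rs.Close
    End Sub[/vba]

    The saved file can then be opened in Acrobat like any other PDF.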

    If I open an exported PDF in Acrobat 8, there is an option to export to text, and even though the entire PDF is an image, the OCR functionality of Acrobat gives me enough to interpret the page # and workorder #, so I can parse that data as additional field info.

    I have used VBA with Acrobat up to version 6.0, so I can code the creation and loading of the PDF:

    [vba] 'Initialize Acrobat by creating App object
    Set gApp = CreateObject("AcroExch.App")
    gApp.Hide

    'Set AVDoc object
    Set gAvDoc = CreateObject("AcroExch.AVDoc")
    [/vba]

    I am assuming I need

    gApp.MenuItemExecute("[Something]")

    but I can't figure out what [Something] is. Sure would appreciate a code snippet to get any readable text into a variable.

    TIA Stan

  2. #2
    I haven't used Acrobat from VBA, but if you can get an image file of your PDF you may be able to use Microsoft Office Document Imaging (MODI) to OCR your text...not sure if it's as good as Acrobat's though...

    This is a function I use to OCR an image file...

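    [VBA]Function GetOCRText(TheFile As String) As String
        On Error GoTo PROC_ERR

        If TheFile = "" Then Exit Function

        Dim MyDoc As Object ' MODI.Document (late bound)
        Dim MyLayout As Object ' MODI.Layout
        Dim TheWord As Object ' MODI.Word
        Dim Result As String

        Set MyDoc = CreateObject("MODI.Document")
        MyDoc.Create TheFile
        MyDoc.Images(0).OCR ' OCR the first page only
        Set MyLayout = MyDoc.Images(0).Layout

        ' Join the recognized words with single spaces
        For Each TheWord In MyLayout.Words
            Result = Result & " " & TheWord.Text
        Next TheWord
        Result = Result & vbCrLf & vbCrLf

        GetOCRText = Result

    PROC_EXIT:
        ' Release the document whether or not the OCR succeeded
        On Error Resume Next
        Set MyLayout = Nothing
        If Not MyDoc Is Nothing Then MyDoc.Close False
        Set MyDoc = Nothing
        Exit Function

    PROC_ERR:
        ' On any error the function returns an empty string
        Resume PROC_EXIT
    End Function[/VBA]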
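
    Since a faxed batch can run to several pages, a variation on the function above that loops over every page image might look like the sketch below. The name GetOCRTextAllPages is hypothetical, and the code assumes MODI is installed and that the input is a multi-page .tif; treat it as a starting point rather than a tested solution.

    [vba]Function GetOCRTextAllPages(TheFile As String) As String
        Dim MyDoc As Object  ' MODI.Document (late bound)
        Dim TheWord As Object ' MODI.Word
        Dim Result As String
        Dim i As Long

        If TheFile = "" Then Exit Function

        On Error GoTo PROC_ERR
        Set MyDoc = CreateObject("MODI.Document")
        MyDoc.Create TheFile

        ' The Images collection is zero-based, one entry per page
        For i = 0 To MyDoc.Images.Count - 1
            MyDoc.Images(i).OCR
            For Each TheWord In MyDoc.Images(i).Layout.Words
                Result = Result & " " & TheWord.Text
            Next TheWord
            Result = Result & vbCrLf & vbCrLf ' blank line between pages
        Next i

        GetOCRTextAllPages = Result

    PROC_EXIT:
        ' Release the document whether or not the OCR succeeded
        On Error Resume Next
        If Not MyDoc Is Nothing Then MyDoc.Close False
        Set MyDoc = Nothing
        Exit Function

    PROC_ERR:
        Resume PROC_EXIT
    End Function[/VBA]

    As written, an OCR error on any page stops the loop and the function returns an empty string. If partial results are more useful, the error handling could be moved inside the loop instead.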

  3. #3
    VBAX Master stanl's Avatar
    Ah! and - therein lies the rub. I tried this in Vista w/Office 2007 installed and there is no MODI - does it come as a separate download?

  4. #4
    MODI should come with Office 2003 and 2007, but it may not be installed by default, so you may need to add it through Office setup.

    This is how to check in Office 2003; hopefully it's similar in 2007:

    Go to Add/Remove Programs, select Microsoft Office 200x, and click Change. Then select "Add/Remove Features" and make sure "Choose advanced customization of applications" is checked on the page listing the Office applications. You'll then be presented with a treeview of the different items, and MODI is under "Office Tools".

  5. #5
    VBAX Master stanl's Avatar
    Thanks; works great, but it only accepts .tif or .mdi files. I think we can change the eFax setting to attach as .tif, and this would be an easy solution.

    Stan
