OCR by MODI via VBA Macro in Excel / Runtime Error -2147417848 (80010108)



ReneDarc
03-11-2015, 06:55 AM
Dear Helpers,

Nice to be in this forum. I appreciate it and hope you can help me.

I'm using Microsoft Office 2010 and want to write a macro in Excel that extracts text from a PDF via MODI (Microsoft Office Document Imaging).
Later I want to get the text into a worksheet, but that's a different matter.

I have this code:



Sub TestWords()

    Dim miDoc As MODI.Document
    Dim miWord As MODI.Word
    Dim strWordInfo As String

    ' Create the document object, then load an existing TIFF file.
    Set miDoc = New MODI.Document
    miDoc.Create "C:\Users\gast_v\Desktop\test2.tif"

    ' Perform OCR.
    miDoc.Images(0).OCR

    ' Retrieve and display word information.
    Set miWord = miDoc.Images(0).Layout.Words(2)
    strWordInfo = _
        "Id: " & miWord.ID & vbCrLf & _
        "Line Id: " & miWord.LineId & vbCrLf & _
        "Region Id: " & miWord.RegionId & vbCrLf & _
        "Font Id: " & miWord.FontId & vbCrLf & _
        "Recognition confidence: " & _
        miWord.RecognitionConfidence & vbCrLf & _
        "Text: " & miWord.Text
    MsgBox strWordInfo, vbInformation + vbOKOnly, _
        "Word Information"

    Set miWord = Nothing
    Set miDoc = Nothing

End Sub


The error message is: "Runtime Error -2147417848 (80010108) / The Method 'OCR' for the Object 'IImage' has failed", and the debugger highlights the line "miDoc.Images(0).OCR".

The only suggestion I found is to change from "late binding" to "early binding", but I think this already is "early binding", isn't it?

MODI itself is working. I also referenced MODI in Excel and registered "MDIVWCTL.DLL".

I googled around a lot and searched many forums but couldn't find a solution.

I would be glad if anybody can help.

Thanks and regards,

Rene

mancubus
03-11-2015, 07:20 AM
Welcome to VBAX.

Even though the code you posted is from MSDN, I don't know why it fails and I don't have a solution.

The "unsolved" thread below may give you an idea:
http://www.vbaexpress.com/forum/showthread.php?44308-MODI-Document-Images(0)-OCR-fails-for-Office-2010-(on-Windows-7)

or
http://www.office-forums.com/threads/modi-ocr.2158826/

mancubus
03-11-2015, 07:41 AM
perhaps you have seen these...


OCR Method

Performs optical character recognition (OCR) on the specified document or image.

expression.OCR(LangId, OCROrientImage, OCRStraightenImage)

expression Required. An expression that returns a Document object or an Image object.
LangId Optional MiLANGUAGES. The language to use when performing OCR. Default is miLANG_SYSDEFAULT.

LangId can be one of the following MiLANGUAGES constants.

miLANG_CHINESE_SIMPLIFIED (2052, &H804)
miLANG_CHINESE_TRADITIONAL (1028, &H404)
miLANG_CZECH (5)
miLANG_DANISH (6)
miLANG_DUTCH (19, &H13)
miLANG_ENGLISH (9)
miLANG_FINNISH (11)
miLANG_FRENCH (12)
miLANG_GERMAN (7)
miLANG_GREEK (8)
miLANG_HUNGARIAN (14)
miLANG_ITALIAN (16, &H10)
miLANG_JAPANESE (17, &H11)
miLANG_KOREAN (18, &H12)
miLANG_NORWEGIAN (20, &H14)
miLANG_POLISH (21, &H15)
miLANG_PORTUGUESE (22, &H16)
miLANG_RUSSIAN (25, &H19)
miLANG_SPANISH (10)
miLANG_SWEDISH (29, &H1D)
miLANG_SYSDEFAULT (2048, &H800)
miLANG_TURKISH (31, &H1F)

OCROrientImage Optional Boolean. Specifies whether the OCR engine attempts to determine the orientation of the page. Default is true.
OCRStraightenImage Optional Boolean. Specifies whether the OCR engine attempts to "de-skew" the page to correct for small angles of misalignment from the vertical. Default is true.


Remarks

The OCR engine always defaults to the user's regional settings for the LangID argument, unless you specify the language explicitly when calling the OCR method; it does not retain the previously used setting. In a mixed-language environment, it is a good practice to specify the LangID argument explicitly in every call to the OCR method.

ReneDarc
03-11-2015, 08:31 AM
Thanks for the replies.

I had already checked the linked thread; it hasn't helped me so far :(

As for the language argument, I had tried that before, like this:



Sub TestWords()

    Dim miDoc As MODI.Document
    Dim miWord As MODI.Word
    Dim strWordInfo As String

    ' Load an existing TIFF file.
    Set miDoc = New MODI.Document
    miDoc.Create "C:\Users\gast_v\Desktop\test2.tif"

    ' Perform OCR.
    miDoc.Images(0).OCR miLANG_GERMAN, True, True

    ' Retrieve and display word information.
    Set miWord = miDoc.Images(0).Layout.Words(2)
    strWordInfo = _
        "Id: " & miWord.ID & vbCrLf & _
        "Line Id: " & miWord.LineId & vbCrLf & _
        "Region Id: " & miWord.RegionId & vbCrLf & _
        "Font Id: " & miWord.FontId & vbCrLf & _
        "Recognition confidence: " & _
        miWord.RecognitionConfidence & vbCrLf & _
        "Text: " & miWord.Text
    MsgBox strWordInfo, vbInformation + vbOKOnly, _
        "Word Information"

    Set miWord = Nothing
    Set miDoc = Nothing

End Sub


But sadly that didn't change anything. I found out that the MODI printer doesn't work on Windows 7 64-bit. Maybe that's the reason? I also can't open "File Import Properties".

ReneDarc
03-11-2015, 08:56 AM
I mean "File Import Preferences", of course ;) I also didn't find any threads about that issue.

The thread linked earlier names CreateObject as a solution. I tried that and it doesn't work either. I would also think this is the "late binding" mentioned there, even though I don't think late versus early binding is the actual issue.


Sub TestWords()

    Dim miDoc As Object
    Dim miWord As MODI.Word
    Dim strWordInfo As String

    ' Load an existing TIFF file.
    Set miDoc = CreateObject("MODI.Document")
    miDoc.Create "C:\Users\gast_v\Desktop\test2.tif"

    ' Perform OCR.
    miDoc.Images(0).OCR miLANG_GERMAN, True, True

    ' Retrieve and display word information.
    Set miWord = miDoc.Images(0).Layout.Words(2)
    strWordInfo = _
        "Id: " & miWord.ID & vbCrLf & _
        "Line Id: " & miWord.LineId & vbCrLf & _
        "Region Id: " & miWord.RegionId & vbCrLf & _
        "Font Id: " & miWord.FontId & vbCrLf & _
        "Recognition confidence: " & _
        miWord.RecognitionConfidence & vbCrLf & _
        "Text: " & miWord.Text
    MsgBox strWordInfo, vbInformation + vbOKOnly, _
        "Word Information"

    Set miWord = Nothing
    Set miDoc = Nothing

End Sub
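
A side note on the binding question: a listing that creates the document through CreateObject but still declares a variable As MODI.Word and uses the miLANG_GERMAN constant is only partly late bound, because both of those names are resolved from the referenced MODI type library. A fully late-bound variant could look like the minimal sketch below. The procedure name is made up for illustration, the value 7 is taken from the MiLANGUAGES list quoted earlier in this thread, and nothing in this thread indicates that this variant avoids the OCR error.

```vba
Sub TestWordsLateBound()

    ' Declared locally so that no MODI reference is needed.
    ' 7 is the value listed for miLANG_GERMAN earlier in this thread.
    Const miLANG_GERMAN As Long = 7

    Dim miDoc As Object     ' late bound: no MODI types in the declarations
    Dim miWord As Object

    ' Create the document through its ProgID and load the TIFF file.
    Set miDoc = CreateObject("MODI.Document")
    miDoc.Create "C:\Users\gast_v\Desktop\test2.tif"

    ' Perform OCR with an explicit language.
    miDoc.Images(0).OCR miLANG_GERMAN, True, True

    ' Show the text of the word at index 2.
    Set miWord = miDoc.Images(0).Layout.Words(2)
    MsgBox "Text: " & miWord.Text, vbInformation + vbOKOnly, "Word Information"

    Set miWord = Nothing
    Set miDoc = Nothing

End Sub
```

With every variable declared As Object, the procedure compiles even when the MODI reference is unticked, which makes it easier to tell a reference problem apart from a failure inside the OCR call itself.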

snb
03-11-2015, 10:21 AM
Can you please refrain from quoting the same code over and over again?

ReneDarc
03-11-2015, 10:39 AM
Well, there were small changes. I just wanted to show what I changed in case I forgot something.

kkonfusion
07-28-2015, 06:26 AM
Hi

This is my first post in any VBA forum so forgive me if I have missed any convention.

I have been looking into the same issue for a few weeks now and, in my situation, the problem is associated with using MODI from an Excel version later than 2007. MODI OCR does not seem to work through VBA in the newer Office versions, even though many other parts of MODI work.

As a workaround, what worked for me was to run the OCR in a shell invoked from VBA, which inserts the OCR text into the TIF file, and then read the file back in using MODI in VBA to obtain the text. One complication is that obtaining the text this way only captures the last page of a multipage TIFF. The workaround I found for that was to split the TIFF file using MODI in VBA when it has more than one page and OCR each page in turn.
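
The split and read-back parts of such a workaround might be sketched as follows. This is an untested outline under several assumptions: the function names and the paths in the usage note are hypothetical, the MODI calls (Create without a file name, Images.Add, SaveAs, Close, Layout.Text) are written as recalled from the MODI object model and should be checked in the Object Browser, and the external OCR step is left out because the post does not say which program was run in the shell.

```vba
Option Explicit

' Sketch only. Requires a reference to the MODI type library.

' Writes each page of a multipage TIFF to its own single-page TIFF file
' (outStem & page number & ".tif") and returns the number of files written.
Function SplitTiffPages(ByVal srcPath As String, ByVal outStem As String) As Long
    Dim srcDoc As MODI.Document
    Dim pageDoc As MODI.Document
    Dim i As Long

    Set srcDoc = New MODI.Document
    srcDoc.Create srcPath

    For i = 0 To srcDoc.Images.Count - 1
        ' Start an empty document, copy page i into it, save it on its own.
        Set pageDoc = New MODI.Document
        pageDoc.Create
        pageDoc.Images.Add srcDoc.Images(i), Nothing
        pageDoc.SaveAs outStem & (i + 1) & ".tif", miFILE_FORMAT_TIFF_LOSSLESS
        pageDoc.Close False
        Set pageDoc = Nothing
    Next i

    SplitTiffPages = srcDoc.Images.Count
    srcDoc.Close False
    Set srcDoc = Nothing
End Function

' Returns the text stored in a single-page TIFF. Assumes the external OCR
' step has already run on the file; without stored OCR results, reading
' the Layout property is expected to raise an error.
Function ReadStoredText(ByVal pagePath As String) As String
    Dim pageDoc As MODI.Document

    Set pageDoc = New MODI.Document
    pageDoc.Create pagePath
    ReadStoredText = pageDoc.Images(0).Layout.Text
    pageDoc.Close False
    Set pageDoc = Nothing
End Function
```

A caller would first run something like SplitTiffPages("C:\Temp\scan.tif", "C:\Temp\scan_page_"), then OCR each resulting file externally, and finally call ReadStoredText on each file in page order and concatenate the results.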

Rgds
KK