JohnnyBravo
07-12-2006, 10:16 AM
OK, rather than continue from my last thread, I thought I would start a new one since this deals with a slightly different task. I've got several resumes which have been scanned into MS Word and saved in *.rtf format.
Some people have 2, 3 or more pages to their resume, so I've been trying to Google my way into automating the following procedure which is taking me a lot of time and tedious effort.
1) Jane Doe has 3 pages to her resume. So when I open her first page, I would like VBA to take the ActiveDocument.Name (Doe, Jane.rtf), strip the extension, and look in the path C:\Scanned Resumes\ for files named "string + Pg 2.rtf", "string + Pg 3.rtf" and so on. Each file it finds should then be inserted into the active document.
(Edit: I should specify that I want the page 2 contents inserted at the bottom of page 1, of course. And likewise with any subsequent pages found...pg 3...pg 4...etc.)
Here's my first baby step:
Sub Insert_AllResumePages()
    Dim sFilename As String
    ' Name of the open first page, e.g. "Doe, Jane.rtf"
    sFilename = ActiveDocument.Name
    'look in this directory - C:\Scanned Resumes\
    'if a file name matches the pattern (sFilename without ".rtf", plus a page suffix)
    'then insert it at the end of page 1
    'and keep doing so until no more matches are found
End Sub
At this point, I should clarify one thing. After it's been scanned, each output (or you could say each page) is saved as an .rtf document in the C:\<path> mentioned above. So for example, there is:
Doe, Jane.rtf
Doe, Jane Pg 2.rtf
Doe, Jane Pg 3.rtf
Letterman, David.rtf
Smith, John.rtf
Smith, John Pg2.rtf
Smith, John Pg3.rtf
Smith, John Pg4.rtf
...and so on and so forth.
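Here is a rough, untested sketch of how step 1 might go, given the file names listed above. The Sub name, the sFolder constant and the nPage counter are my own placeholders, and I'm assuming that every page file sits in C:\Scanned Resumes\ and uses either the "Pg 2" or the "Pg2" spelling. It relies on the built-in Dir function and on Range.InsertFile.

```vba
Sub Insert_AllResumePages_Sketch()
    ' Assumption: all page files live in this folder.
    Const sFolder As String = "C:\Scanned Resumes\"
    Dim sBase As String
    Dim sFile As String
    Dim nPage As Long
    Dim oRange As Range

    ' "Doe, Jane.rtf" -> "Doe, Jane"
    sBase = ActiveDocument.Name
    If InStrRev(sBase, ".") > 0 Then
        sBase = Left$(sBase, InStrRev(sBase, ".") - 1)
    End If

    nPage = 2
    Do
        ' Try both spellings seen in the folder: "Pg 2" and "Pg2".
        sFile = Dir(sFolder & sBase & " Pg " & nPage & ".rtf")
        If sFile = "" Then sFile = Dir(sFolder & sBase & " Pg" & nPage & ".rtf")
        ' Stop at the first page number that has no file.
        If sFile = "" Then Exit Do

        ' Collapse a range to the end of the document and insert the page there.
        Set oRange = ActiveDocument.Content
        oRange.Collapse Direction:=wdCollapseEnd
        oRange.InsertFile FileName:=sFolder & sFile

        nPage = nPage + 1
    Loop
End Sub
```

Counting page numbers upward, instead of looping over whatever Dir returns for a wildcard, keeps the pages in order, since I don't think Dir promises any particular sort order.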
2) The OCR software uses frame boxes throughout the entire resume. So at the top of Page 2 it might say something like:
Jane Doe - Page Two
So I delete such frames because this kind of verbiage is unnecessary in the final edit. While I can continue to search for and manually delete these frames, it becomes tedious when you have 3 or 4 pages to a resume and 200 more resumes to go.
Now there are variations of this, of course. Some people might just have "Page 2" or "Resume Cont'd" etc. I don't need those text boxes in the final edited version. So I was wondering if VBA can recognize text patterns: if a frame's contents are very short and contain key words such as Pg 2; Continued; Cont'd; Jane Doe Page 2, then delete that frame.
I've been reading up a little on VBA and found an example which I started to tweak, but I'm so new to VBA, I don't know how to proceed from here:
Sub Resumes_ExamineFrameContents()
    Dim oFrame As Frame
    Dim sFrameContent As String
    Dim Msg As String
    ' Turn on error handling.
    On Error GoTo ErrorHandler
    ' Loop through each frame in the document.
    For Each oFrame In ActiveDocument.Frames
        ' Set sFrameContent equal to the text of the frame.
        sFrameContent = oFrame.Range.Text
        ' Conditional Delete Frame code here as described in
        ' Question #2 above
    Next oFrame
    ' Skip the error handler when nothing went wrong.
    Exit Sub
ErrorHandler:
    Msg = "Error # " & CStr(Err.Number) & Chr(13) & Err.Description _
        & Chr(13) & "Make sure there is a frame in the current document."
    MsgBox Msg, , "Error"
End Sub
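For the conditional delete in question 2, this is the direction I'm picturing; it's an untested sketch. IsPageHeaderText, Resumes_DeleteHeaderFrames, the 40-character limit and the keyword list are all placeholders of mine, not anything built into Word. The idea is to treat a frame as a page header only when its text is short and contains one of the key words, ignoring case.

```vba
' Returns True when sText looks like a page-continuation header.
Function IsPageHeaderText(ByVal sText As String) As Boolean
    ' Assumption: "very short" means 40 characters or fewer.
    Const MAX_LEN As Long = 40
    Dim vKeys As Variant
    Dim i As Long

    ' Replace paragraph marks with spaces, then trim, before testing.
    sText = Trim$(Replace(sText, vbCr, " "))
    If Len(sText) = 0 Or Len(sText) > MAX_LEN Then Exit Function

    ' Assumed keyword list, taken from the examples above.
    vKeys = Array("pg", "page", "continued", "cont'd")
    For i = LBound(vKeys) To UBound(vKeys)
        ' vbTextCompare makes the search case-insensitive.
        If InStr(1, sText, vKeys(i), vbTextCompare) > 0 Then
            IsPageHeaderText = True
            Exit Function
        End If
    Next i
End Function

Sub Resumes_DeleteHeaderFrames()
    Dim i As Long
    ' Walk backwards so a deletion does not shift the frames still to be visited.
    For i = ActiveDocument.Frames.Count To 1 Step -1
        With ActiveDocument.Frames(i)
            ' Deletes the text inside the frame.
            If IsPageHeaderText(.Range.Text) Then .Range.Delete
        End With
    Next i
End Sub
```

I believe Frame.Delete on its own removes only the frame and leaves its text behind, which is why the sketch deletes the range text instead; an empty frame might still need cleaning up afterwards. A plain substring test like this can also over-match (for example "pg" inside "Upgrades"), so I'd only run it on copies of the files.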