Solved: Open File based on Character string & working with Bullet points



JohnnyBravo
07-12-2006, 10:16 AM
OK, rather than continue from my last thread, I thought I would start a new one since this deals with a slightly different task. I've got several resumes which have been scanned into MS Word and saved in *.rtf format.

Some people have 2, 3 or more pages to their resume, so I've been trying to Google my way into automating the following procedure which is taking me a lot of time and tedious effort.

1) Jane Doe has 3 pages to her resume. When I open her first page, I would like VBA to take the ActiveDocument.Name (Doe, Jane.rtf) and look in C:\Scanned Resumes\ for files named "string + pg2.rtf" or "string + pg3.rtf".

If it finds any, it should insert each file into the active document.
(Edit: I should specify that I want the page 2 contents inserted at the bottom of page 1, of course, and likewise with any subsequent pages found: pg 3, pg 4, etc.)

Here's my first baby step:

Sub Insert_AllResumePages()
    Dim sFilename As String
    ' Name of the open document, e.g. "Doe, Jane.rtf"
    sFilename = ActiveDocument.Name
    ' To do:
    ' look in this directory - C:\Scanned Resumes\
    ' if a file name matches the string pattern
    ' then insert it at the end of page 1
    ' and keep doing so until no more matches are found
End Sub

At this point, I should clarify one thing. After it's been scanned, each output (or you could say each page) is saved as a separate .rtf document in the folder mentioned above. So for example, there is:

Doe, Jane.rtf
Doe, Jane Pg 2.rtf
Doe, Jane Pg 3.rtf
Letterman, David.rtf
Smith, John.rtf
Smith, John Pg2.rtf
Smith, John Pg3.rtf
Smith, John Pg4.rtf

...and so on and so forth.

2) The OCR software uses frame boxes throughout the entire resume. So at the top of Page 2 it might say something like:

Jane Doe - Page Two

So I delete such frames because this kind of verbiage is unnecessary in the final edit. While I can continue to search for and manually delete these frames, it becomes tedious when you have 3 or 4 pages to a resume and 200 more resumes to go.

Now there are variations of this of course, some people might just have "Page 2" or "Resume Cont'd" etc. I don't need those text boxes in the final finished edited version. So I was wondering if VBA can recognize text patterns, so that if the frame contents are very short and contain key words such as Pg 2; Continued; Cont'd; Jane Doe Page 2; delete that frame if found.

I've been reading up a little on VBA and found an example which I started to tweak, but I'm so new to VBA, I don't know how to proceed from here:

Sub Resumes_ExamineFrameContents()
    Dim oFrame As Frame
    Dim sFrameContent As String
    ' Turn on error handling.
    On Error GoTo ErrorHandler

    ' Loop through each frame in the document.
    For Each oFrame In ActiveDocument.Frames
        ' Set sFrameContent equal to the text of the frame.
        sFrameContent = oFrame.Range.Text

        ' Conditional Delete Frame code here as described in
        ' Question #2 above

    Next oFrame

ErrorHandler:
    If Err.Number <> 0 Then
        Dim Msg As String
        Msg = "Error # " & Str(Err.Number) & Chr(13) & Err.Description _
            & Chr(13) & "Make sure there is a frame in the current document."
        MsgBox Msg, , "Error"
    End If

End Sub

Killian
07-14-2006, 07:24 AM
I think you can use "Application.FileSearch" to bring back the other files for insertion.
Once you have the document, you can use the loop you provided to go through all the frames - you just need to check the string it contains against a list of exceptions.
Option Explicit

Sub main()

    Dim i As Long
    Dim oFrame As Frame

    With Application.FileSearch
        .NewSearch
        .LookIn = ActiveDocument.Path
        .FileName = Left(ActiveDocument.Name, Len(ActiveDocument.Name) - 4) & "*.rtf"
        .FileType = msoFileTypeAllFiles
        If .Execute(SortBy:=msoSortByFileName, _
            SortOrder:=msoSortOrderAscending) > 0 Then
            For i = 1 To .FoundFiles.Count
                If .FoundFiles(i) <> ActiveDocument.Path & _
                    Application.PathSeparator & ActiveDocument.Name Then
                    ' insert found file
                    Selection.EndKey Unit:=wdStory
                    Selection.InsertFile FileName:=.FoundFiles(i), _
                        ConfirmConversions:=False, Link:=False, Attachment:=False
                End If
            Next i
        End If
    End With

    ' loop through all frames and check the text
    For Each oFrame In ActiveDocument.Frames
        If UnWantedFrame(oFrame.Range.Text) Then
            oFrame.Cut
        End If
    Next oFrame

End Sub

Function UnWantedFrame(str As String) As Boolean
    Dim i
    Dim arr()
    ' define exceptions
    arr = Array("pg", "page", "continued", "cont'd")

    For i = LBound(arr) To UBound(arr)
        If InStr(1, LCase(str), arr(i)) > 0 Then
            UnWantedFrame = True
            Exit For
        End If
    Next i
End Function
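
The original question asked for the frame to be removed only when its contents are very short, and removing items from a collection inside a For Each loop can cause some of them to be skipped. A minimal sketch of one way to handle both points follows. The names IsShortPageMarker and RemoveShortPageMarkerFrames and the 40-character limit are assumptions made for illustration, not part of the code posted above.

```vba
Option Explicit

' Hypothetical helper: True when the text is short and contains a keyword.
' The default limit of 40 characters is an arbitrary assumption; tune it.
Function IsShortPageMarker(ByVal sText As String, _
                           Optional ByVal lMaxLen As Long = 40) As Boolean
    Dim vKey As Variant

    ' Replace paragraph marks with spaces, trim, and lower-case the text
    sText = LCase$(Trim$(Replace(sText, vbCr, " ")))

    ' Empty or long frames are left alone
    If Len(sText) = 0 Or Len(sText) > lMaxLen Then Exit Function

    For Each vKey In Array("pg", "page", "continued", "cont'd")
        If InStr(1, sText, vKey) > 0 Then
            IsShortPageMarker = True
            Exit Function
        End If
    Next vKey
End Function

Sub RemoveShortPageMarkerFrames()
    Dim i As Long

    ' Walk the collection backwards so that removing a frame
    ' does not shift the frames that are still to be checked
    For i = ActiveDocument.Frames.Count To 1 Step -1
        If IsShortPageMarker(ActiveDocument.Frames(i).Range.Text) Then
            ActiveDocument.Frames(i).Cut
        End If
    Next i
End Sub
```

Like the loop above, this uses Cut, so the last removed frame ends up on the clipboard. A short keyword such as "pg" can still match inside an unrelated word, so it is worth running this on a copy of a document first.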

JohnnyBravo
07-14-2006, 12:33 PM
Thank you Killian. 2 quick follow-ups for ya... Rather than search for all file types, I want it to insert only RTF files. I changed your line
.FileType = msoFileTypeAllFiles to:
.FileType = ".rtf"
But it's not working.

Also, when I have Gina Smith's first page open and then run your macro, it not only inserts her 2nd page as desired but inserts her 1st page as well. So basically I need to change it so that:

If .FoundFiles(i) = ActiveDocument.Name
Then skip that one and insert the rest as in page 2, 3, etc.

Killian
07-14-2006, 01:05 PM
ok, I've got the first one covered...

.FileType = msoFileTypeAllFiles
FileType needs to be set to one of its predefined values. These are set as constants in the Office type library. You can use msoFileTypeWordDocuments to include .docs, etc. Consistent with your specification, I used the filename to return RTFs only.
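
Application.FileSearch was removed from Office starting with the 2007 release, so here is a rough sketch of the same filename-pattern filter using the built-in Dir function instead. The Sub name InsertContinuationPages is made up for illustration. It assumes the active document has already been saved in the folder that holds the other pages and that its name ends in a four-character extension such as ".rtf".

```vba
Option Explicit

Sub InsertContinuationPages()
    Dim sFolder As String
    Dim sBase As String
    Dim sFile As String
    Dim colFiles As New Collection
    Dim vFile As Variant

    sFolder = ActiveDocument.Path & Application.PathSeparator
    ' Strip the 4-character ".rtf" extension from the name
    sBase = Left$(ActiveDocument.Name, Len(ActiveDocument.Name) - 4)

    ' Collect the matching file names first, skipping the open document
    sFile = Dir(sFolder & sBase & "*.rtf")
    Do While Len(sFile) > 0
        If StrComp(sFile, ActiveDocument.Name, vbTextCompare) <> 0 Then
            colFiles.Add sFile
        End If
        sFile = Dir()
    Loop

    ' Append each collected file at the end of the active document
    For Each vFile In colFiles
        Selection.EndKey Unit:=wdStory
        Selection.InsertFile FileName:=sFolder & vFile, _
            ConfirmConversions:=False, Link:=False, Attachment:=False
    Next vFile
End Sub
```

Dir does not promise to return names in sorted order, so if the page order matters, the collected names may need sorting before the insert loop. The wildcard also matches any other file that merely starts with the same text, for example "Smith, Johnny.rtf" when "Smith, John.rtf" is open.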

The second one, I'm not so sure...
The active document will always be returned by
.FileName = Left(ActiveDocument.Name, Len(ActiveDocument.Name) - 4) & "*.rtf"
which is why I start the looped code with
If .FoundFiles(i) <> ActiveDocument.Path & _
Application.PathSeparator & ActiveDocument.Name Then
so I'm not sure what's happening there. Can we rule out a couple of things?
1) check how many files are returned - debug the FoundFiles object (set a breakpoint after the Execute and add a watch to .FoundFiles)
2) is there any additional code that may change the ActiveDocument on each iteration?
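
As an alternative to a breakpoint and watch, the first check can be sketched as a small routine that prints the results to the Immediate window (Ctrl+G in the VBA editor). The Sub name ListMatchingFiles is made up for illustration, and this assumes a Word version that still provides Application.FileSearch.

```vba
Option Explicit

Sub ListMatchingFiles()
    Dim i As Long
    Dim lCount As Long

    With Application.FileSearch
        .NewSearch
        .LookIn = ActiveDocument.Path
        .FileName = Left(ActiveDocument.Name, Len(ActiveDocument.Name) - 4) & "*.rtf"
        .FileType = msoFileTypeAllFiles
        lCount = .Execute()

        ' Show how many files matched, then each full path
        Debug.Print lCount & " file(s) found"
        For i = 1 To .FoundFiles.Count
            Debug.Print i, .FoundFiles(i)
        Next i
    End With

    ' The full path the macro compares each found file against
    Debug.Print "Active document: " & ActiveDocument.FullName
End Sub
```

Comparing the printed paths with the active document's full path should show whether the comparison in the loop can ever match, for instance if the two differ in letter case.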

:think:

JohnnyBravo
07-14-2006, 03:10 PM
I got it now Killian. I was testing your code on a document I had previously processed so that's why it wasn't acting right. But it's cool now - I took your code and modified it a tad bit.

.....errr nevermind... while I was posting this message, I was fooling around with a macro which made MS Word crash. :-(

And hence, I lost all of the modified coding. Now I gotta start over and try to remember what I modified in your routine. Was working like a champ before... arghhh!

Thanks for your help Killian - much appreciated. One of these days, I'll get competent enough with VBA so that I can start answering questions instead of being the newb all the time! :)

mdmackillop
07-15-2006, 04:37 AM
One of these days, I'll get competent enough with VBA so that I can start answering questions instead of being the newb all the time! :)
Hi Johnny,
It's a chicken and egg situation. Once you start answering, you'll learn much more in areas that you've never thought of. Then you'll find that you make use of this knowledge, and wonder how you ever managed without it.
Regards
MD

fumei
07-15-2006, 05:37 AM
And Johnny, dinna fess yourself about making any mistakes offering answers either. There are lots of people who make mistakes....ME for one. Just jump in, maybe acknowledge that you think it is the right answer, and maybe not...but what the heck I'm going to offer it anyway.

Given with due regard, ALL answers are good ones.

We'll keep you honest....right Malcolm?

mdmackillop
07-15-2006, 07:25 AM
We'll keep you honest....right Malcolm?
Of course Gerry, if we replied with perfect answers the first time, there would be no fun!

lucas
07-15-2006, 08:21 AM
Malcolm and Gerry are right, Johnny. I find that I learn a lot just trying to figure out questions posted in the forum. Often by the time I even get close, someone else has already posted a solution. Also, folks here will chip in with better and cleaner solutions, which is great. It's a great forum for learning.