PDA

View Full Version : Finding text strings in Word from an excel list of text strings



gtech
02-28-2006, 10:42 AM
I have an immediate requirement to find text strings in several Word documents from a list of text strings in Excel. I would think VBA could handle this but I'm not sure how to start!

There are 400 cells (single column) of very short text strings (aircraft part numbers) in Excel and I need to search through 10 specific Word documents (each document is about 300 pages) to locate each text string. Once the text string is found (in one or more of the documents) I need the macro to list the Word document page and paragraph number in the Excel columns next to the text string.

Thanks in advance for any help.

mdmackillop
02-28-2006, 02:10 PM
A few questions
Are all the files in one folder, if so, are there other files in the folder?
Are the text strings "words" in the documents, or are they part of longer strings?
Do you want all the results to be returned on one sheet?
Is any "structure" required for the return?
Can you post a sample of what you would like the return to look like?

gtech
02-28-2006, 02:22 PM
Thanks for responding...

The files could be placed in one dedicated folder or I could list the file names and path in a spreadsheet column, whichever is easier...

...the strings are words and are not part of a longer string...

... yes, I would like the results on the same spreadsheet row adjacent to the column of the text string..

Here is what I would like the spreadsheet to look like:

Search String Document Page Number Document Paragraph
D145-1112 resulting page # resulting para #
L2233-1 resulting page # resulting para #

mdmackillop
02-28-2006, 04:40 PM
I'm stuggling with the syntax to run the Word search from Excel. Here's some code that I've got so far. The Word code contained within the asterisks works within Word to return the values you're looking for. The code requires a reference to Microsoft Word.


Sub OpenAndReadWordDoc()
' assumes that the previous procedure has been executed
Dim wrdApp As Word.Application
Dim wrdDoc As Word.Document
Dim tString As String, tRange As Word.Range

Set wrdApp = CreateObject("Word.Application")
'wrdApp.Visible = True
Set wrdDoc = wrdApp.Documents.Open("C:\Foldername\MyNewWordDoc.doc")
' example word operations
With wrdDoc
'**************************
'This section comes from Word and needs adapting
Dim MyPos As Range, Paras As Long, Chk As Long, Pg As Long
With Selection.Find
Do
.ClearFormatting
.Text = "stop"
.Execute
Set MyPos = ActiveDocument.Paragraphs(1).Range
MyPos.SetRange Start:=MyPos.Start, _
End:=Selection.Range.End
Paras = MyPos.Paragraphs.Count
Pg = Selection.Information(wdActiveEndPageNumber)
If Paras = Chk Then Exit Do
Chk = Paras
MsgBox "page " & Pg & " - " & Paras
Loop
End With
'**************************
.Close ' close the document
End With
wrdApp.Quit ' close the Word application
Set wrdDoc = Nothing
Set wrdApp = Nothing
ActiveWorkbook.Saved = True
End Sub

gtech
02-28-2006, 05:08 PM
You found the same problem I am having. When running from Excel I can't get the Word Find property to work...although it works fine when run from Word

Any suggestions?

mdmackillop
02-28-2006, 05:10 PM
:banghead:

Killian
02-28-2006, 07:25 PM
Yes, it seems more difficult than it needs to be... :think:
If you apply Find to a range (like the whole content of a doc), it returns a boolean but modifies the selection to the found string.
If you use the selection object to get the position, it needs to be referred to from the document window.
Well, this seems to work:
(I've used the filesystemobject to go through the files in the folder so you'll need a reference to the Microsoft Sctipting Runtime as well as Microsoft Word. And I've named the range of cells "partnumbers")
Sub Main()

Const TARGET_FOLDER_PATH As String = "C:\TEMP\"

Dim fso As FileSystemObject
Dim oTargetFolder As Folder
Dim f As File

Dim appWD As Word.Application
Dim docSource As Word.Document
Dim oSearchRange As Word.Range

Dim rngPartnumber As Range

Set fso = New FileSystemObject
Set oTargetFolder = fso.GetFolder(TARGET_FOLDER_PATH)

Set appWD = New Word.Application
For Each rngPartnumber In Range("partnumbers")
For Each f In oTargetFolder.Files
Set docSource = appWD.Documents.Open(TARGET_FOLDER_PATH & f.Name)
Set oSearchRange = docSource.Content
With oSearchRange.Find
.ClearFormatting
.MatchWholeWord = True
.Text = rngPartnumber.Text
If .Execute Then
docSource.Range(docSource.Paragraphs(1).Range.Start, _
oSearchRange.End).Select
rngPartnumber.Offset(0, 1).Value = f.Name & " Page: " & _
appWD.ActiveWindow.Selection.Information(wdActiveEndPageNumber) _
& " Para: " & appWD.ActiveWindow.Selection.Paragraphs.Count

End If
End With
docSource.Close False
Next f
Next rngPartnumber
appWD.Quit False

End Sub

mdmackillop
03-01-2006, 12:55 AM
Thanks Killian.:friends:

Killian
03-01-2006, 05:19 AM
I always find all that messing around with the selection object a bit strange but I thought I'd better stick with it - I've got a Word template project 'round the corner so I need to get used to strangeness.

mdmackillop
03-01-2006, 06:58 AM
A couple of small adjustments to Killian's code, but which may not be necessary. This will ignore any files without a Doc extension and check for multiple instances of the search string.
Regards
MD

Sub Main()

Const TARGET_FOLDER_PATH As String = "C:\TEMP\"

Dim fso As FileSystemObject
Dim oTargetFolder As Folder
Dim f As File

Dim appWD As Word.Application
Dim docSource As Word.Document
Dim oSearchRange As Word.Range

Dim rngPartnumber As Range
Dim Rw As Long, Paras As Long, Chk As Long


Set fso = New FileSystemObject
Set oTargetFolder = fso.GetFolder(TARGET_FOLDER_PATH)

Set appWD = New Word.Application
For Each rngPartnumber In Range("partnumbers")
Rw = rngPartnumber.Row
For Each f In oTargetFolder.Files
If UCase(Right(f.Name, 3)) = "DOC" Then
Set docSource = appWD.Documents.Open(TARGET_FOLDER_PATH & f.Name)
Set oSearchRange = docSource.Content
With oSearchRange.Find
.ClearFormatting
.MatchWholeWord = True
.Text = rngPartnumber.Text
Do
If .Execute Then
docSource.Range(docSource.Paragraphs(1).Range.Start, _
oSearchRange.End).Select
Paras = appWD.ActiveWindow.Selection.Paragraphs.Count
Cells(Rw, 256).End(xlToLeft).Offset(0, 1).Value = f.Name & " Page: " & _
appWD.ActiveWindow.Selection.Information(wdActiveEndPageNumber) _
& " Para: " & Paras
End If
If Paras = Chk Then Exit Do
Chk = Paras
Loop
End With
docSource.Close False
End If
Next f
Next rngPartnumber
appWD.Quit False
End Sub

gtech
03-01-2006, 09:33 AM
Thanks so much for the help.

I am getting a Run-time error '1004' Method 'Range' of object'_Global' failed

at

For Each rngPartnumber In Range ("partnumbers")

any thoughts?

mdmackillop
03-01-2006, 10:25 AM
You need to create a Range Name called PartNumbers in your workbook for the list of these.

gtech
03-01-2006, 12:06 PM
Making progress... however now it opens the Word document but then shuts down Excel... no error message given.

mdmackillop
03-01-2006, 12:14 PM
There's nothin in the code that should do this. Can you step through it to determine exactly when this occurs. Just for info, what version of Office/Windows are you using?

gtech
03-01-2006, 12:23 PM
The macro is shutting down in the "With oSearchRange.Find" statement

gtech
03-01-2006, 12:25 PM
Office 2003
Windows 2000, Service Pack 4

gtech
03-01-2006, 12:42 PM
The debugging code shows the following for oSearchRange.Find
<-The remote server machine does not exist or is unavailable

Is there a Reference that I need to turn on

mdmackillop
03-01-2006, 12:47 PM
Sorry, I can't replicate this behaviour.

mdmackillop
03-01-2006, 01:07 PM
OK,
I found this behaviour when the code tries to open a file containing code or with links to other files. Try incorporating the following at "Set oSearchRange" to track down a problem file. I don't get 100% success with this, but it may show a way forward.


On Error Resume Next
Set oSearchRange = docSource.Content
If Err <> 0 Then
MsgBox Err.Description & " - " & f.Name
appWD.Quit False
Exit Sub
End If

gtech
03-01-2006, 02:17 PM
mdmackillop:

I was able to get the code to work on my laptop. Works great! It apparently has something to do with my desktop's References or the module the code is running from.

When it runs it searches each document for each Rw. Is it possible to change the loops to first go down the list of all Rw's for document 1, then search document 2 for all the Rw's and so on. With 400 Rw's and 10 documents it will open/close 4000 times...which could take some time.

You have been extremely helpful in getting me this far. If you have time to continue helping me that would be great. If not, I understand and thanks so much for your help.

mdmackillop
03-01-2006, 02:22 PM
We can do that. Do you expect/need the facility for multiple occurrences in each document. It simplifies layout if not required, but no big deal.

gtech
03-01-2006, 08:30 PM
Yes, unfortunetly there may be multple occurences in each document.

Killian
03-03-2006, 02:49 AM
So we need to bring the file loop outside the main loop (for each cell) and put it up front when we set up the Word app.
Then in the main loop, we can iterate through the Word app's "Documents" collection (all the open docs) rather than the files.
This is a better approach, provided you don't open up so many large documents you swallow up all the comp's resources.

So the re-ordered code (the declarations are the same)
Set fso = New FileSystemObject
Set oTargetFolder = fso.GetFolder(TARGET_FOLDER_PATH)
Set appWD = New Word.Application
For Each f In oTargetFolder.Files
If UCase(Right(f.Name, 3)) = "DOC" Then
appWD.Documents.Open TARGET_FOLDER_PATH & f.Name
End If
Next f

For Each rngPartnumber In Range("partnumbers")
Rw = rngPartnumber.Row
For Each docSource In appWD.Documents
Set oSearchRange = docSource.Content
With oSearchRange.Find
.ClearFormatting
.MatchWholeWord = True
.Text = rngPartnumber.Text
Do
If .Execute Then
docSource.Range(docSource.Paragraphs(1).Range.Start, _
oSearchRange.End).Select
Paras = appWD.ActiveWindow.Selection.Paragraphs.Count
Cells(Rw, 256).End(xlToLeft).Offset(0, 1).Value = docSource.Name & " Page: " & _
appWD.ActiveWindow.Selection.Information(wdActiveEndPageNumber) _
& " Para: " & Paras
End If
If Paras = Chk Then Exit Do
Chk = Paras
Loop
End With
Next docSource
Next rngPartnumber
appWD.Quit False

Killian
03-03-2006, 04:26 AM
As a slight departure from the requirement...

I was just going through this code to make a potential kb entry and found a possible enhancement that might be useful.
I did a kb entry a while back that, from Excel, scanned a Word doc for bookmarks - you could then select one and insert a hyperlink to it in the cell.
Well there I was, in the middle of our loop, with a Word range in one hand and an excel cell in the other and thought it was a good opportunity to link the two together.

I've attached an example (unzip to c:\temp and run macro "Main" in the XL workbook)

Any use?