PDA

View Full Version : [SLEEPER:] From Excel, selecting Word text between headings



davidbhill
11-03-2015, 01:07 PM
I've been programming a long time, the last ten years mostly with Access VBA. I have ventured into Word and Excel VBA a few times, but not much. So I am embarrassed to admit I'm stumped.

I'm currently working in Office 2013.

I have a list of documents and the headings within each document. The headings are unique within each document.

I want to select the text between headings, place it into a string variable, then search the string variable for various words. My thought was I can use the list of headings to find a starting point and select all the text between that point and the next heading. It was not obvious to me how to do that so I have been searching the internet for code examples and modified it to fit my environment.

Here's the code I have so far. It is incomplete. I have commented out a couple of loops because they won't work until I get the selection problem resolved.


Sub search_docs_2()

' This subroutine reads through the list of document headings.

' Each section of content between headers is searched using the keywords selected
' in the "Key Words" worksheet. For each keyword found, the priority value (from column D)
' is added to the total score for that section.

' After all documents are searched, the document sections are sorted from highest score to lowest
' score. The presumption is that a section with a high search score is more likely to be relavent
' to the original requirement.

Dim Ctr As Long
Dim MyData As DataObject
Dim docSource As Word.Document
Dim SearchText, sPath, sDoc, sSource As String
Dim BegHead, EndHead, DocOpen As String
Dim appwd As Object
Dim test1, test2, test3 As String

Set MyData = New DataObject
Set appwd = CreateObject("Word.Application")
DocOpen = "N"

' cell E1 on sheet "Search Results" contains number of rows to process
For x = 2 To Sheets("Search Results").Range("E1").Value

' if document name changed, close old doc and open new one
If Sheets("Search Results").Range("$A" & x).Value <> Sheets("Search Results").Range("$A" & x - 1).Value Then
If DocOpen = "Y" Then
docSource.Close (False)
End If
sPath = "C:\Users\dhhill\Documents\RTM Search Folder\"
sDoc = Sheets("Search Results").Range("$A" & x).Value
sSource = sPath & sDoc
Set docSource = Word.Documents.Open(sSource, , True)
DocOpen = "Y"
End If

' get beginning and ending headings - end of document is marked with string "9999999999999999"
BegHead = Sheets("Search Results").Range("$B" & x).Value
If Sheets("Search Results").Range("$A" & x).Value = Sheets("Search Results").Range("$A" & x + 1).Value Then
EndHead = Sheets("Search Results").Range("$B" & x + 1).Value
Else
EndHead = "9999999999999999"
End If

'Find first heading (equivalent to the Word FIND command)
'Sets the parameters
With docSource.Range.Find
.ClearFormatting
.Text = BegHead
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
'Execute the command
docSource.Range.Find.Execute
test1 = docSource.Range.Text
'The first heading text is selected
'Set the cursor at the beginning of the next line
docSource.Range.Next Unit:=wdParagraph, Count:=1
test2 = docSource.Range.Text
'selects the whole paragraph
docSource.Range.Next Unit:=wdParagraph, Count:=1
test3 = docSource.Range.Text
'Loops through and counting paragraphs until text equal to second heading
'string compared is equal to paragraph text length -1 to suppress paragraph mark
' Do Until Left(docSource.Range.Text, Len(docSource.Range.Text) - 1) = EndHead
' Ctr = Ctr + 1
' docSource.Range.Next Unit:=wdParagraph, Count:=1
' Loop
'Positions again under "Heading 1"
With docSource.Range.Find
.ClearFormatting
.Text = BegHead
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
End With
docSource.Range.Find.Execute
docSource.Range.Move Unit:=wdParagraph, Count:=1
'Selects the number of paragraphs counted above
' For i = 1 To Ctr - 1
' docSource.Range.Move Unit:=wdParagraph, Count:=1
' Next I
'Copies them
docSource.Range.Copy
' Puts clipboard text in variable
MyData.GetFromClipboard
SearchText = MyData.GetText(1)
Next x

End Sub


I threw in a few "testx" strings to view the results of the Range.Find, Range.Move, and Range.Next lines. All I seem to be doing is getting the text at the beginning of the document before any headings.

If I have not explained the problem clearly let me know. If my approach is all wrong, let me know. If you can help me, I will be very grateful.

thanks,
David

Leith Ross
11-03-2015, 02:35 PM
Hello David,

Can you upload of a copy of the word document to a public file sharing site? It would be nice to have in order to test the code with.

davidbhill
11-03-2015, 05:27 PM
Hello David,

Can you upload of a copy of the word document to a public file sharing site? It would be nice to have in order to test the code with.

Sorry, I can't. Both the format and the content are proprietary. I know it would be nice, but I will have to settle for some advice on methods and properties regarding the search.

It should be fairly simple to construct a test document. Just have three headings interspersed with text, then open the Word doc from Excel and select the text between two headings.

Leith Ross
11-04-2015, 06:00 PM
Hello David,

Try this macro out. It looks for headings with the style "Heading 1". You can change this to match the style of the headings in your document.



Sub GetTextBetweenHeadings()
Dim colPara As New Collection
Dim Doc As Document
Dim n As Long
Dim Text As String
Set Doc = ActiveDocument
For n = 1 To Doc.Paragraphs.Count - 1
If Doc.Paragraphs(n).Style.NameLocal = "Heading 1" Then
colPara.Add Doc.Paragraphs(n).Range
End If
Next n
For n = 1 To colPara.Count
On Error Resume Next
Text = Doc.Range(colPara(n).End, colPara(n + 1).Start - 1).Text
If Err = 9 Then
Text = Doc.Range(colPara(n).End, Doc.Content.End).Text
End If
On Error GoTo 0
MsgBox Text
Next n
End Sub