PDA

View Full Version : Solved: Styling Matching Paragraphs



Paul_Hossler
05-16-2010, 06:03 AM
Trying to come up with effecient way to find all that paragraphs that equal all of an input text and apply an input style to the entire paragraph

I was hoping to avoid looping through all the .Paragraphs matching text. Just not elegant :think:

So far ...


Sub drv()
Call MatchPara("one")
Call MatchPara("two")
Call MatchPara("three")
End Sub

Sub MatchPara(sFind As String, Optional sStyle As String = "Heading 1")

Application.StatusBar = "Marking '" & sFind & "' in " & sStyle

With ActiveDocument.StoryRanges(wdMainTextStory).Find
.ClearFormatting
.Replacement.ClearFormatting
.Replacement.sTyle = ActiveDocument.Styles("Heading 1")

.Text = sFind & "^p"
.Replacement.Text = "^&"
.Forward = True
.Wrap = wdFindContinue
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
.Execute Replace:=wdReplaceAll
End With
End Sub



1. Using .Text = sFind & "^p" works UNLESS sFind happens to be the last text in the previous para

Works: NoNoNo^pYes^p formats the Yes in the right style

Does not work: NoNoNoMaybeMaybeYes^pYes^p incorrectly formats NoNoNoMaybeMaybeYes

2. Using .Text = "^p" & sFind & "^p" incorrectly formats the previous paragraph in the style


I'm hoping there are some switches I don't know about. I tried using WildCards, but never hit the right combination I guess:dunno

Paul

TonyJollans
05-16-2010, 10:35 AM
'fraid there's no magic switch for this. There is a wild card for 'start of word', but not 'start of paragraph'. You can search for ^p(text)^p but (a) this will miss the first paragraph in the document and (b) you will need some extra code to process what you find.

Paul_Hossler
05-17-2010, 08:45 AM
Tony --

Nuts :motz2: Doggone MS Word

I ended up doing it the hard way -- looping through the .Paragraphs. Not as fast as I'd like for long docs

But thanks for confirming what I was beginning to suspect: no way to do it

Paul

fumei
05-18-2010, 09:11 AM
Paul, I can not see how it can be otherwise. You are trying to match the text content of paragraphs.

"find all that paragraphs that equal all of an input text and apply an input style to the entire paragraph"

How could you not have to actually use .Paragraphs? It is the defining part of your logic.

fumei
05-18-2010, 09:14 AM
Just to be clear...

"Does not work: NoNoNoMaybeMaybeYes^pYes^p incorrectly formats NoNoNoMaybeMaybeYes"

According to your logic, no, it does not incorrectly format it. It correctly formats it, precisely the way you have told it to, by your logic. By.....paragraphs.

Paul_Hossler
05-18-2010, 09:38 AM
Gerry -- your #4

Yes you are correct that when I loop through .Paragraphs I do indeed match the paragraph text to see if it's what I'm looking for to Style. That's the approach I endedup using.

The original idea (#1) I had was a Find/Replace of the Main Story, without having to loop using VBA through the .Paragraphs collection. Figured Word's C++ (or what ever it's written in) would do it faster


Your #5

Does not work: NoNoNoMaybeMaybeYes^pYes^p incorrectly formats NoNoNoMaybeMaybeYes"

The requirement was to only Style paragraphs = Yes.

Example:

Document

NoNoNo^p
NoNoNoMaybeYes^p
Yes^p

From my #1

.Find = Yes^p will apply the style to both line 2 and 3. I only wanted line 1

or

.Find = ^pYes^p will apply the style to both line 2 and 3. I only wanted line 1

Or so I think :doh:

Ideas?

Paul

fumei
05-18-2010, 10:34 AM
"I only wanted line 1"

Don't you mean lline 3? Your example is:

NoNoNo^p
NoNoNoMaybeYes^p
Yes^p


You want line 3, yes.....pun intended.

"That's the approach I endedup using." That is because, logically, it is the only approach that will work. As long as the logical requirement is the paragraph object, then the paragraph object is what you need to use.

And...you DO need to use the pargraph object, because - logically - the requirement is a test of the paragraph style.

The original idea (#1) I had was a Find/Replace of the Main Story, without having to loop using VBA through the .Paragraphs collection. Figured Word's C++ (or what ever it's written in) would do it faster
And indeed it would substantially faster using Find/Replace...except...you can't, BECAUSE - logically - you require the use of the paragraph object. You need to use Paragraphs.

THIS is your problem: "apply an input style to the entire paragraph"

Paragraphs.

So.....what did you end up with? The reason I ask is you mention an issue for long documents. I made a document of 100 pages of your example:

NoNoNo^p
NoNoNoMaybeYes^p
Yes^p


Option Explicit

Sub FindWhatever(strIn As String)
Dim oPara As Paragraph
For Each oPara In ActiveDocument.Paragraphs
If oPara.Range.Text = strIn & Chr(13) Then
oPara.Range.Style = "Yadda"
End If
Next
End Sub

Sub tryMeRoasted()
Dim Mystart As Date
Dim Myend As Date
Mystart = Format(Now, "hh:mm:ss")
Call FindWhatever("Yes")
Myend = Format(Now, "hh:mm:ss")
MsgBox "Start: " & Mystart & vbCrLf & _
"End: " & Myend
End Sub
I made up a style - Yadda - and executed the procedure. The result, for 100 pages of processing:

Start: 10:31:08 AM
End: 10:31:09

In other words...one second. All paragraphs (and ONLY those paragraphs) with the input string - "Yes" - were changed to the Yadda style.

Paul_Hossler
05-18-2010, 04:54 PM
You want line 3, yes.....pun intended.


You can be a pretty punny guy sometimes :thumb

Yes, I'm glad you knew what I meant, even if my keyboard didn't

And yes, I did end up using the approach you said is the only way.


Function MatchParagraph(sFind As String, Optional sStyle As String = "Heading 1") As Boolean
Dim s As String
Dim oPara As Paragraph

Application.StatusBar = "Marking '" & sFind & "' in " & sStyle

MatchParagraph = False

For Each oPara In ActiveDocument.StoryRanges(wdMainTextStory).Paragraphs

s = Left(oPara.Range.Text, Len(oPara.Range.Text) - 1)

If UCase(s) = UCase(sFind) Then
oPara.sTyle = sStyle
MatchParagraph = True
End If
Next
End Function


I always try to avoid my own loop if there's a way to let Word do the work. I know that the Find/Replace feature has a lot of options as to the way you build the .Find text, so I was hoping there was a way I didn't know.

My concerns about run time were because I use this (plus a lot of other 'cleanup' macros) to re-format text to be compatible with my eBook reader (Kindle).

A large number of TXT have the chapter in a seperate paragraph.
By styling it as Heading 1, it will be picked up as a TOC entry by the eBook reader

1^p
2^p
....
67^p


or

one^p
...
sixty seven^p


Sometimes a chaper number will be missing, or not set off in a seperate paragraph. I just have to clean those up by hand

So it's something that will be run N times per doc since there's really no sure way to see how far the numbers go. I use a counter so if it fails to find a match 4 times in a row it exits the For loop


Call MatchParagraph("prologue")
Call MatchParagraph("epilogue")

iCount = 0
For i = 1 To 999
If Not MatchParagraph(CStr(i)) Then
iCount = iCount + 1
Else
iCount = 0
End If
If iCount = 4 Then Exit For
Next i

iCount = 0
For i = 1 To 999
If Not MatchParagraph(LongToText(i)) Then
iCount = iCount + 1
Else
iCount = 0
End If
If iCount = 4 Then Exit For
Next i






And...you DO need to use the pargraph object, because - logically - the requirement is a test of the paragraph style.


I was thinking of it as more of a test of the text of a paragraph

That's why I was thinking Find/Replace


Paul

fumei
05-20-2010, 08:40 AM
Except Find/Replace is not a test of paragraphs.

Paul_Hossler
05-20-2010, 10:35 AM
Except Find/Replace is not a test of paragraphs.



Not the .Paragraph object, I agree, but for the text within isn't it?


Sub Macro1()

With ActiveDocument.StoryRanges(wdMainTextStory).Find
.ClearFormatting
.Replacement.ClearFormatting
.Replacement.Style = "Heading 1"
.Text = "Yes^p"
.Replacement.Text = "^&"
.Forward = True
.Wrap = wdFindContinue
.Format = True
.Execute Replace:=wdReplaceAll
End With
End Sub


Paul

PS -- You must be a GREAT teacher, lots of patience

fumei
05-20-2010, 02:19 PM
ummmm.........yes........but what does that do for your:

NoNoNo
NoNoNoMaybeYes
Yes


both 2 and 3 are actioned....yes? TWO instances of Yes^p are there, so TWO are actioned. This has zip, nada to do with the paragraphs (other than the container paragraph -undefined! - changes its style if the text is found).

Which is precisely your issue/problem.

Find/Replace operates on Range objects. Never mind that you can have:
With Selection.Find

The fact of the matter is Word VBA translates that as:
With Selection.Range.Find

Just like:
Selection.Text
really means:
Selection.Range.TextThis is also why you CAN NOT action non-contiguous selected chunks of text using VBA. If you select non-contiguous portions of text and click the Bold icon...all the chunks become Bold. You can not do this via VBA. VBA uses Range.

So....when you use that code on:

NoNoNo
NoNoNoMaybeYes
Yes

the first .Found finds

NoNoNo
NoNoNoMaybeYes
Yes

and applies...a Paragraph style, which of course applies to the - ahem - paragraph, which in turns means the entire paragraph.

Bottom line?

"Not the .Paragraph object, I agree, but for the text within isn't it?"

Technically speaking, no. The .Found of the first "Yes^p" has no idea that the paragraph containing the .Found has other text.

Which is the issue/problem. Yes, you most certainly CAN do a different route testing on the whole paragraph. Like this:
Sub ForPaul()
Dim r As Range
Set r = ActiveDocument.Range
With r.Find
.ClearFormatting
.Replacement.ClearFormatting
Do While .Execute(FindText:="Yes" & vbCr, _
Forward:=True) = True
r.Expand Unit:=wdParagraph
If Len(r.Text) = 4 Then
r.Style = "Heading 1"
End If
r.Collapse 0
Loop
End With
End Sub


You can not use StoryRanges(wdMaintextStory) for this.

On a 119 page document filled with your sample text, the time was 7 seconds.

Executing it again using my original code took 3 seconds.

A gold star for stating why the difference (double the time).

Paul_Hossler
05-20-2010, 05:37 PM
A gold star for stating why the difference (double the time).


Homework assignment???

The first thing I can see is the


r.Expand Unit:=wdParagraph
...
r.Collapse 0


might be taking up time, but 2X???


Final answer ...

The

Do While


and testing is not as eficient as just iterating and testing


Paul

fumei
05-21-2010, 08:45 AM
Very good. It is true that using Expand to the Range adds overhead.

In my earlier code, the whole paragraph IS the object(s) being iterated.

In the later one it is an added instruction to get the range.

Nevertheless, it is indeed the Do While that is the main culprit. And you can test for it.

On a 126 page document of your NoNo...yadda yaddaYes text, I used the following code to first change to Heading 1, then going through the exact same testing and changing it back - in other words, the actions are identical, searching for Yes, and changing the Style.
Option Explicit

Sub Yadda_A()
Dim r As Range
Dim oPara As Paragraph
Dim Jim As Date
Dim Morrison As Date
Dim ThisIsTheEnd_MyFriend As Date

' first one way
Set r = ActiveDocument.Range
Jim = Format(Now, "hh:mm:ss")
With r.Find
.ClearFormatting
.Replacement.ClearFormatting
Do While .Execute(FindText:="Yes" & vbCr, _
Forward:=True) = True
r.Expand Unit:=wdParagraph
If Len(r.Text) = 4 Then
r.Style = "Heading 1"
End If
r.Collapse 0
Loop
End With
Morrison = Format(Now, "hh:mm:ss")

' and then back again
For Each oPara In ActiveDocument.Paragraphs
If oPara.Range.Text = "Yes" & Chr(13) Then
oPara.Range.Style = "Normal"
End If
Next
ThisIsTheEnd_MyFriend = Format(Now, "hh:mm:ss")

MsgBox "First way using Do While: " & _
Format(Morrison - Jim, "hh:mm:ss") & _
vbCrLf & vbCrLf & _
"Second way using Paragraph object: " & _
Format(ThisIsTheEnd_MyFriend - Morrison, "hh:mm:ss")
End Sub
The result?

First way using Do While: 00:00:05
Second way using Paragraph object: 00:00:01

Hopefully this is a lesson to all that, whenever possible, use objects from Collections.

The first way (Do While) constructs the object to be tested - search for "Yes", adjust/construct the range using Expand...THEN test.

The second way tests against already existing objects in the Paragraph collection.

fumei
05-21-2010, 08:48 AM
And homework?? Nah. You would not like to get what I give out as homework.




evil grin

fumei
05-21-2010, 11:37 AM
Sub FBlah()
Dim r As Range
Dim Jim As Date
Dim Morrison As Date
Dim ThisIsTheEnd_MyFriend As Date

' first one way
Set r = ActiveDocument.Range
Jim = Format(Now, "hh:mm:ss")
With r.Find
.ClearFormatting
.Replacement.ClearFormatting
Do While .Execute(FindText:="Yes" & vbCr, _
Forward:=True) = True
r.Font.Bold = True
Loop
End With
Morrison = Format(Now, "hh:mm:ss")

' reset range object and iterate again
Set r = ActiveDocument.Range
With r.Find
.ClearFormatting
.Replacement.ClearFormatting
.MatchWildcards = True
.Text = "Yes"
.Replacement.Text = "^&"
.Replacement.Font.Italic = True
.Execute Replace:=wdReplaceAll
End With

ThisIsTheEnd_MyFriend = Format(Now, "hh:mm:ss")

MsgBox "First way using Do While: " & _
Format(Morrison - Jim, "hh:mm:ss") & _
vbCrLf & vbCrLf & _
"Second way using Find/Replace: " & _
Format(ThisIsTheEnd_MyFriend - Morrison, "hh:mm:ss")

End Sub
In both cases the only action is a Font change. Thefirst uses a Do While, the second the native Find/Replace wdReplaceAll.

Time?

1. 8 seconds
2. 4 seconds

Paul_Hossler
05-21-2010, 12:20 PM
1. Nice teachers don't give out homework on Fridays

2. I promise to always "Use objects from Collections"

3. Your #15, if you added a For Each oPara loop as a 3rd case, what would the times be? (like in post #7)

Paul

fumei
05-21-2010, 12:28 PM
Your #3.

You have not been paying attention! Stay after class.

WHY did I not include a For Each Para???? I could have. There is a reason...and it is the point I have been trying to make.

Paul_Hossler
05-21-2010, 03:36 PM
I was interested in how much faster your recommendation to always use collections would be

After all the initial question was about the fastest way to apply a style to a paragraph when the entire paragraph = an input string, such as "Yes".

I originally thought that a Find/Replace would get them all at once. You and Tony have shown that it can't be done, and that a .Paragraphs loop is needed. Agreed

Macro below

Do While = 8 sec
Find/repalce = 2 sec
Collection = 4

Would it be fair to say that "Under some circumstances, Find/Replace is faster than using the objects in a collection?"

Paul




Sub FBlah2()
Dim r As Range
Dim Peter As Date
Dim Paul As Date
Dim Mary As Date
Dim Ringo As Date

Dim oPara As Paragraph

'---------------------------------------------
'make text+CR anywhere BOLD
Set r = ActiveDocument.Range
Peter = Now
With r.Find
.ClearFormatting
.Replacement.ClearFormatting
Do While .Execute(FindText:="Yes" & vbCr, _
Forward:=True) = True
r.Font.Bold = True
Loop
End With
Paul = Now

'---------------------------------------------
' makes text+CR italic
Set r = ActiveDocument.Range
With r.Find
.ClearFormatting
.Replacement.ClearFormatting
.MatchWildcards = False
.Text = "Yes" & vbCr
.Replacement.Text = "^&"
.Replacement.Font.Italic = True
.Execute Replace:=wdReplaceAll
End With

Mary = Now

'---------------------------------------------
' makes text+CR underlined if entire paragraph
Set r = ActiveDocument.Range
For Each oPara In r.Paragraphs
If Right(oPara.Range.Text, 4) = "Yes" & Chr(13) Then
oPara.Range.Font.Underline = True
End If
Next

Ringo = Now

'---------------------------------------------
MsgBox _
"1. Do While: " & Format(Paul - Peter, "hh:mm:ss") & _
vbCrLf & vbCrLf & _
"2. Find/Replace: " & Format(Mary - Paul, "hh:mm:ss") & _
vbCrLf & vbCrLf & _
"3. Collection: " & Format(Ringo - Mary, "hh:mm:ss")

End Sub

fumei
05-25-2010, 10:35 AM
Would it be fair to say that "Under some circumstances, Find/Replace is faster than using the objects in a collection?"
No, it would fair to say that in ALL circumstances Find/Replace is faster.

Bottom line is this. If you can use Find/Replace, then use it. It is always the fastest.

If you can not use Find/Replace - most often because of precisely the reason you can not use it here - then using objects from a collection is the fastest.

Paul_Hossler
05-28-2010, 07:47 AM
Thanks for the lesson -- always learning something here

Paul