Consulting

Results 1 to 20 of 20

Thread: Solved: Styling Matching Paragraphs

  1. #1
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location

    Solved: Styling Matching Paragraphs

    Trying to come up with effecient way to find all that paragraphs that equal all of an input text and apply an input style to the entire paragraph

    I was hoping to avoid looping through all the .Paragraphs matching text. Just not elegant

    So far ...

    [vba]
    Sub drv()
    Call MatchPara("one")
    Call MatchPara("two")
    Call MatchPara("three")
    End Sub

    Sub MatchPara(sFind As String, Optional sStyle As String = "Heading 1")

    Application.StatusBar = "Marking '" & sFind & "' in " & sStyle

    With ActiveDocument.StoryRanges(wdMainTextStory).Find
    .ClearFormatting
    .Replacement.ClearFormatting
    .Replacement.sTyle = ActiveDocument.Styles("Heading 1")

    .Text = sFind & "^p"
    .Replacement.Text = "^&"
    .Forward = True
    .Wrap = wdFindContinue
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
    .Execute Replace:=wdReplaceAll
    End With
    End Sub
    [/vba]


    1. Using .Text = sFind & "^p" works UNLESS sFind happens to be the last text in the previous para

    Works: NoNoNo^pYes^p formats the Yes in the right style

    Does not work: NoNoNoMaybeMaybeYes^pYes^p incorrectly formats NoNoNoMaybeMaybeYes

    2. Using .Text = "^p" & sFind & "^p" incorrectly formats the previous paragraph in the style


    I'm hoping there are some switches I don't know about. I tried using WildCards, but never hit the right combination I guess

    Paul

  2. #2
    VBAX Master TonyJollans's Avatar
    Joined
    May 2004
    Location
    Norfolk, England
    Posts
    2,291
    Location
    'fraid there's no magic switch for this. There is a wild card for 'start of word', but not 'start of paragraph'. You can search for ^p(text)^p but (a) this will miss the first paragraph in the document and (b) you will need some extra code to process what you find.
    Enjoy,
    Tony

    ---------------------------------------------------------------
    Give a man a fish and he'll eat for a day.
    Teach him how to fish and he'll sit in a boat and drink beer all day.

    I'm (slowly) building my own site: www.WordArticles.com

  3. #3
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location
    Tony --

    Nuts Doggone MS Word

    I ended up doing it the hard way -- looping through the .Paragraphs. Not as fast as I'd like for long docs

    But thanks for confirming what I was beginning to suspect: no way to do it

    Paul

  4. #4
    VBAX Wizard
    Joined
    May 2004
    Posts
    6,713
    Location
    Paul, I can not see how it can be otherwise. You are trying to match the text content of paragraphs.

    "find all that paragraphs that equal all of an input text and apply an input style to the entire paragraph"

    How could you not have to actually use .Paragraphs? It is the defining part of your logic.

  5. #5
    VBAX Wizard
    Joined
    May 2004
    Posts
    6,713
    Location
    Just to be clear...

    "Does not work: NoNoNoMaybeMaybeYes^pYes^p incorrectly formats NoNoNoMaybeMaybeYes"

    According to your logic, no, it does not incorrectly format it. It correctly formats it, precisely the way you have told it to, by your logic. By.....paragraphs.

  6. #6
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location
    Gerry -- your #4

    Yes you are correct that when I loop through .Paragraphs I do indeed match the paragraph text to see if it's what I'm looking for to Style. That's the approach I endedup using.

    The original idea (#1) I had was a Find/Replace of the Main Story, without having to loop using VBA through the .Paragraphs collection. Figured Word's C++ (or what ever it's written in) would do it faster


    Your #5

    Does not work: NoNoNoMaybeMaybeYes^pYes^p incorrectly formats NoNoNoMaybeMaybeYes"

    The requirement was to only Style paragraphs = Yes.

    Example:

    Document

    NoNoNo^p
    NoNoNoMaybeYes^p
    Yes^p

    From my #1

    .Find = Yes^p will apply the style to both line 2 and 3. I only wanted line 1

    or

    .Find = ^pYes^p will apply the style to both line 2 and 3. I only wanted line 1

    Or so I think

    Ideas?

    Paul

  7. #7
    VBAX Wizard
    Joined
    May 2004
    Posts
    6,713
    Location
    "I only wanted line 1"

    Don't you mean lline 3? Your example is:

    NoNoNo^p
    NoNoNoMaybeYes^p
    Yes^p


    You want line 3, yes.....pun intended.

    "That's the approach I endedup using." That is because, logically, it is the only approach that will work. As long as the logical requirement is the paragraph object, then the paragraph object is what you need to use.

    And...you DO need to use the pargraph object, because - logically - the requirement is a test of the paragraph style.
    The original idea (#1) I had was a Find/Replace of the Main Story, without having to loop using VBA through the .Paragraphs collection. Figured Word's C++ (or what ever it's written in) would do it faster
    And indeed it would substantially faster using Find/Replace...except...you can't, BECAUSE - logically - you require the use of the paragraph object. You need to use Paragraphs.

    THIS is your problem: "apply an input style to the entire paragraph"

    Paragraphs.

    So.....what did you end up with? The reason I ask is you mention an issue for long documents. I made a document of 100 pages of your example:

    NoNoNo^p
    NoNoNoMaybeYes^p
    Yes^p

    [vba]
    Option Explicit

    Sub FindWhatever(strIn As String)
    Dim oPara As Paragraph
    For Each oPara In ActiveDocument.Paragraphs
    If oPara.Range.Text = strIn & Chr(13) Then
    oPara.Range.Style = "Yadda"
    End If
    Next
    End Sub

    Sub tryMeRoasted()
    Dim Mystart As Date
    Dim Myend As Date
    Mystart = Format(Now, "hh:mm:ss")
    Call FindWhatever("Yes")
    Myend = Format(Now, "hh:mm:ss")
    MsgBox "Start: " & Mystart & vbCrLf & _
    "End: " & Myend
    End Sub
    [/vba]I made up a style - Yadda - and executed the procedure. The result, for 100 pages of processing:

    Start: 10:31:08 AM
    End: 10:31:09

    In other words...one second. All paragraphs (and ONLY those paragraphs) with the input string - "Yes" - were changed to the Yadda style.

  8. #8
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location
    You want line 3, yes.....pun intended.
    You can be a pretty punny guy sometimes

    Yes, I'm glad you knew what I meant, even if my keyboard didn't

    And yes, I did end up using the approach you said is the only way.

    [vba]
    Function MatchParagraph(sFind As String, Optional sStyle As String = "Heading 1") As Boolean
    Dim s As String
    Dim oPara As Paragraph

    Application.StatusBar = "Marking '" & sFind & "' in " & sStyle

    MatchParagraph = False

    For Each oPara In ActiveDocument.StoryRanges(wdMainTextStory).Paragraphs

    s = Left(oPara.Range.Text, Len(oPara.Range.Text) - 1)

    If UCase(s) = UCase(sFind) Then
    oPara.sTyle = sStyle
    MatchParagraph = True
    End If
    Next
    End Function
    [/vba]

    I always try to avoid my own loop if there's a way to let Word do the work. I know that the Find/Replace feature has a lot of options as to the way you build the .Find text, so I was hoping there was a way I didn't know.

    My concerns about run time were because I use this (plus a lot of other 'cleanup' macros) to re-format text to be compatible with my eBook reader (Kindle).

    A large number of TXT have the chapter in a seperate paragraph.
    By styling it as Heading 1, it will be picked up as a TOC entry by the eBook reader

    1^p
    2^p
    ....
    67^p


    or

    one^p
    ...
    sixty seven^p


    Sometimes a chaper number will be missing, or not set off in a seperate paragraph. I just have to clean those up by hand

    So it's something that will be run N times per doc since there's really no sure way to see how far the numbers go. I use a counter so if it fails to find a match 4 times in a row it exits the For loop

    [vba]
    Call MatchParagraph("prologue")
    Call MatchParagraph("epilogue")

    iCount = 0
    For i = 1 To 999
    If Not MatchParagraph(CStr(i)) Then
    iCount = iCount + 1
    Else
    iCount = 0
    End If
    If iCount = 4 Then Exit For
    Next i

    iCount = 0
    For i = 1 To 999
    If Not MatchParagraph(LongToText(i)) Then
    iCount = iCount + 1
    Else
    iCount = 0
    End If
    If iCount = 4 Then Exit For
    Next i

    [/vba]


    And...you DO need to use the pargraph object, because - logically - the requirement is a test of the paragraph style.
    I was thinking of it as more of a test of the text of a paragraph

    That's why I was thinking Find/Replace


    Paul

  9. #9
    VBAX Wizard
    Joined
    May 2004
    Posts
    6,713
    Location
    Except Find/Replace is not a test of paragraphs.

  10. #10
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location
    Except Find/Replace is not a test of paragraphs.

    Not the .Paragraph object, I agree, but for the text within isn't it?

    [VBA]
    Sub Macro1()

    With ActiveDocument.StoryRanges(wdMainTextStory).Find
    .ClearFormatting
    .Replacement.ClearFormatting
    .Replacement.Style = "Heading 1"
    .Text = "Yes^p"
    .Replacement.Text = "^&"
    .Forward = True
    .Wrap = wdFindContinue
    .Format = True
    .Execute Replace:=wdReplaceAll
    End With
    End Sub
    [/VBA]

    Paul

    PS -- You must be a GREAT teacher, lots of patience

  11. #11
    VBAX Wizard
    Joined
    May 2004
    Posts
    6,713
    Location
    ummmm.........yes........but what does that do for your:

    NoNoNo
    NoNoNoMaybeYes
    Yes


    both 2 and 3 are actioned....yes? TWO instances of Yes^p are there, so TWO are actioned. This has zip, nada to do with the paragraphs (other than the container paragraph -undefined! - changes its style if the text is found).

    Which is precisely your issue/problem.

    Find/Replace operates on Range objects. Never mind that you can have:[vba]
    With Selection.Find
    [/vba]
    The fact of the matter is Word VBA translates that as:[vba]
    With Selection.Range.Find
    [/vba]
    Just like:[vba]
    Selection.Text
    [/vba]really means:
    [vba]Selection.Range.Text[/vba]This is also why you CAN NOT action non-contiguous selected chunks of text using VBA. If you select non-contiguous portions of text and click the Bold icon...all the chunks become Bold. You can not do this via VBA. VBA uses Range.

    So....when you use that code on:

    NoNoNo
    NoNoNoMaybeYes
    Yes

    the first .Found finds

    NoNoNo
    NoNoNoMaybeYes
    Yes

    and applies...a Paragraph style, which of course applies to the - ahem - paragraph, which in turns means the entire paragraph.

    Bottom line?

    "Not the .Paragraph object, I agree, but for the text within isn't it?"

    Technically speaking, no. The .Found of the first "Yes^p" has no idea that the paragraph containing the .Found has other text.

    Which is the issue/problem. Yes, you most certainly CAN do a different route testing on the whole paragraph. Like this:[vba]
    Sub ForPaul()
    Dim r As Range
    Set r = ActiveDocument.Range
    With r.Find
    .ClearFormatting
    .Replacement.ClearFormatting
    Do While .Execute(FindText:="Yes" & vbCr, _
    Forward:=True) = True
    r.Expand Unit:=wdParagraph
    If Len(r.Text) = 4 Then
    r.Style = "Heading 1"
    End If
    r.Collapse 0
    Loop
    End With
    End Sub
    [/vba]

    You can not use StoryRanges(wdMaintextStory) for this.

    On a 119 page document filled with your sample text, the time was 7 seconds.

    Executing it again using my original code took 3 seconds.

    A gold star for stating why the difference (double the time).

  12. #12
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location
    A gold star for stating why the difference (double the time).
    Homework assignment???

    The first thing I can see is the

    [VBA]
    r.Expand Unit:=wdParagraph
    ...
    r.Collapse 0
    [/VBA]

    might be taking up time, but 2X???


    Final answer ...

    The
    [VBA]
    Do While
    [/VBA]

    and testing is not as eficient as just iterating and testing


    Paul

  13. #13
    VBAX Wizard
    Joined
    May 2004
    Posts
    6,713
    Location
    Very good. It is true that using Expand to the Range adds overhead.

    In my earlier code, the whole paragraph IS the object(s) being iterated.

    In the later one it is an added instruction to get the range.

    Nevertheless, it is indeed the Do While that is the main culprit. And you can test for it.

    On a 126 page document of your NoNo...yadda yaddaYes text, I used the following code to first change to Heading 1, then going through the exact same testing and changing it back - in other words, the actions are identical, searching for Yes, and changing the Style.[vba]
    Option Explicit

    Sub Yadda_A()
    Dim r As Range
    Dim oPara As Paragraph
    Dim Jim As Date
    Dim Morrison As Date
    Dim ThisIsTheEnd_MyFriend As Date

    ' first one way
    Set r = ActiveDocument.Range
    Jim = Format(Now, "hh:mm:ss")
    With r.Find
    .ClearFormatting
    .Replacement.ClearFormatting
    Do While .Execute(FindText:="Yes" & vbCr, _
    Forward:=True) = True
    r.Expand Unit:=wdParagraph
    If Len(r.Text) = 4 Then
    r.Style = "Heading 1"
    End If
    r.Collapse 0
    Loop
    End With
    Morrison = Format(Now, "hh:mm:ss")

    ' and then back again
    For Each oPara In ActiveDocument.Paragraphs
    If oPara.Range.Text = "Yes" & Chr(13) Then
    oPara.Range.Style = "Normal"
    End If
    Next
    ThisIsTheEnd_MyFriend = Format(Now, "hh:mm:ss")

    MsgBox "First way using Do While: " & _
    Format(Morrison - Jim, "hh:mm:ss") & _
    vbCrLf & vbCrLf & _
    "Second way using Paragraph object: " & _
    Format(ThisIsTheEnd_MyFriend - Morrison, "hh:mm:ss")
    End Sub
    [/vba]The result?

    First way using Do While: 00:00:05
    Second way using Paragraph object: 00:00:01

    Hopefully this is a lesson to all that, whenever possible, use objects from Collections.

    The first way (Do While) constructs the object to be tested - search for "Yes", adjust/construct the range using Expand...THEN test.

    The second way tests against already existing objects in the Paragraph collection.

  14. #14
    VBAX Wizard
    Joined
    May 2004
    Posts
    6,713
    Location
    And homework?? Nah. You would not like to get what I give out as homework.




    evil grin

  15. #15
    VBAX Wizard
    Joined
    May 2004
    Posts
    6,713
    Location
    [vba]Sub FBlah()
    Dim r As Range
    Dim Jim As Date
    Dim Morrison As Date
    Dim ThisIsTheEnd_MyFriend As Date

    ' first one way
    Set r = ActiveDocument.Range
    Jim = Format(Now, "hh:mm:ss")
    With r.Find
    .ClearFormatting
    .Replacement.ClearFormatting
    Do While .Execute(FindText:="Yes" & vbCr, _
    Forward:=True) = True
    r.Font.Bold = True
    Loop
    End With
    Morrison = Format(Now, "hh:mm:ss")

    ' reset range object and iterate again
    Set r = ActiveDocument.Range
    With r.Find
    .ClearFormatting
    .Replacement.ClearFormatting
    .MatchWildcards = True
    .Text = "Yes"
    .Replacement.Text = "^&"
    .Replacement.Font.Italic = True
    .Execute Replace:=wdReplaceAll
    End With

    ThisIsTheEnd_MyFriend = Format(Now, "hh:mm:ss")

    MsgBox "First way using Do While: " & _
    Format(Morrison - Jim, "hh:mm:ss") & _
    vbCrLf & vbCrLf & _
    "Second way using Find/Replace: " & _
    Format(ThisIsTheEnd_MyFriend - Morrison, "hh:mm:ss")

    End Sub
    [/vba]In both cases the only action is a Font change. Thefirst uses a Do While, the second the native Find/Replace wdReplaceAll.

    Time?

    1. 8 seconds
    2. 4 seconds

  16. #16
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location
    1. Nice teachers don't give out homework on Fridays

    2. I promise to always "Use objects from Collections"

    3. Your #15, if you added a For Each oPara loop as a 3rd case, what would the times be? (like in post #7)

    Paul

  17. #17
    VBAX Wizard
    Joined
    May 2004
    Posts
    6,713
    Location
    Your #3.

    You have not been paying attention! Stay after class.

    WHY did I not include a For Each Para???? I could have. There is a reason...and it is the point I have been trying to make.

  18. #18
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location
    I was interested in how much faster your recommendation to always use collections would be

    After all the initial question was about the fastest way to apply a style to a paragraph when the entire paragraph = an input string, such as "Yes".

    I originally thought that a Find/Replace would get them all at once. You and Tony have shown that it can't be done, and that a .Paragraphs loop is needed. Agreed

    Macro below

    Do While = 8 sec
    Find/repalce = 2 sec
    Collection = 4

    Would it be fair to say that "Under some circumstances, Find/Replace is faster than using the objects in a collection?"

    Paul



    [VBA]
    Sub FBlah2()
    Dim r As Range
    Dim Peter As Date
    Dim Paul As Date
    Dim Mary As Date
    Dim Ringo As Date

    Dim oPara As Paragraph

    '---------------------------------------------
    'make text+CR anywhere BOLD
    Set r = ActiveDocument.Range
    Peter = Now
    With r.Find
    .ClearFormatting
    .Replacement.ClearFormatting
    Do While .Execute(FindText:="Yes" & vbCr, _
    Forward:=True) = True
    r.Font.Bold = True
    Loop
    End With
    Paul = Now

    '---------------------------------------------
    ' makes text+CR italic
    Set r = ActiveDocument.Range
    With r.Find
    .ClearFormatting
    .Replacement.ClearFormatting
    .MatchWildcards = False
    .Text = "Yes" & vbCr
    .Replacement.Text = "^&"
    .Replacement.Font.Italic = True
    .Execute Replace:=wdReplaceAll
    End With

    Mary = Now

    '---------------------------------------------
    ' makes text+CR underlined if entire paragraph
    Set r = ActiveDocument.Range
    For Each oPara In r.Paragraphs
    If Right(oPara.Range.Text, 4) = "Yes" & Chr(13) Then
    oPara.Range.Font.Underline = True
    End If
    Next

    Ringo = Now

    '---------------------------------------------
    MsgBox _
    "1. Do While: " & Format(Paul - Peter, "hh:mm:ss") & _
    vbCrLf & vbCrLf & _
    "2. Find/Replace: " & Format(Mary - Paul, "hh:mm:ss") & _
    vbCrLf & vbCrLf & _
    "3. Collection: " & Format(Ringo - Mary, "hh:mm:ss")

    End Sub
    [/VBA]

  19. #19
    VBAX Wizard
    Joined
    May 2004
    Posts
    6,713
    Location
    Would it be fair to say that "Under some circumstances, Find/Replace is faster than using the objects in a collection?"
    No, it would fair to say that in ALL circumstances Find/Replace is faster.

    Bottom line is this. If you can use Find/Replace, then use it. It is always the fastest.

    If you can not use Find/Replace - most often because of precisely the reason you can not use it here - then using objects from a collection is the fastest.

  20. #20
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location
    Thanks for the lesson -- always learning something here

    Paul

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •