Consulting

Results 1 to 11 of 11

Thread: Turning ALL CAPS text into lower case/sentence case text with character style applied

  1. #1
    VBAX Regular
    Joined
    Feb 2020
    Location
    Auckland, NZ
    Posts
    29
    Location

    Turning ALL CAPS text into lower case/sentence case text with character style applied

    We produce etext files for blind/low vision folk, and the convention is to replace all caps words with lower case and a bold character style. Some words have to be in title case where they start a sentence (or are names). This can be tedious to do manually for young adult novels that are full of "shouting" text.

    I have a macro which performs a series of Find and Replaces to change upper case text into lower case, with a bolding character style applied to it, a style called "Emphasis".

    The last action of the macro is to find sentences whose first word now starts with a lower case letter, and turn it into an upper case letter.

    I've got some code to do this:

            For i = 1 To docFind.Sentences.count
                If docFind.Sentences(i).Characters.First.Bold = True Then
                    docFind.Sentences(i).Characters(1).Case = wdUpperCase
                    docFind.Sentences(i).Characters(1).HighlightColorIndex = wdPink
                End If
            Next i
    Two issues with this way of doing things:

    1. For a long document, it's extremely slow. The full code loops through all sentences, more than once.
    2. It's even slower when additional lines of code deal with the cases where the sentence starts with a quotation mark, and the second character is lower case.

    So the question is:

    Rather than loop through potentially thousands of sentences, how to write code to restrict the search to sentences whose first word is in Emphasis style (or in bold), and whose first letter is lower case?

  2. #2
    Without access to a sample document it is difficult to make suggestions on how to address the issue, but on the face of it if you have a document full of all caps words then why not simply convert the document to sentence case? It will not address proper names, but for the rest it should be virtually instantaneous, leaving fewer replacements to worry about. You can do that from a list of keywords - see https://www.gmayor.com/document_batch_processes.htm

    Dim oRng As Range    Set oRng = ActiveDocument.Range
        oRng.Case = wdTitleSentence
    Graham Mayor - MS MVP (Word) 2002-2019
    Visit my web site for more programming tips and ready made processes
    http://www.gmayor.com

  3. #3
    VBAX Regular
    Joined
    Feb 2020
    Location
    Auckland, NZ
    Posts
    29
    Location
    Here are some examples, to make the situation clearer. This sort of thing is littered throughout a novel-length document. The macro aims to transform the caps text into lower case bold using a named style, and restores the start of sentences to title case.

    "THAT'S IT!" he cried.
    "Oh, NO!" screamed Fluffy.
    Um ... boys and girls ... BOYS AND GIRLS!
    "Children, PLEASE STOP LAUGHING!"
    STOP IT RIGHT NOW!
    "WOW!" said Connor. "Science is COOL!"
    "I said NO. A BIG FAT NO!"
    "GGGGLLUUZZZZZZZRRRRT!"
    "Tra-la-LAAAAA!" sang the hero.
    "DON'T!" Fred shouted.

    My macro handles a list like this well enough, but not a full length document.

  4. #4
    Knowledge Base Approver VBAX Guru macropod's Avatar
    Joined
    Jul 2008
    Posts
    4,435
    Location
    For example:
    Sub Demo()Application.ScreenUpdating = False
    With ActiveDocument.Range
      With .Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "<[A-Z][A-Z' \!\?,.:;^t^l^13]{2,}"
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindStop
        .Format = False
        .MatchCase = False
        .MatchWildcards = True
      End With
      Do While .Find.Execute
        Do While .Characters.Last Like "[\!\?,.:;" & vbTab & vbCr & Chr(11) & "]"
          .End = .End - 1
        Loop
        .Case = wdTitleWord
        .Style = wdStyleStrong
        .Collapse wdCollapseEnd
      Loop
    End With
    Application.ScreenUpdating = True
    End Sub
    Word's 'Strong' character Style applies bold formatting. Rather than turning everything into lower-case, which would make sentence structures harder to follow, or to sentence case, which would lose the capitalisation for names, etc., I've chosen a compromise by applying Word's 'Title' case, which capitalises the first letter of each word. You can choose a different format if you prefer.
    Cheers
    Paul Edstein
    [Fmr MS MVP - Word]

  5. #5
    VBAX Regular
    Joined
    Feb 2020
    Location
    Auckland, NZ
    Posts
    29
    Location
    That's a great advance on what I had. I changed the case to wdTitleSentence, and then added a second Find and Replace to fix the lower case sentences. (The highlights are temporary measures to show me cause and effect.) That gave me 99% of what I wanted, at great speed.

    The only case that was wrongly captured that I've so far noticed is when the last word in a sentence is all caps. The capital letter of the following sentence was included.

    Removing the full stop from the Find text fixed that, but introduced an exclusion: a two letter word like "NO": "I said NO. ETC ..." This was left untouched.

    Sub FormatCapsStringToEmphasis_new()
    
    Application.ScreenUpdating = False
        With ActiveDocument.Range
          With .Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Text = "<[A-Z][A-Z' \!\?:;^t^l^13]{2,}" ' I removed the full stop and comma that were here
            .Replacement.Text = ""
            .Forward = True
            .Wrap = wdFindStop
            .Format = False
            .MatchCase = False
            .MatchWildcards = True
          End With
          Do While .Find.Execute
            Do While .Characters.Last Like "[\!\?:;.," & vbTab & vbCr & Chr(11) & "]"
              .End = .End - 1
            Loop
            '.Case = wdTitleWord
            .Case = wdTitleSentence
            .Style = ActiveDocument.Styles("Emphasis")
            .HighlightColorIndex = wdPink
            .Collapse wdCollapseEnd
          Loop
        End With
    
        Selection.HomeKey Unit:=wdStory
    
        With Selection.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Text = "(^13^34[a-z]{1,})" '
            .Replacement.Text = "^&"
            .Replacement.Highlight = True
            .Forward = True
            .Wrap = wdFindContinue
            .Format = True
            .MatchCase = False
            .MatchWholeWord = False
            .MatchWildcards = True
            .MatchSoundsLike = False
            .MatchAllWordForms = False
        Selection.Find.Execute Replace:=wdReplaceAll
                
            Do Until .Execute = False
                With Selection
                   .Start = .Start + 2
                   .Range.Case = wdTitleWord
                   .Range.HighlightColorIndex = wdYellow
                   .Range.Style = ActiveDocument.Styles("Emphasis")
                   .Collapse wdCollapseEnd
                End With
             Loop
        End With
        Selection.HomeKey Unit:=wdStory
    Application.ScreenUpdating = True
    End Sub
    Last edited by Walentyn; 02-03-2021 at 04:44 PM.

  6. #6
    Knowledge Base Approver VBAX Guru macropod's Avatar
    Joined
    Jul 2008
    Posts
    4,435
    Location
    Try:
    Sub FormatCapsStringToEmphasis()
    Application.ScreenUpdating = False
    Dim Rng As Range, i As Long
    With ActiveDocument
      Set Rng = .Range(0, 0)
      Set Rng = Rng.GoTo(What:=wdGoToPage, Name:=4)
      Rng.End = .Range.End
      With Rng
        With .Find
          .ClearFormatting
          .Replacement.ClearFormatting
          .Text = "<[A-Z][A-Z '\!\?.,:;]{2,}"
          .Replacement.Text = ""
          .Forward = True
          .Wrap = wdFindStop
          .Format = False
          .MatchCase = False
          .MatchWildcards = True
        End With
        Do While .Find.Execute
          If .Paragraphs.Count > 1 Then .End = .Paragraphs.First.Range.End
          If .End = .Paragraphs.First.Range.End Then .End = .End - 1
          .MoveEndWhile "[ \!\?.,:;]", wdBackward
          If .Characters.Last.Next Like "[a-z]" Then .End = .End - 1
          .MoveEndWhile "[ \!\?.,:;]", wdBackward: i = 0
          Select Case Split(.Text, " ")(UBound(Split(.Text, " ")))
            Case "I", "A", "OK", "O.K", "ADD", "A.D.D", "ADHD", "A.D.H.D", "UFO", "U.F.O", "US", "U.S", "USA", "U.S.A"
            i = Len(Split(.Text, " ")(UBound(Split(.Text, " ")))) + 1
            .End = .End - i
          End Select
          If Len(.Text) > 1 Then
            .Style = wdStyleEmphasis
            .HighlightColorIndex = wdPink
            .Case = wdTitleSentence
            If .Characters.First.Previous Like "[""" & vbCr & Chr(11) & "]" Then .Words.First.Case = wdTitleWord
            Select Case LCase(Split(.Text, " ")(UBound(Split(.Text, " "))))
              Case "alabama", "ohio"
              .Words.Last.Case = wdTitleWord
            End Select
          End If
          .End = .End + i
          .Collapse wdCollapseEnd
        Loop
      End With
    End With
    Application.ScreenUpdating = True
    End Sub
    Last edited by macropod; 02-04-2021 at 09:38 PM.
    Cheers
    Paul Edstein
    [Fmr MS MVP - Word]

  7. #7
    VBAX Regular
    Joined
    Feb 2020
    Location
    Auckland, NZ
    Posts
    29
    Location
    That gives another unwanted consequence, which I can resolve by removing the square brackets around [^13""""] but this leaves untouched the first of two issues mentioned above, and loses the restoration of sentence case. But don't want to take up any more of your time. The most comprehensive version is still the two part sub posted above.

  8. #8
    Knowledge Base Approver VBAX Guru macropod's Avatar
    Joined
    Jul 2008
    Posts
    4,435
    Location
    Perhaps you could attach a document to a post demonstrating the content you're having issues with.

    Regardless, the code you added in post #5 is markedly less efficient than the code I've posted.
    Cheers
    Paul Edstein
    [Fmr MS MVP - Word]

  9. #9
    VBAX Regular
    Joined
    Feb 2020
    Location
    Auckland, NZ
    Posts
    29
    Location

    Demo file attached

    I'm attaching a demo file. One page 3 of the file, labelled "Page 1", heading called "Dummy text:" I've added a bunch of fake text as examples. Others are scattered throughout.

    What we call Prelim pages, comprising Publishing Info, Cover Information, Contents etc, will be excluded from the range of this macro.
    Attached Files Attached Files

  10. #10
    Knowledge Base Approver VBAX Guru macropod's Avatar
    Joined
    Jul 2008
    Posts
    4,435
    Location
    See updated code in post #6. You'll note it now:
    1. starts searching at page 4
    2. has a series of upper-case strings that, if they're the last (or only) word in the matched string, don't get processed (you can add more)
    3. has a series of lower-case strings that, if they're the last (or only) word in the matched string, get converted to title case (you can add more)

    Obviously, this isn't going to cater for every possibility, but it'll get you a whole lot closer to your goal than you have been. After running the macro in post #6, you may want to run another that Find/Replace that re-capitalises names, etc. in the document.
    Cheers
    Paul Edstein
    [Fmr MS MVP - Word]

  11. #11
    VBAX Regular
    Joined
    Feb 2020
    Location
    Auckland, NZ
    Posts
    29
    Location
    Gosh, what can I say ... genius! A quite different beast than the previous version(s), and very fast. Much to learn from it too. Thanks so much for taking the time to figure this out! It's a great help.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •