View Full Version : Loop through doc, delete specific characters
Zack Barresse
11-17-2006, 10:30 AM
Okay, I'm looking to make my life a little easier, I do not know if it is possible or not.  I have a document I am 'refurbishing' and I am applying styles to.  I can't apply the styles at the moment because there are many additional returns and tabs in the document.  Where bullets and numberings should have been, somebody manually tabbed over and typed "A.    ", etc.  I want to delete all of this stuff and make it 'right'.  I normally wouldn't want to do this, but we will be adding/subtracting from these documents in the future, so I want them easier to work with, user friendly and just plain done right.
I have the document, one of them (there are 5 total right now, more to follow), and can upload it if desired.  I can also upload a small sample of it (the current document I'm on is 90 some odd pages).  I wanted to know if there was a way I could loop through the document and delete certain characters.  Things I want deleted:
 Carrige returns with no data on that line
 Tab characters
 Manual page breaks
 Any bullets or numberings (they're all manually entered)
I know these may not be possible.  I don't know how you would tell of the bullets and numberings.  Maybe check for "A. ", "B. ", "1. " or "2. "??  Something like that (check) should suffice - I would think.
Please let me know if you want a sample file, or a sample of text posted to this thread, or even if I'm fooling myself.  I don't know the limits and capabilities of Word's OM.  :dunno
Zack Barresse
11-17-2006, 11:16 AM
My other recent post has to deal with a specific issue on these documents.  A sample of the documents is on this thread, one before editing and the other after editing.  Here is the link: http://www.vbaexpress.com/forum/showthread.php?t=10182
mdmackillop
11-17-2006, 11:31 AM
Hi Zack,
I've done a lot of this over the years
Have a look at Edit/Replace/More/Special for a list of the special characters.
You typically want to replace multiple para marks so use 
Find ^p^p
Replace ^p
or variation thereof
also
 ^t^t with ^t
^t^p with ^p for tabs with nothing after them.
 
For letter bullets try
^p^$.^t with ^p
 
for number bullets, best start with multiple digits and work down
^p^#^#.^t with ^p
^p^#.^t with ^p
 
and so on
mdmackillop
11-17-2006, 11:33 AM
BTW,
Save it frequently for when it goes wrong!  You don't want to start again.
Zack Barresse
11-17-2006, 11:45 AM
I have no idea what you are saying, but I'll look through the Find/Replace and try it out.  I'll let you know the results here shortly.  :yes
Zack Barresse
11-17-2006, 11:53 AM
Wow!  That took care of 75% of the document!!!  I LOVE that feature!!!  :D
Thanks Malcolm!!
mdmackillop
11-17-2006, 11:53 AM
Try this on your sample.  Not perfect, but a start
 
Sub Macro1()
'
' Macro1 Macro
' Macro recorded 17/11/06 by malcolm
'
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = "^p^p"
        .Replacement.Text = "^p"
        .Forward = True
        .Wrap = wdFindContinue
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    With Selection.Find
        .Text = "^p^$.^t"
        .Replacement.Text = "^p"
        .Forward = True
        .Wrap = wdFindContinue
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    ActiveWindow.ActivePane.VerticalPercentScrolled = 11
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = "^p^t^$.^t"
        .Replacement.Text = "^p"
        .Forward = True
        .Wrap = wdFindContinue
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    With Selection.Find
        .Text = "^p^t^#^#.^t"
        .Replacement.Text = "^p"
        .Forward = True
        .Wrap = wdFindContinue
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    With Selection.Find
        .Text = "^p^t^#.^t"
        .Replacement.Text = "^p"
        .Forward = True
        .Wrap = wdFindContinue
     End With
    Selection.Find.Execute Replace:=wdReplaceAll
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = "^t^t"
        .Replacement.Text = "^t"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
    End With
    
    Selection.Find.Execute Replace:=wdReplaceAll
    Selection.Find.Execute Replace:=wdReplaceAll
    Selection.Find.Execute Replace:=wdReplaceAll
    Selection.Find.Execute Replace:=wdReplaceAll
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = "^p^t^#. "
        .Replacement.Text = "^p"
        .Forward = True
        .Wrap = wdFindContinue
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    With Selection.Find
        .Text = "^p^t^# "
        .Replacement.Text = "^p"
        .Forward = True
        .Wrap = wdFindContinue
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    With Selection.Find
        .Text = "^t^p"
        .Replacement.Text = "^p"
        .Forward = True
        .Wrap = wdFindContinue
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    Selection.Find.Execute Replace:=wdReplaceAll
    With Selection.Find
        .Text = "^p^$.  "
        .Replacement.Text = "^p"
        .Forward = True
        .Wrap = wdFindContinue
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    With Selection.Find
        .Text = "^p "
        .Replacement.Text = "^p"
        .Forward = True
        .Wrap = wdFindContinue
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    With Selection.Find
        .Text = "^p^t"
        .Replacement.Text = "^p"
        .Forward = True
        .Wrap = wdFindContinue
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    With Selection.Find
        .Text = "^p^? ^t"
        .Replacement.Text = "^p"
        .Forward = True
        .Wrap = wdFindContinue
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
End Sub
fumei
11-17-2006, 11:54 AM
Zack, post a sample file.  Here is a start though.
Carrige returns with no data on that line:
Manual page breaks.
Sub StartCleanUp()
Dim oPara As Word.Paragraph
' remove empty paragraphs
For Each oPara In ActiveDocument.Paragraphs
    Select Case Len(oPara.Range.Text)
        Case 1  ' just the paragraph mark
            oPara.Range.Delete
        Case 2  ' if empty paragraph preceded by page break
            If Left(oPara.Range.Text, 1) = Chr(12) Then
                 oPara.Range.Delete
                End If
    End Select
Next
' remove manual page breaks
Selection.HomeKey Unit:=wdStory
    With Selection.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "^m"
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Execute Replace:=wdReplaceAll
    End With
End Sub
mdmackillop
11-17-2006, 11:59 AM
Gerry,
Zack has a sample in his (only?) other Word thread.
Zack Barresse
11-17-2006, 12:06 PM
Yes, sorry about the confusion here.  I thought they (my two posts) were separate issues and posted them separately.  In fact, I still have the other problem.  Getting a solution to that would speed up this 'document conversion' very quickly.  I've already trimmed 3 days of work with what has been covered in this thread.
Gerry: the link in post #2 is to the other thread where the file URL is located.
Thank you very much both of you!  I'm learning a lot here, and seeing how useful some of the rudimentary - and native - features of Word can be.  :yes
mdmackillop
11-17-2006, 12:22 PM
Hey Zack,
I love the minimalist approach to website design.
mdmackillop
11-17-2006, 03:55 PM
Hi Zack,
I thought I'd have a go at tidying this code up and make it a bit more "user friendly".  Due to text changes, it seems best to run it twice.
 
Option Explicit
Sub CleanUp()
    Dim Repeats As Long
    Dim FaR
    Dim i As Long
    ReDim FaR(2)
    
    'Using wildcards
    'Set repeated items value
    Repeats = 10
    'Multiple spaces
    FaR(0) = " "
    'Multiple tabs
    FaR(1) = "^t"
    For i = 0 To 1
        With selection.Find
            .Text = FaR(i) & "{2," & Repeats & "}"
            .Replacement.Text = FaR(i)
            .Forward = True
            .Wrap = wdFindContinue
            .MatchWildcards = True
        End With
        selection.Find.Execute Replace:=wdReplaceAll
    Next
    
    'No wildcards
    ReDim FaR(12, 1)
    'Multiple Para
    FaR(1, 0) = "^p^p"
    FaR(1, 1) = "^p"
    'Tab Para
    FaR(2, 0) = "^t^p"
    FaR(2, 1) = "^p"
    'Start  of line space
    FaR(3, 0) = "^p "
    FaR(3, 1) = "^p"
    'End  of line space
    FaR(4, 0) = " ^p "
    FaR(4, 1) = "^p"
    'Numbering with period
    FaR(5, 0) = "^p^?^?.^t"
    FaR(5, 1) = "^p"
    'Numbering with period
    FaR(6, 0) = "^p^?.^t"
    FaR(6, 1) = "^p"
    'Start of line tab
    FaR(7, 0) = "^p^t"
    FaR(7, 1) = "^p"
    'Start of line number
    FaR(8, 0) = "^p^# "
    FaR(8, 1) = "^p"
    'Start of line number with period
    FaR(9, 0) = "^p^#. "
    FaR(9, 1) = "^p"
    'Start of line letter
    FaR(10, 0) = "^p^$ "
    FaR(10, 1) = "^p"
    'Start of line letter with period
    FaR(11, 0) = "^p^$. "
    FaR(11, 1) = "^p"
    'Tab Para
    FaR(12, 0) = "^t^p"
    FaR(12, 1) = "^p"
    For i = 1 To 12
        selection.HomeKey Unit:=wdStory
        With selection.Find
            .Text = FaR(i, 0)
            .Replacement.Text = FaR(i, 1)
            .Forward = True
            .Wrap = wdFindContinue
            .MatchWildcards = False
        End With
        Do Until selection.Find.Execute = False
            selection.Find.Execute Replace:=wdReplaceAll
        Loop
    Next
End Sub
Powered by vBulletin® Version 4.2.5 Copyright © 2025 vBulletin Solutions Inc. All rights reserved.