PDA

View Full Version : Extract Red Text from Word Document and Put in New Document



clhare
01-06-2016, 07:10 AM
Hi all!

I found the following code that will extract all red text from one document and put it into another document. Rather than keep adding the red text to a string and only copying it to the other file once all the red text is found, I would prefer that it copy red text to the new document as it finds it and goes back and forth between the two files. Since the file I need to extract from is quite large and has alot of red text, I would like to be able to see that something is happening so I know it is working. Using this code as is, I just saw a blank document on my screen for 15 minutes with some little flashes going on and I ended up stopping it because I was unsure if it got stuck in an endless loop or not.


Sub ExtractText()

Const wdColorRed = 255

Set objWord = CreateObject("Word.Application")
objWord.Visible = True
Set objDoc = objWord.Documents.Open("C:\Test.doc")
Set objSelection = objWord.Selection
Set objDoc2 = objWord.Documents.Add()
Set objSelection2 = objWord.Selection

objSelection.Find.Forward = True
objSelection.Find.Format = True
objSelection.Find.Font.Color = wdColorRed

Do While True
objSelection.Find.Execute
If objSelection.Find.Found Then
strText = strText & objSelection.Text & vbCrLf
Else
Exit Do
End If
Loop

objSelection2.TypeText strText

End Sub


Thanks!

Cheryl

clhare
01-06-2016, 03:58 PM
I also just found out that it needs to work on the active document rather than open a document and the macro will actually be inserted in the document itself. Not sure how to do that with this code.

Cheryl

gmayor
01-07-2016, 12:41 AM
Do you mean that you want to extract the red text to the end of the same document? Presumably you are running this code from Word (although the code you posted suggests otherwise)? In which case
Sub CopyRedText()
Dim oDoc As Document
Dim oRng As Range, oEnd As Range, oOriginal As Range
Set oDoc = ActiveDocument
Set oRng = oDoc.Range
Set oOriginal = oDoc.Range
Set oEnd = oDoc.Range
oOriginal.Collapse 0
With oRng.Find
.Font.ColorIndex = wdRed
Do While .Execute
If oRng.Start >= oOriginal.Start Then GoTo lbl_Exit
oEnd.Collapse 0
oEnd.FormattedText = oRng.FormattedText
oRng.Collapse 0
Loop
End With
oDoc.Activate
lbl_Exit:
Exit Sub
End Sub

clhare
01-26-2016, 07:56 AM
Hi Graham!

Sorry for the delay... I actually need to (1) extract the red text and save it in a "new" document and then (2) remove the red text from the original document. Is that possible?

gmayor
01-28-2016, 12:14 AM
It needs but a minor change to use different document
Option Explicit

Sub CopyRedText()
Dim oTarget As Document
Dim oDoc As Document
Dim oRng As Range, oEnd As Range, oOriginal As Range
Set oDoc = ActiveDocument
Set oTarget = Documents.Add
Set oRng = oDoc.Range
Set oOriginal = oDoc.Range
Set oEnd = oTarget.Range
oOriginal.Collapse 0
With oRng.Find
.Font.ColorIndex = wdRed
Do While .Execute
If oRng.Start >= oOriginal.Start Then GoTo lbl_Exit
oEnd.Collapse 0

'=====================
'If you want the red colour to remain
'oEnd.FormattedText = oRng.FormattedText
'If not use
oEnd.Text = oRng.Text
'=====================

oEnd.Collapse 0
oEnd.InsertParagraphAfter
oEnd.End = oTarget.Range.End
oRng.Text = ""
oRng.Collapse 0
Loop
End With
oTarget.Activate
lbl_Exit:
Exit Sub
End Sub

clhare
01-28-2016, 06:16 AM
Hi Graham,

I ran the macro on my 62-page document and while I did see that the red text was inserted into a new document very quickly, the macro kept going and seemed to be hung up somewhere. I hit Ctrl-Break to stop it and found that once all the red text had been moved from the original document to the new document, the macro kept adding paragraph marks until I stopped it. I stopped the macro after 3 minutes and found that the new document had 11 pages of text and 1,413 pages of empty paragraph marks running down the page. I'm not sure why it's doing that or how to prevent it.

Also can the macro save the new document as "Extracted Text" and resave the original document as "Without Red Text" so the original document will actually remain as is?

Thanks!

Cheryl

gmayor
01-28-2016, 07:06 AM
That will happen if the final paragraph mark is formatted red. The following should account for that and will also save the documents in the folder strPath - here the same folder as the original.
Option Explicit

Sub ExtractRedText()
Dim oTarget As Document
Dim oDoc As Document
Dim oRng As Range, oEnd As Range, oOriginal As Range
Dim strDocName1 As String
Dim strDocName2 As String
Dim strPath As String
Set oDoc = ActiveDocument
oDoc.Save
strPath = oDoc.Path & Chr(92)
If strPath = "" Then
MsgBox "Document not saved!"
GoTo lbl_Exit
End If

strDocName1 = Left(oDoc.name, InStrRev(oDoc.name, Chr(46)) - 1)
strDocName2 = strDocName1 & " - Extracted Text.docx"
strDocName1 = strDocName1 & " - Without Red Text.docx"

Set oTarget = Documents.Add
Set oRng = oDoc.Range
Set oOriginal = oDoc.Range
Set oEnd = oTarget.Range
oOriginal.Collapse 0
With oRng.Find
.Font.ColorIndex = wdRed
Do While .Execute
oEnd.Collapse 0
'=====================
'If you want the red colour to remain
'oEnd.FormattedText = oRng.FormattedText
'If not use
oEnd.Text = oRng.Text
'=====================

oEnd.Collapse 0
oEnd.InsertParagraphAfter
oEnd.End = oTarget.Range.End
oRng.Text = ""
oRng.Collapse 0
If oRng.Start = oOriginal.Start Then
GoTo lbl_Save
End If
Loop
End With
lbl_Exit:
Exit Sub
lbl_Save:
oTarget.SaveAs2 strPath & strDocName2
oDoc.SaveAs2 strPath & strDocName1
oTarget.Activate
GoTo lbl_Exit
End Sub

clhare
01-28-2016, 09:08 AM
I tried the new code and I'm still getting an endless loop that's creating pages of paragraph marks. I checked the document and the last paragraph is set to automatic (black) text.

akuini
01-28-2016, 07:54 PM
Hi, clhare.


Why don’t you take more direct method?
If the document contains only red dan black text then delete all the black text. You can do that with macro but the find & replace from the menu can do that too.
The steps:


Copy the document.
Change the paragraph mark dan space color to red or any other color.
Delete all the black text.


As for deleting red text from original document, well, delete all the red text.

gmayor
01-28-2016, 10:52 PM
I have tested the code in Word 2010, 2013 and 2016 without the issue you describe. Can you post a copy of the document you are processing so that I may determine if the issue relates to the document.

clhare
01-29-2016, 08:52 AM
The problem must be the file itself. I had them send me 2 other files and it worked just fine in those. Just not the first one.