Extracting specific text within a Word document



EdgeMMG
07-16-2005, 10:37 AM
Hello all and thanks in advance for the help.

I am trying to write a macro that will search through a Word document and locate certain pieces of text.

For instance:

I want it to find any HTML IMG tags and just return the ALT info.

I want it to find:
<IMG SRC="thisimg.gif" ALT="This is an Image">

and Replace it with:
This is an Image

Am I making sense?

I look forward to any help you are able to give!

lucas
07-16-2005, 11:14 AM
I'm sure there is a more efficient way to do this, but this might give you a start. It was recorded with Word's macro recorder.
Option Explicit

Sub Find_Replace()
    ' Replaces every occurrence of one fixed IMG tag with its ALT text.
    With Selection.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "<IMG SRC=""thisimg.gif"" ALT=""This is an Image"">"
        .Replacement.Text = "This is an Image"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
        ' A single ReplaceAll pass is enough; the recorder's extra plain Execute was redundant.
        .Execute Replace:=wdReplaceAll
    End With
End Sub

EdgeMMG
07-16-2005, 11:21 AM
Hey Lucas. Thanks for the reply.

The problem is that I will never know what text is in either the SRC or the ALT.

I want the macro to search the document for an <IMG tag and, if one is found, extract the text in the ALT attribute and replace the whole tag with it.

So it needs to be dynamic. On the plus side, the tags will always be well-formed, with " " where they need to be, so maybe something along the lines of:

<IMG SRC="ThisImg.gif" ALT="Some TExt here">
Search for <IMG.

If <IMG is found, select from <IMG through the third ":
<IMG SRC="ThisImg.gif" ALT="

Delete the selection, leaving:
Some TExt here">

From that point, find the next " and select from there through >:
">

Delete that as well, leaving:
Some TExt here

Is that even possible within VBA?

Thanks again and I look forward to more help.

fumei
07-17-2005, 07:09 AM
Yes, this is very possible. I will repost with something in about an hour.

Jay Freedman
07-17-2005, 03:16 PM
You don't really need a macro. You can process the whole document with a single wildcard Replace All. In the Replace dialog, click the More button and check the Use Wildcards option. Then in the Find What box enter

\<IMG SRC*ALT="(*)"\>

and in the Replace With box enter

\1

and click Replace All.

If you want to make a macro of this so it's easy to repeat, use lucas's code but replace the .Text and .Replacement.Text expressions with these two, and set .MatchWildcards = True.
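
For reference, here is a minimal sketch of what that macro might look like. It is lucas's recorded code with the two wildcard strings swapped in. The Sub name is made up for illustration, and searching ActiveDocument.Content instead of the Selection is just a choice made here so the result does not depend on where the cursor is. It assumes SRC always comes before ALT, as in the examples above.

```vba
Option Explicit

' Hypothetical name; replaces each <IMG SRC=... ALT="..."> tag with its ALT text.
Sub ReplaceImgTagsWithAltText()
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        ' \< and \> are escaped because < and > are wildcard operators in Word.
        ' (*) captures the ALT text as group 1.
        .Text = "\<IMG SRC*ALT=""(*)""\>"
        ' \1 inserts whatever group 1 captured.
        .Replacement.Text = "\1"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchWildcards = True
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

As far as I know, wildcard searches in Word are always case-sensitive, so a tag written in lowercase (<img src=...>) would need its own pattern. The pattern also uses straight quotes, so tags whose quotes Word has already converted to curly quotes would likely be missed.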

Jay