PDA

View Full Version : Big ugly macro overflows



JulieC
02-23-2010, 08:01 AM
I was put in the unfortunate position of having to write a big ugly VBA macro for Word 2003 to fix badly formatted links (cursing about third party publishing tool X omitted). It worked ok until the document generated by X grew from about 350 pages to about 550. Trying to make this ugly macro shorter, here is my custom data structure and the loop that overflows around paragraph 27,500...



Private Type LinkStruct
m_IsHyperlink As Boolean ' False indicates Bookmark
m_Address As String ' The bookmark name
m_Text As String ' The text displayed
m_Start As Double ' The starting location in the document
End Type

With ActiveDocument
' blah omitted
numParagraphs = 0
Set p = .Paragraphs.Last
strCopy = vbNullString
Do
numParagraphs = numParagraphs + 1

strCopy = Left$(p.style, 3)
If (0 <> StrComp("TOC", strCopy)) Then
For i = 1 To linkCount
If linkArray(i).m_Start = p.Range.start Then
linkArray(i).m_Text = p.Range.Text
Exit For
End If
Next i
End If
strCopy = vbNullString

Set q = p
Set p = q.Previous
Set q = Nothing
Loop While Not p Is Nothing
Set p = Nothing

' blah omitted
End With


Currently running a test (takes hours) replacing the .m_Text assignment with


p.Range.Select
linkArray(i).m_Text = Selection.Text


The error is a stack overflow. I can get around it (with missing info of course) by executing in the immediate window:



Set p = Nothing
Set p = .Paragraphs.First


I am not a VBA expert by far, that .m_Text assignment seems to make something hang around the stack (not copying the text I'm guessing), but I can't figure out how to get around this.

Direct help, info about increasing or monitoring stack size, sympathy all appreciated!

fumei
02-23-2010, 11:11 AM
Can you please post full code?

In particular, I would like to see what - exactly - your variables are declared as, and where; plus where theyare being used. For example, you have a:
Type LinkStruct
but there is nothing in your posted code that uses this.

Other comments.

It depends on what exactly you are doing (which I do not know), but there may be an issue with your counting variable. Is it possible it would be better to do a For Each... for the paragraphs, rather than a count.

I am quite curious as to what "p" is declared as.

Set p = .Paragraphs.Last
Again, using Set indicates an object. So, it appears that p is set to the last paragraph, as a paragraph object...yes? Why?

JulieC
02-23-2010, 11:29 AM
I'm not at liberty to post the whole thing, the copyright is held by my employer.

Dim linkArray() As LinkStruct
Dim linkCount As Integer
Dim p As Paragraph
Dim q As Paragraph

Now that I look at it, I need to change linkCount from Integer to Long, that might be the overflow. Thanks for pointing that out.

The loop used to be of the "Set p = .Paragraph.Item(i)" from 1 to count, but that is seriously inefficient at high paragraph counts. Word counts from "1" for every one. I never tried that version with this high paragraph/page count.

I'll try the tweak when my current attempt dies, thanks.

fumei
02-23-2010, 12:18 PM
VBA can not be copywrited. But let's let that pass.

"Now that I look at it, I need to change linkCount from Integer to Long, that might be the overflow."

Indeed it could be. Except that VBA no longer uses Integer. ALL declared Integers are internally converted to Long when VBA parses the code. Still making it an explicit Long may help.

If you can not post code (OK), could you please tell us what you are trying to do, exactly. And again, perhaps explain why you are doing a counting method, rather than a For Each method.

JulieC
02-23-2010, 01:57 PM
Is the statement about integers being converted to long true for Word 2003 (version stated in first e-mail)? Either way, I'm running that experiment now.

What I am trying to do, and it works when the document doesn't exceed the stack size, is correct every single link throughout the document.

The only way X (publishing program) outputs links is essentially:
My topic (see page 12-34)

Where the hyperlink is on the page number rather than the blue text (of style "Hotspot"). What our end users and 99% of the rest of the world expects is:
My topic (see page 12-34)

Where the hyperlink is on the blue underlined text (of style "Hyperlink"). The purpose of the macro is to build a map of all the bookmarks and hyperlinks, and the associated text. Then it loops through all the "Hotspot" text, matches, and adds the correct hyperlink.

There was no compelling reason to remove the hyperlink from the page number text, so I didn't complicate matters by trying to do that too.

As I mentioned, it works. It is ugly, but it works ... unless the document exceeds some limit that I am trying to figure out.

The output from the publishing program is Word 2003, per the Word 2003 output template provided to it. It executes a macro called "AfterPublish", which is in the output template, from which my Sub, called FixLinks() is executed.

JulieC
02-23-2010, 02:09 PM
The other question, why am I not using a for each loop, I already answered:

Word indexing is painfully slow for large documents; this macro was taking DAYS to run at the 350 page level. It is much more efficient to start at the last element of .Paragraphs and use the .Previous property to navigate.

fumei
02-24-2010, 10:51 AM
The other question, why am I not using a for each loop, I already answered:

Word indexing is painfully slow for large documents; this macro was taking DAYS to run at the 350 page level. It is much more efficient to start at the last element of .Paragraphs and use the .Previous property to navigate.No, sorry, but that is not an answer. What does the slow index have to do with using a counter versus a For Each?


In fact if the indexing is the slow issue, using a For Each rather than a counter is MORE efficient.

You have not answered in any serious way what exactly you are doing, so I will leave this thread with the following.
Option Explicit

Sub FixThings()
Dim r As Range
Dim HyperAddress As String
Dim NewHyperDisplayText As String
Dim InitialEnd As Long
Dim lngEnd As Long

Set r = ActiveDocument.Range
With r.Find
.ClearFormatting
.Style = "HotSpot"
Do While .Execute(FindText:="", Forward:=True) _
= True
If r.Fields.Count = 1 Then
InitialEnd = r.End
HyperAddress = r.Hyperlinks(1).SubAddress
r.Hyperlinks(1).Delete
lngEnd = r.Words(2).End
r.Collapse 1
r.End = lngEnd
ActiveDocument.Hyperlinks.Add Anchor:=r, Address:="", _
SubAddress:=HyperAddress, ScreenTip:="", _
TextToDisplay:=r.Text
r.End = InitialEnd
r.Collapse 0
End If
r.Collapse 0
If r.End = ActiveDocument.Range.End Then
Exit Do
End If
Loop
End With
End Sub
Demo attached. Click "Fix Things" on the top toolbar. You have not explained the use of your style "HotSpot", but I used one.

I suggest you first click the hyperlink, so you can see where they go.

"12-13" jumps to the bookmark above it (Here for page 12-13)

"BLAH" jumps to the bookmark below it (The other bookmark BLAH)

Now run "Fix Things"

You will notice that the hyperlink which still point to the same location is removed from page 12-13, and is now at "My topic".

If I understand correctly, this is what you are asking for.

What I do not know, is if these paragraphs that have the hyperlink you wish to adjust all use your "HotSpot" style. If they do, then the code will work. Further, I do not know if the text to take over the hyperlink will always be the first two words ("My topic")

In any case, the code does not use, nor need any counter. It does not use a For Each either, as it searches ONLY paragraphs that are using the "HotSpot" style.

If your situation is different, then you can explain better.

Good luck.

JulieC
02-24-2010, 12:18 PM
The sample kind of gets the point, so to further clarify. In my document and X:

"HotSpot" is a character style
"Hyperlink" is a character style

There are several thousand occurrances of
HotspotCharStyle (on page 12-661)

in several dozen various paragraph styles. The hyperlink is on the page number, which is in the default character style for the paragraph.

I got the find to locate the right things with a couple modifications to "Fix Things":


'...
With r.Find
.ClearFormatting
.Font.Italic = True
.Font.Color = wdColorBlue
' was .style = "HotSpot"
'...


I am working on the navigation after the find, but not quite getting it.

fumei
02-24-2010, 01:27 PM
OK, so "HotSpot" is a Character Style. That would have been nice to know. So how about this?

Are ALL of the incidents you wish to correct going to have it as "page xx-yy"?

If so, then...search for that.

My basic point is that you should be looking for what you want to action. Not making a range of every single paragraph. I have no idea whatsoever why you are doing things like:
strCopy = Left$(p.style, 3)
If (0 <> StrComp("TOC", strCopy)) Then
or:
For i = 1 To linkCount
If linkArray(i).m_Start = p.Range.start Then
linkArray(i).m_Text = p.Range.Text
Exit For
End If
Next i
I fail to see any reason at all for building some array. Perhaps there is one, but you have not remotely explained why there would be such a need.

I hope the posted alternative (possibly) helps.

JulieC
02-24-2010, 02:07 PM
Most of what you are asking is covered by "I am not a VBA expert". ;) I have been programming for almost 30 years in many languages, but VBA is a strange beast. I have now learned to appreciate "Find".

This new macro, based on yours, is mostly working. The hyperlinks are working, the pagerefs aren't so much.


Sub FixThis()
Dim r As Range
Dim address As String
Dim subAddress As String
Dim initialEnd As Long
Dim lngEnd As Long
Set r = ActiveDocument.Range
With r.Find
.ClearFormatting
.Font.Italic = True
.Font.Color = wdColorBlue
Do While .Execute(FindText:="", Forward:=True) = True
r.Select ' Set selection to current range
initialEnd = r.End
r.MoveEndUntil Cset:=")"
If r.Fields.Count > 0 Then
If r.Fields(1).Type = wdFieldHyperlink Then
address = r.Hyperlinks(1).address
subAddress = r.Hyperlinks(1).subAddress
r.Hyperlinks(1).Delete
lngEnd = r.Words(2).End
r.Collapse 1
r.End = lngEnd
ActiveDocument.Hyperlinks.Add _
Anchor:=Selection.Range, _
address:=address, _
subAddress:=subAddress, _
TextToDisplay:=Selection.Text
ElseIf r.Fields(1).Type = wdFieldPageRef Then
address = r.Fields(1).Code.Text

r.Fields(1).Delete
Selection.Fields.Add _
Range:=Selection.Range, _
Type:=wdFieldPageRef, Text:=address

End If
End If
r.End = initialEnd
r.Collapse 0
If r.End = ActiveDocument.Range.End Then
Exit Do
End If
Loop
End With 'r.Find
End Sub


I'm getting field codes that look like { PAGEREF PAGEREF PAGEREF O_2241 \h \* MERGEFORMAT \* MERGEFORMAT } and it is removing the page number text as well as the field code. I'll look at it some more, but it should be clear what I'm trying to do now that I'm doing it in not such a weird way I hope.

Paul_Hossler
03-06-2010, 06:11 PM
Still seems awfully brute force, and if run time is an issue .....

I think it'd be helpful if you would post a small doc with samples (real samples) of the current field/link technique and what it is you'd like to acheive

You can keep the 'copyrighted' stuff out of it, and we really don't need real paragraphs -- Lorem ipsum will do nicely

Word can be a complex and powerful program, but there's a lot of behind the scenes stuff (formatting, codes, etc.) that will only show in a real document

Paul

JulieC
03-08-2010, 07:21 AM
Yes, the original was very brute force, but the code above is working, except for the pagerefs that I changed to hyperlinks. Consider it solved! Thanks for your help!