Oh, and why do you need to select the range? Is that a left over from testing? No reason to have that, except while trying to understand working with ranges (very limited exceptions to this rule. Very)
Printable View
Oh, and why do you need to select the range? Is that a left over from testing? No reason to have that, except while trying to understand working with ranges (very limited exceptions to this rule. Very)
Good catch!
DoEvents was a leftover from a previous debugging attempt, but you're correct, I can remove it now, since the loop is working.
As for my wrong declaration, I truly didn't know that! Thanks for the teaching. I'll have to go through all my code to fix every other wrong declaration... :-(
As for selecting the range, well, I tried (really hard) without it, as everyone said it was not the right way to do it indeed. That part of code does work without this line indeed, but I have to say that after this part of the code, I am doing other actions on the text, and I need to play directly with the text within the document, and that's the only way I found it to work. But at least, it's only 1 extra line, while all my code has been reduced by at least 70% using the posted method (compared to the "manual" action I was using previously, which was a nightmare and awfully slow).
I would hazard a guess that you just don't quite understand working with the range object yet... 99% of the time, that's the reason why .Select is "needed" -- but the main reason to get rid of .Select is processing speed... (If word doesn't have to actually select an area of the document to work on, it doesn't have to refresh the computer screen, and verify the new selection has all the right buttons like bold/underline pressed or not pressed, etc etc).
So if you're happy with your processing speed, leave the next step to understanding range objects to another day.
The better improvement to your code, at this point, would be to encapsulate functionality.
Work on a routine that accepts a range argument (which you would pass from all your finding stuff), then build that routine to *only* work on whatever text manipulation you're going to do.
Basically, try to separate the actions of:
1. Looping through headers/footers in a document and getting each unique range
2. Finding a particular range within the range from #1 (which will really be the space between two other ranges)
3. Manipulation of the text within each range from #2
the benefit to this structure (as opposed to one big single routine which does all of the above) is that you can test the individual components without having to run the whole routine.
You can just use the immediate window to type
mytestsub Selection.Range
where mytestsub accepts one range parameter.
this may be too much for now... But down the road, keep the thought about encapsulating functionality and at some point, it'll click. Good luck!
I know you are right, but I have already come a long way from my original code (which you DO NOT want to see, you'd be really ***ed off) to the one I have now. I still need to get my mind on all this range stuff, which I certainly do not master well now, obviously. But you sound encouraging, and I like to think that way that working hard to have a clean code that is 100% optimised can gain a lot of speed in the end!
I will get there at some point, in which case I will probably need more help from you all!
If you want to give me more insight, you are more than welcome to. Here is the whole loop as it is right now. Just a few notes about it:
- I may have forgotten to include some declarations here, as I am copying/pasting them from different places, but every variable is declared in my code
- I copied only part of the code that is processing the text. I have a few other things not related to it here and there in the code
- you may see some French in the comments or variable names, as French is my first language ;-) I tried to translate quickly everything, but I may have forgotten a few things here and there
Code:Dim oSection As Section
Dim HdFt As HeaderFooter
Dim r as range
Dim HdFt as range
Dim Tag1 as string
Dim Tag2 as string
With ActiveDocument
For Each oSection In .Sections
For Each HdFt In oSection.Headers
With HdFt
If .LinkToPrevious = False Or oSection.Index = 1 Then
HdFt.Range.Select
TempPrev = 1
'Find position of 1st Tag1
Set r = HdFt.Range
Set fnd = HdFt.Range
With fnd.Find
.Execute (Tag1)
If .Found = True Then
TempStart = fnd.End
End If
End With
While TempStart >= TempPrev + Len(Tag1)
'Find position of next tag1
Set fnd = HdFt.Range
fnd.Start = TempStart
With fnd.Find
.Execute (Tag1)
If .Found = True Then
PosTag1 = fnd.Start
End If
End With
If Tag1 <> Tag2 Then
'Find position of Tag2
Set fnd = HdFt.Range
fnd.Start = TempStart
With fnd.Find
.Execute (Tag2)
If .Found = True Then
PosTag2 = fnd.Start
End If
End With
Else 'if tag1 = tag2
PosTag2 = PosTag1
End If
'THIS IS WHERE THE .SELECT IS REQUIRED! From here, you may find ways to improve my code
If PosTag2 <= PosTag1 And PosTag2 > TempStart Then 'correct order, replacing tags
'Highlighting text between tags
Pos = Application.Selection.MoveLeft(, 1, wdMove) 'place cursor at the beginning of the header
Pos = Application.Selection.MoveRight(, TempStart - 1, wdMove) 'move cursor after tag1
Pos = Application.Selection.MoveRight(, PosTag2 - TempStart, wdExtend) 'select text between both tags
Selection.Range.HighlightColorIndex = Couleur1_Selectionnee 'highlighting text with chosen color
Application.ScreenRefresh
'Deleting tags
Set r = HdFt.Range
Set fnd = HdFt.Range
'Replace Tag1
fnd.Start = TempStart - Len(Tag1)
With fnd.Find
.Forward = True
.Wrap = wdFindStop
.Text = Tag1
.Replacement.Text = ""
.Execute Replace:=wdReplaceOne, Forward:=True, Wrap:=wdFindContinue
End With
'Replace Tag2
fnd.Start = PosTag2 - Len(Tag1)
With fnd.Find
.Forward = True
.Wrap = wdFindStop
.Text = Tag2
.Replacement.Text = ""
.Execute Replace:=wdReplaceOne, Forward:=True, Wrap:=wdFindContinue
End With
TempStart = PosTag2 - Len(Tag1)
TempPrev = TempStart
HdFt.Range.Select
Else
'wrong order, skip case
TempStart = PosTag2
TempPrev = TempStart
End If
'Find next tag1
Set r = HdFt.Range
Set fnd = HdFt.Range
fnd.Start = TempStart
With fnd.Find
.Execute (Tag1)
If .Found = True Then
TempStart = fnd.End
End If
End With
Wend
End If
End With
Next HdFt
Next oSection
End With
End If
Code never irritates me. Everyone learns at a different pace. The reason I am on this forum (and I would imagine many of the other contributors) is to help teach and also learn ourselves. I'm always learning new things, and I've been working and programming in Word for almost 20 years.
It looks to me like the reason you're using Selection is because you haven't explored the analogous .MoveStart and .MoveEnd methods available off the range object (you're using .MoveLeft and .MoveRight methods off the selection object). With minor caveats (.MoveDown and .MoveUp aren't available off a range object, whereas they are available off the Selection object), you can achieve the same thing you're doing (which is basically translating .MoveLeft with .MoveStart and .MoveRight with .MoveEnd
That said, when you're working with ranges, you have to understand that you're working with an actual number... so while .MoveLeft wdcharacter, 1 will collapse the selection (if it's not an insertion point) and move it one character to the left... you would need to use a negative number with the range, so .MoveStart wdcharacter, -1 would do the same thing. But if your range wasn't an insertion point, you might need to use the .Collapse method on the range as well.
You should play around with this stuff... you'll get the hang of it if you stick with it. And you'll end up with much faster processing (especially since you said this typically operates on long documents).
Couple other notes...
1. Try to avoid giant (and useless) With... End With blocks. The entire code snippet contained above is within a With ActiveDocument block. You only use it once, at the top of your For Each loop... just do a For Each oSection In ActiveDocument.Sections -- and you can remove one level of indenting. Same with the HdFt with chunk... you rarely use it, and it's not helpful to your coding... so just get rid of that With statement as well -- and you've removed another indented chunk for no reason.
2. You can save your .Found logic and simply use If .Execute (Tag1) = True Then... The .Execute function returns True if it's found.
3. Prefixes prefixes prefixes. Use appropriate prefixes for your variables... it's much easier to debug code that way, and easier for others to read. Tag1 should be sTag1 (since it's a string) or strTag1 (those are the two common prefixes for a string type). Range should be "rng" or "r" or "rg" .... TempStart and TempEnd should be lStart/lngStart or lEnd or lngEnd. The actual prefix doesn't matter, but consistency does...
There are lots of little rid-bits... but take them in small chunks, apply, and watch your programming get easier and easier!
Good luck!
Jason,
You and I have had a discussion about the apparent StoryRange bug before :yes
I've gone back through my notes and found the following commented code that does (I think) a fair job of illustrating the issue and necessary fixes:
Code:Sub SetUpExample()
'This code demonstrates peculiarities in the Word "StoryRanges" collection and how to deal with them.
Dim oDoc As Word.Document
Dim oRngStory As Word.Range
Dim i As Long
Dim lngBugExterminator As Long
Set oDoc = Documents.Add
Stop
'Step through code and switch between code and document to follow the process.
Debug.Print oDoc.StoryRanges.Count
'Notice that a new document based on a clean normal template contains only 1 storyrange.
'Depending on the version of Word there are a total of 11 to 17 storyRanges
'Ensure formatting marks are showing.
'Notice the headers and footers will be empty (i.e., no visible paragraph mark.)
'This is because the header and footer ranges are not defined at this point.
'Add a new section.
oDoc.Sections.Add
Debug.Print oDoc.StoryRanges.Count
'Notice that adding a section does not effect the storyrange collection.
'Isolate section 2 header and footers from section 1.
For i = 1 To 3
oDoc.Sections(2).Headers(i).LinkToPrevious = False
oDoc.Sections(2).Headers(i).LinkToPrevious = False
Next i
'Reference a header (do something with it, anything).
oDoc.Sections(2).Headers(wdHeaderFooterPrimary).Range.Text = "Section 2 Header text"
Debug.Print oDoc.StoryRanges.Count
'Notice that simply referencing a header or footer defines and expands the storyrange collection
'to include the six header and footer storyranges (6 through 11).
For Each oRngStory In oDoc.StoryRanges
Debug.Print oRngStory.StoryType
Next oRngStory
'Every range has at least one paragraph defined.
'Accordingly, the formally (empty) section 1 header and footer will now contain a paragraph mark.
'The following code similates a user clicking in the section one header (defined with a single paragraph and no text).
oDoc.ActiveWindow.View.SeekView = wdSeekCurrentPageHeader
oDoc.ActiveWindow.View.SeekView = wdSeekMainDocument
Debug.Print oDoc.StoryRanges.Count
'Notice how this action, which could easily be done by the user in with the UI, removes (and kills) the
'section 1 header and footer ranges. The paragraph marks are gone!
'More troublesome is that the six header and footer storyranges (6 through 11) from the storyrange collection
'have also been killed as the following illustrates!!
For Each oRngStory In oDoc.StoryRanges
Debug.Print oRngStory.StoryType
Next oRngStory
'We know that there is a header range defined in section 2 because we can still see the text. This means that a header range does
'not necessarily mean a header storyrange.
'The following illustrates that in order to process all ranges, you must first ensure that all the storyranges are defined.
On Error GoTo Err_Handler
Err_ReEntry:
Set oRngStory = oDoc.StoryRanges(wdPrimaryHeaderStory)
Do Until oRngStory Is Nothing
Debug.Print oRngStory.Text
Set oRngStory = oRngStory.NextStoryRange
Loop
'Since we've reference the section 1 headers and created the range, we are left with empty paragraph marks. Remove them as demostrated above.
oDoc.ActiveWindow.View.SeekView = wdSeekCurrentPageHeader
oDoc.ActiveWindow.View.SeekView = wdSeekMainDocument
oDoc.Close wdDoNotSaveChanges
Exit Sub
Err_Handler:
'There is no wdPrimaryHeaderStory so the error occurred.
'It is content in Section 1 that defines the 6 header/footer storyranges.
'To ensure they are defined all you need to do is reference one of them:
lngBugExterminator = oDoc.Sections(1).Headers(1).Range.StoryType
Resume Err_ReEntry
'What I've learned is that lngBugExterminator and the reference to the section 1 header does "NOT" ensure that skipped headers are processed.
'It is actually section 1 that defines the storyrange collection.
End Sub
Sub PracticalExample()
Dim oRngStory As Word.Range
Dim i As Long
Dim lngBugExterminator As Long
'Create reference to one of the first section header storyranges. This ensures the six header/footer storyranges are defined.
lngBugExterminator = ActiveDocument.Sections(1).Headers(1).Range.StoryType
For Each oRngStory In ActiveDocument.StoryRanges
'Iterate through all linked stories
Do Until oRngStory Is Nothing
With oRngStory
Select Case .StoryType
Case wdEvenPagesHeaderStory, wdFirstPageHeaderStory, wdPrimaryHeaderStory, wdEvenPagesFooterStory, wdFirstPageFooterStory, wdPrimaryFooterStory
If Len(.Text) > 1 Then
Debug.Print .Text
End If
End Select
End With
Set oRngStory = oRngStory.NextStoryRange
Loop 'Until oRngStory Is Nothing
Next
End Sub
glencoe, I do wish you would actually SAY what it is you want to do, with some example of before and after. You have not done this and I strongly suspect that this is much more complicated than it needs to be. AND I still find the idea of what USERS need to do really debatable. But then, again, you do not actually state (with examples) what it is you need to happen.
oops, double post, now deleted.
Frosty,
Thank you so much for your valuable help again! You are 100% right on everything. Lots of things to improve here (and to learn, of course). As you know, fumei was the first to teach me on this thread, and you guys then hopped in and went even further. The only thing that scares me actually is that I have already done quite a bit of programming on this project, and though I always knew from the start that it would require lots of optimization, I'm afraid that what you show me here is only a small proof of what I will need to do next... Lots of work involved indeed. But I am thrilled to imagine what would be the result! Speed and performance will be so much improved! :-)
I will go through your suggestions and see how I can improve my code. I know I haven't gone through all the .range methods that would do the same, but only better. But I will now.
gmaxey,
Does your code suggest that the approach I followed here is also "bugged" because of the StoryRange method? I will have to study your code closely.
fumei,
I didn't tell you everything indeed, I only said what was strictly necessary to this thread. But you are right, this whole discussion is only part of a much bigger project. So here you go...
To start with, I am a technical translator (from English to French, though this is irrelevant here). I used to be an electronics engineer, but moved to the translation industry a couple of years ago. To make a short story, a client asked me to develop (for free, and for their internal resources) a macro that would clean French Word documents generated by different translation software. My strength here is not as much programming, but rather being able to understand the issues behind the French language and apply them in a code (though very awkwardly, I must say).
Now, what do I mean by "clean Word docs"?
As you understand, every language has grammar rules. Some of them are simple. French is rather complex. We pinpointed about 40 rules that could be automated. Rules like deleting extra (unnecessary) spaces, applying punctuation rules (like using a hard space in front of a : symbol or between specific strings of characters), replacing English quotation marks (" ") by French quotation marks (« »), applying "French" as the language of the whole document (for the spellchecker), and so on. Basically, the whole idea is to accelerate the proofreading process and deliver a clean file. Yes, we do use powerful spellcheckers (more powerful than Word's), but their process is not fully automated, so using this macro does optimize the task. And, as you probably guessed it, one of the tasks is to clean those "tags" and highlight text in-between. Now, those "tags" are not tags as such (like html coding), but only markers input by the translator in some translation software that cannot handle highlighting features. Those markers are therefore processed through the macro to highlight text. That's it. I may come up with other features later on with those markers, but right now, that's the task I am working on.
Now, because translation software are evolving and also use their own tagging system (and here, I mean tags like html coding that are used internally by the software), I didn't want to force users using specific tags (like ##text to be highlighted##), but rather let them chose them. The only restriction I am giving is to use a minimum of 2 characters per tag. Users are advised to use tags like ## ##, or #< ># or other similar strings.
Of course, the whole function is integrated within the macro and displays a real-time log of processed tasks/replacements and so forth.
As a side note, exporting files like PDFs in Word can give very unpredictable and messy results, like adding text boxes in headers/footers! So this partly answers your concern about messy text boxes in headers. I am not responsible for them| I have no choice but to process them as well...
Now, to come back to the project, I have already programmed most of it, and it works nicely (though I know speed is far from optimized due to my poor programming approach), and I will have to tackle the big task to optimize the whole thing at the end; that's what scares me most!
That's it. You know the whole story now... ;-)
gmaxey,
Does your code suggest that the approach I followed here is also "bugged" because of the StoryRange method? I will have to study your code closely.
From the example code you provided here, it does not appear that you are using StoryRanges. If you are, and if you don't create a reference to a section 1 header or footer (i.e., using lngBugExterminator) then you could have a situation where the header footer storyranges don't exist and therefor are not processed.
I'm in Gerry's camp on with his assessment of over complicated. You have told us what you are doing with tags in the headers, but you haven't shown us.
I added text like Text ##Test## and **Test** to a header and ran (after fixing compile errors) your code and nothing happens.
Can you provide an example of before and after header text?
First, the code as it is does work, even though it can probably be simplified indeed.Quote:
I added text like Text ##Test## and **Test** to a header and ran (after fixing compile errors) your code and nothing happens.
Can you provide an example of before and after header text?
Tag1 and Tag2 are those ## or whatever else you want them to be.
The example you gave is correct: assuming you have the following in a header:
##text to be highlighted##
then the code I provided will replace it with
text to be highlighted
and the highlight color is selected by the user (in an interface before running the macro).
I'm not sure what else I could provide exactly...
glencoe,
Of course I won't argue if you code is working for you. Here is didn't work with the tags defined:
Tag1 = "##"
Tag2 = "**"
When I changed that to:
Tag1 = "##"
Tag2 = "##"
It still did nothing. Then I changed:
TempPrev = 1
to:
TempPrev = 0
and I had "mixed" results, but at least something happened.
Now looking at the example you have just provided, something like this works (for that example):
Code:Sub ScratchMacro()
'A basic Word macro coded by Greg Maxey
Dim oSection As Section
Dim oHF As HeaderFooter
Dim oRng As Range, oTextRng As Range
Dim strTag As String
strTag = "##"
For Each oSection In ActiveDocument.Sections
For Each oHF In oSection.Headers
With oHF
If .LinkToPrevious = False Or oSection.Index = 1 Then
Set oRng = oHF.Range
With oRng.Find
.Text = "(" & strTag & ")(*)(" & strTag & ")"
.MatchWildcards = True
.Replacement.Text = "\2"
While .Execute(Replace:=wdReplaceOne)
oRng.HighlightColorIndex = wdBrightGreen
oRng.Collapse wdCollapseEnd
Wend
End With
End If
End With
Next oHF
Next oSection
End Sub
Which is a heck of lot simpler.
Sorry, but I still find this almost unbelievable. Are you telling me that you could - with virtually no rules apparently ("## ##, or #< ># or other similar ") - that you could have:
##text blah blah##
#<text blah blah>#
Because if there ARE rules, surely you can not have 40 DIFFERENT start and end identifiers. And even if you DO, then applying reverse rules is still going to be easier than some of the stuff that is being attempted.
And I will repeat myself for the third or fourth time...
If these tags (or identifiers) NEED to be changed, why why why why why why are the users doing ANY of this? This should be an automated process untouched by the users. Or are you saying SOME of these identifiers will stay, and some need to be changed.
If they all need to be changed, then I will flat tell you that any concerns about processing time are nothing compared to what is needed when users are involved. The time to take an input from the users will more than cover any time by processing.
gmaxey,
Your code looks so much simpler indeed! I will try it shortly. You would probably faint seeing the original code I had for the same loop (basically it was all manual search, character by character, using the same kind of ".moveright/.moveleft" procedures as I still have in the code I produced here. The whole loop must have gone from 200 lines to... less than 20! That's a heck of an optimization! :-) And I am not even talking about the effect on speed...
fumei, maybe my explanation was not clear, as I am confused at what you say. I have about 40 different tasks for the whole project (most of them being applying grammar rules). The current code of this thread is only processing ONE of the tasks, that is, processing the "commented part" added by the translator. There is no grammar rule involved here. The only "rule" concerns the markers (tags) being used to find those comments.
Does it make more sense to you?
Oh, OK. That does make sense. I will repeat what Frosty has mentioned. Break off functions into workable chunks. Trying to put everything into one routine is a BAD BAD idea. It leads to confusion and terrible debugging problems.
One confusion that still remains is why users are involved at all.
fumei,
The user (translator) adds tags in the translation software where he wants to comment text. Then he exports the translation into Word, which is the document processed through this macro, and that's where the automation starts.
As every tag is added by a human in the original document, errors are possible. Such errors are for example adding an opening tag, but forgetting to close it; as if you were opening a bracket in your sentence, but oops, you forget to close it... In such a case, I cannot process ALL tags, as one pair is not complete; my choice is to ignore tags that are not properly paired. I could possibly highlight tags that are single, to show the user something is missing.
Otherwise, if the document contains matched pairs of tags, then yes, ALL of them should be replaced (deleted and text highlighted).
Then there should be no user involvement whatsoever. Whoever has control of the files does an automated batch processing. And I have to say this is horribly sloppy. So even after all this crap, you can have instances of text with single "tags'. Obviously you do not want that (otherwise why bother trying to clear out matching tags), but you have to accept this?
And talk about time involved, the tags (identifiers of comments) are manually entered. Oh boy. And if the SAME user is adding the tags:
Then to me this is a training issue. Tell them to use ONE type of "tag".Quote:
The user (translator) adds tags in the translation software where he wants to comment text. Then he exports the translation into Word, which is the document processed
Sheeech.
This reminds me of the silliness in documentation and translation when I worked for the federal government.
Oh and going back (again) WHY is this going on in the headers. Do you have code to deal with the main body. I am assuming this stuff is going on there as well.
fumei,
Please don't take it too personally! ;-)
Quote:
Then there should be no user involvement whatsoever. Whoever has control of the files does an automated batch processing.
Indeed, it is automated batch processing... I didn't say otherwise.
Quote:
So even after all this crap, you can have instances of text with single "tags'. Obviously you do not want that (otherwise why bother trying to clear out matching tags), but you have to accept this?
I didn't say I accept this, I only say that I cannot simply delete single tags like that. The idea, really, is to leave it so the user sees he forgot the matching tag, in which case he can add it along and run the automated task again. As I said, if I don't take human error into account, I would have a bigger mess whatsoever...
Yes, it is a training issue, but everyone makes mistakes as well, so even the most rigorous person may forget to close **some open bracket** in a sentence, and in my case, may forget to add the closing tag where it belongs... Got it?Quote:
And talk about time involved, the tags (identifiers of comments) are manually entered. Oh boy. And if the SAME user is adding the tags. Then to me this is a training issue. Tell them to use ONE type of "tag".
And, as I said, I do not want to use ONE type of "tag", as it might be used as an internal code by translation software. This is a risk I don't want to take, as I do not know/use/have control on the software used to produce the Word file. Let's assume one moment that ## ## is an internal tag used by such software. What would happen to the file once it is exported into Word? I have absolutely no idea! Maybe the text would be underlined, or put in upperscript? Who knows? I'd rather let this open, so that any evolution made to translation software won't impact my code.
Don't laugh. Government might be involved behind this as well.Quote:
This reminds me of the silliness in documentation and translation when I worked for the federal government.
Currently, I am working with the headers. But yes, the main body is also involved, just like any other part of the file (textboxes, end notes, etc.).Quote:
Oh and going back (again) WHY is this going on in the headers. Do you have code to deal with the main body. I am assuming this stuff is going on there as well.
gmaxey, you will need to explain me a bit what you are doing in your code, as this is beyond my understanding...
I don't understand why, but the line Set oRng = oHF.Range deletes the content of my first header, while placing the mouse on "oHF.Range" says it is = "".
So I updated this line to if oHF.Range <> "" then Set oRng = oHF.Range
The setting then works.
Now, I am not very knowledgeable with wildcards, so I am not sure how your .Text works. Same for .Replacement. What is the \2 supposed to do?
Actually, I tried your code and it does work with a tag that is fixed (same opening and closing tag), but as soon as I use different opening/closing tags, the code doesn't work as expected. Of course, I would then have strTag# = "#<" and strTag2 = "#>" and the .Text would be updated to .Text = "(" & strTag1 & ")(*)(" & strTag2 & ")"
Please advise.
glencoe,
I can't explain the range issue because it isn't happening here.
As for the wildcard search it works like this.
The "( )" in the .Text string are wildcard grouping symbols. So we are searching for three groups 1) the opening tag, 2) any text 3) the closing tag.
The "\2" in the .Replacement.Text string simply means to replace what was found (i.e., the opening tag, any text, and the closing tag) with the content of group 2 (i.e., just the text)
The following code will find text tag like this "##Test**"
Notice the closing tag string "\*\*" may look a little odd. This is because "*" in a wildcard search means "anything" where in this case we want to search for the literal "**" so we add the "\" before each literal character.
Hope this helps.
Code:Sub ScratchMacro()
'A basic Word macro coded by Greg Maxey
Dim oSection As Section
Dim oHF As HeaderFooter
Dim oRng As Range, oTextRng As Range
Dim strOTag As String, strCTag As String
strOTag = "##"
strCTag = "\*\*"
For Each oSection In ActiveDocument.Sections
For Each oHF In oSection.Headers
With oHF
If .LinkToPrevious = False Or oSection.Index = 1 Then
Set oRng = oHF.Range
With oRng.Find
.Text = "(" & strOTag & ")(*)(" & strCTag & ")"
.MatchWildcards = True
.Replacement.Text = "\2"
While .Execute(Replace:=wdReplaceOne)
oRng.HighlightColorIndex = wdBrightGreen
oRng.Collapse wdCollapseEnd
Wend
End With
End If
End With
Next oHF
Next oSection
End Sub
gmaxey,
Thanks for the explanation. Makes more sense to me now! ;-)
I also found a great page referencing in detail wildcards use in Word (not sure if there is a place on the forum to reference useful documentation) here: http://www.gmayor.com/replace_using_wildcards.htm
Now, I also foresee a problem using wildcards search in this case: as you perfectly explained, we need to use a \ to really search the literal * character.
In other words, all wildcards "special" characters should be avoided in the tags, or I would have to add a \ before any of them: ( [ ] { } < > ( ) - @ ? ! * \ )
I assume this makes sense, right? I found this issue by using #< and ># as opening/closing tags.
Now, though your code is great, it replaces all occurrences blindly without doing any order check (to make sure that there is no strOtag in between a correct pair). I am a bit concerned by this issue, and I'd like to find a nice fix for it. Maybe the solution would be the check in the .Find result if the string strOtag exists, in which case I could highlight it in red or any other color, and skip to the next occurrence... Any better idea?
glencoe,
Yes using wildcards does present its problems. This doesn't look as clean, but you can avoid them if you want:
Code:Sub ScratchMacro2()
'A basic Word macro coded by Greg Maxey
Dim oSection As Section
Dim oHF As HeaderFooter
Dim oRng As Range, oTextRng As Range
Dim oRng2 As Range
Dim strOTag As String, strCTag As String
strOTag = "##"
strCTag = "**"
For Each oSection In ActiveDocument.Sections
For Each oHF In oSection.Headers
With oHF
If .LinkToPrevious = False Or oSection.Index = 1 Then
Set oRng = oHF.Range
With oRng.Find
.Text = strOTag
While .Execute
Set oRng2 = oHF.Range
oRng2.Start = oRng.End
With oRng2.Find
.Text = strCTag
If .Execute Then
Set oTextRng = oRng.Duplicate
oTextRng.Collapse wdCollapseEnd
oTextRng.End = oRng2.Start
oRng.Delete
oRng2.Delete
If Not InStr(oTextRng.Text, strOTag) > 0 Then
oTextRng.HighlightColorIndex = wdBrightGreen
Else
oTextRng.HighlightColorIndex = wdRed
End If
End If
End With
Wend
End With
End If
End With
Next oHF
Next oSection
End Sub
I will tell you one thing...you would be much much much much better off if none of this commenting "tagging" was done in your translation software, and ALL of it was done in Word.Quote:
Yes, it is a training issue, but everyone makes mistakes as well, so even the most rigorous person may forget to close **some open bracket** in a sentence, and in my case, may forget to add the closing tag where it belongs... Got it?
And, as I said, I do not want to use ONE type of "tag", as it might be used as an internal code by translation software. This is a risk I don't want to take, as I do not know/use/have control on the software used to produce the Word file. Let's assume one moment that ## ## is an internal tag used by such software. What would happen to the file once it is exported into Word? I have absolutely no idea!
You could make a unique tagging string (or no tagging and just highlight, since that seems to be what you end up with), fired by a keyboard shortcut (or a icon click, whatever), that would ALWAYS use open and closing "tags". You would never, ever have to deal with checking for a closer, or if there was an extra one.
I do. Do all your tagging in Word to start with.Quote:
Now, though your code is great, it replaces all occurrences blindly without doing any order check (to make sure that there is no strOtag in between a correct pair). I am a bit concerned by this issue, and I'd like to find a nice fix for it. Maybe the solution would be the check in the .Find result if the string strOtag exists, in which case I could highlight it in red or any other color, and skip to the next occurrence... Any better idea?
Oh, and then you could do REAL comments. In any case, I will leave you in the capable hands of Greg, as he knows his stuff and is better suited to help you.
Thank you Greg! Will check your code tomorrow.
fumei, I understand your point, but this is a client's request... Later on, it may be a suggestion indeed to transfer those comments as real Word comments through the macro (but probably never commenting/highlighting manually directly in Word). There are a few good reasons to do it this way, but I don't think there is any need for me to explain them, as it would be totally off-topic here. Point is, I need to find a solution, and gmaxey seems to be guiding me on the right track... Plus, I like challenges! ;-)
gmaxey, I love your solution! The only thing I would actually change is highlighting only the strOTag found in instr instead of highlighting the whole range, only to give the user less manipulations to do manually after (i.e. manually clear the highlighted part, add the missing tag, run the macro again). Highlighting only the tags only requires the user to add the missing tag and run the macro again.
Anyway, I really took the time to consider your suggestion seriously, and while it is excellent, that is the only change I'd like to add. Now, don't just give me the answer yet, I want to do my homework before! ;-) Otherwise, I will never learn if you just give me everything on a plate... lolll
glencoe,
Glad I could help and you are correct. Sometimes I give away too many fish. It is nice to know that you want to learn to catch your own fish.
If you need some help though, just post back.
OK, I've got it...
Now, I noticed something interesting: though the code is working exactly as expected, I didn't realize before running it through a test that it would actually erase the next correct pair of tags (following a bad pair)!
e.g. original text with tags
#<correct tags>#
#<wrong closing tag#
#<correct tags>#
would give at the very end of the loop:
correct tags
#<wrong closing tag#
correct tags
And when I think about it, this makes more sense than what I actually wanted to accomplish... So I thought I could get the code to do it directly instead. The code is also simpler that way!
I copied below both options (of course, it is either one or the other). The first one is the one I was talking about in my previous post. And I applied what I learnt with your help! Hopefully I got it right.
I also commented each line to make sure I understood properly its meaning and action.
I would really appreciate if you could double check both the code and my comments and let me know if I made a mistake somewhere!
Also, I still haven't gone through all the process renaming variables in all of my code. If you have more advise about a good rule of thumb... string variables would start by "str". integers, by "int" I assume? any other suggestion?Code:Dim oRng As Range, oRng2 As Range, fnd As Range, oTextRng As Range, oTextRng2 As Range
For Each oSection In ActiveDocument.Sections
For Each oHF In oSection.Headers
With oHF
'1st option
If .LinkToPrevious = False Or oSection.Index = 1 Then
Set oRng = oHF.Range 'Whole header
Set oRng2 = oHF.Range 'Whole header
With oRng.Find
.Text = strOTag 'Opening tag
While .Execute 'Find opening tag (oRng)
oRng2.Start = oRng.End 'Starting after opening tag found above
With oRng2.Find
.Text = strCTag 'Closing tag
If .Execute Then 'Find closing tag (oRng2)
Set oTextRng = oRng.Duplicate 'Duplicating oRng. Now oTextRng = opening tag
oTextRng.Collapse wdCollapseEnd 'Starting position of oTextRng is now the end of Opening tag
oTextRng.End = oRng2.Start 'End of oTextRng = before Closing tag. Therefore oTextRng is the string between opening/closing tags
If Not InStr(oTextRng.Text, strOTag) > 0 Then 'No extra opening tag found in oTextRng, therefore pair of tags is correct. Processing tags and highlighting text
oRng.Delete 'Delete opening tag
oRng2.Delete 'Delete closing tag
oTextRng.HighlightColorIndex = Couleur1_Selectionnee 'Highlight string (this color is chosen by the user)
Else 'opening tag found within the string. Highlighting tags only in another color.
oRng.HighlightColorIndex = Couleur2_Selectionnee 'Highlight opening tag (this color is chosen by the user, though it will probably be red)
oRng2.HighlightColorIndex = Couleur2_Selectionnee 'Highlight closing tag
Set oTextRng2 = oTextRng.Duplicate 'Duplicate string range
With oTextRng2.Find
.Text = strOTag 'Opening tag
If .Execute Then 'Find opening tag in string
oTextRng2.HighlightColorIndex = Couleur2_Selectionnee 'Highlight extra opening tag
End If
End With
End If
End If
End With
Wend
End With
End If
'2nd solution
If .LinkToPrevious = False Or oSection.Index = 1 Then
Set oRng = oHF.Range 'Whole header
Set oRng2 = oHF.Range 'Whole header
With oRng.Find
.Text = strOTag 'Opening tag
While .Execute 'Find opening tag (oRng)
oRng2.Start = oRng.End 'Starting after opening tag found above
With oRng2.Find
.Text = strCTag 'Closing tag
If .Execute Then 'Find closing tag (oRng2)
Set oTextRng = oRng.Duplicate 'Duplicating oRng. Now oTextRng = opening tag
oTextRng.Collapse wdCollapseEnd 'Starting position of oTextRng is now the end of Opening tag
oTextRng.End = oRng2.Start 'End of oTextRng = before Closing tag. Therefore oTextRng is the string between opening/closing tags
If Not InStr(oTextRng.Text, strOTag) > 0 Then 'No extra opening tag found in oTextRng, therefore pair of tags is correct. Processing tags and highlighting text
oRng.Delete 'Delete opening tag
oRng2.Delete 'Delete closing tag
oTextRng.HighlightColorIndex = Couleur1_Selectionnee 'Highlight string
Else 'opening tag found within the string. Highlighting tags only in another color.
oRng.HighlightColorIndex = Couleur2_Selectionnee 'Highlight opening tag
End If
End If
End With
Wend
End With
End If
End With
Next oHF
Next oSection
Looks like you've achieved your goal and understand how you did it.
As for variables, I suppose each person has their own style. I usually proceed all objects (things that must be set) with o (e.g., oRng, oFld, oShp, oBM, etc.)
I use str text e.g., strText
I rarely use integers and use longs instead lngCount, lngIndex etc.
bSet, bState etc. for bollean
Just for you to compare, this is what I thought you might be after:
Code:Sub ScratchMacro2()
'A basic Word macro coded by Greg Maxey
Dim oSection As Section
Dim oHF As HeaderFooter
Dim oRng As Range, oTextRng As Range
Dim oRng2 As Range, oRngTag As Range
Dim strOTag As String, strCTag As String
strOTag = "##"
strCTag = "**"
For Each oSection In ActiveDocument.Sections
For Each oHF In oSection.Headers
With oHF
If .LinkToPrevious = False Or oSection.Index = 1 Then
Set oRng = oHF.Range
With oRng.Find
.Text = strOTag
While .Execute
Set oRng2 = oHF.Range
oRng2.Start = oRng.End
With oRng2.Find
.Text = strCTag
If .Execute Then
Set oTextRng = oRng.Duplicate
oTextRng.Collapse wdCollapseEnd
oTextRng.End = oRng2.Start
oRng.Delete
oRng2.Delete
oTextRng.HighlightColorIndex = wdBrightGreen
If InStr(oTextRng.Text, strOTag) > 0 Then
Set oRngTag = oTextRng.Duplicate
With oRngTag.Find
.Text = strOTag
While .Execute
oRngTag.HighlightColorIndex = wdRed
oRngTag.Collapse wdCollapseEnd
Wend
End With
End If
oRng.Start = oTextRng.End
End If
End With
Wend
End With
End If
End With
Next oHF
Next oSection
End Sub
I'm glad you updated the oRng.Start to the end of the strings, as I was actually wondering yesterday if this was required or not in ranges! I was also wondering how the .Find function really works on ranges. Would the execution be faster if the position of the "cursor" is closer to the next result (rather than at the beginning of the range)? I know the cursor is not really involved here, though, but it is more the range being shortened.
I also wonder what is the difference between oTextRng.Collapse wdCollapseEnd and oTextRng.Start = oTextRng.End. I assume both do the same? I only ask for my own information, as that question popped my mind when seeing the collapse function in your code.
Also, you raised another good point I didn't think: checking if the opening string appears several times within the string. I forgot to implement that one indeed!
Now, I don't fancy the idea of highlighting the whole string when extra opening tags exist, but that's only a matter of "taste"... I managed to find a good compromise, and the final code is both optimized and working well, due to your help! Thank you very much Greg! :-)
glencoe,
The little I know about VBA is self-taught so anything I say or do is based just on experience and what I've picked up along the way. Jason is probably more qualified to answer the technical questions.
You might find this interesting: http://gregmaxey.mvps.org/word_tip_p..._property.html
I don't know if there is any performance difference between .Start = .End or .Collapse wdCollaspeEnd or not. If there is I wouldn't think it would matter a nit.
Glad you have learned something from the experience.
Well, whatever your experience is in VBA, it certainly is much greater than mine. So, the "little" you know cannot be compared to the little I know! What I know is also pretty much self-taught, I am just 20 years behind you! :-p
I will certainly check your page shortly! I also really appreciate learning more about ranges, as I do see the major benefits it brings to coding... It's like switching a horse for a Ferrari (ok, a Ferrari also has a "horse" logo, but you got my point).
As we have gone extensively through the topic, I think I will post the thread as "solved" shortly (unless Jason wants to add something).
I will certainly request more help when time comes. And hopefully I will be able to help others as well, once my own knowledge is more robust!
Not a nit, as they are numerically - and ranges are just numbers - identical. There is no change to the memory allocation. That is the advantage of ranges over Selection; a range of 1,000 characters (.Start = .End -1,000) uses exactly the same memory as a range of 5 characters (.Start = .End -5). Unlike Selections of the same number of characters, as Selection uses GUI memory allocations.Quote:
I don't know if there is any performance difference between .Start = .End or .Collapse wdCollaspeEnd or not. If there is I wouldn't think it would matter a nit.
We should, as a best practise, only use Long as Integers are no longer used at all by VBA. You can declare them, but the parser converts them to Long on its initial parse pass. While I suppose you could say that the conversion itself takes time, and is true that the memory blocks (being larger for Long) need destroying and reallocation (and thus you really should never use integers), but I suspect that you would have to declare thousands and thousands of integers before you would be able to even detect anything. So yes you CAN declare integers, but why not just use what VBA uses...Longs.Quote:
I rarely use integers and use longs
Very interesting, fumei! Thank you for sharing!
Please all of you, give yourself a "good job tap on the shoulder" as you did convince me to forget (as much as possible) about Selection, and to move to Range instead... The memory allocation is just another extra benefit!
And I will also move all Integers to Longs instead. Now, I mostly use Integers for loops like "For i=0 to MaxValue Next", where MaxValue would be a low number (e.g. less than 100). In this case, would you still recommend moving to Long as well?
Is this an operation that is done by the parser (I assume so), or that we have to add in the code?Quote:
is true that the memory blocks (being larger for Long) need destroying and reallocation
No the VBA garbage collector takes care of most memory allocations (including destroying). Allocation happens the moment you declare a variable such as a Long. You do not need to do anything else. It has been a hot debate for years whether you should explicitly destroy objects with a Set object = Nothing. For all "minor" objects IMO you do not need to, but it IS a good idea with ranges, and definitely a good idea with application objects. When you use application instances you should always destroy them when you are done. As for ranges, there are circumstances where it is best to explicitly destroy them as part of your code blocks. In theory, when you exit a routine with a Set object (for example Set oHF = a headerfooter object), that object is destroyed. But there is still persistent issues that pop up with application objects - i.e. "extra" instances of Word.
For loops like that I still use Long. I never use Integers at all. Like I said, VBA converts all integers to long anyway.
How about local variables declared in a sub routine? Is it still necessary to "destroy" them, knowing that they should disappear as soon as the code exits the sub routine, or is it possible that they still remain in memory, unless they have been set to Nothing through the code?
There will be lots of changes to implement in my current macro in order to follow the proper "rules" and rules of thumb! And I'm not even talking about rethinking the code to optimize it with Range functions!
No, you can ignore local variables. VBA will tidy things up on its own. As stated, the only things of any realistic concern are application objects. You are not using any.
The only things you can destroy on your own are objects you give a value to with the Set instruction. In Greg's code there a few, all Range objects. As is, there is no need to explicitly destroy them with a Set = Nothing. VBA can handle things. I only really mentioned this subject as a comment on the nada performance issue regarding .Start = .End versus .Collapse.
Local variables (or even Public ones) such a Longs, or Strings are assigned a memory block when declared. For example, a Long is assigned 4 bytes. Once it is assigned, it remains (persists) until the routine that holds it terminates. So, regardless of VALUE ( 1 or 1,500 or 2,000,000), the declared Long uses 4 bytes for the Scope of the procedure. BTW, you should (if you have not done so) take a good look at what Scope means. When the procedure terminates, the block of 4 bytes is released by the VBA garbage collector. You do not have to do anything.
Obviously with our modern computers with gobs of memory 4 bytes is almost meaningless. Thus it is true that the old best practices are in some ways being made pointless. However, the principles of both understanding what is going on, and doing good housekeeping still seem valid to me. Some may argue otherwise.
Your explanation makes perfect sense, and I also think that by keeping everything tidy, you can't mess up your code! So I'd better go that way.
At least, if I tend to follow those simple rules, you won't be mad at me next time I show you some code I produced! ;-)
Thank you so much again! Let's close and solve officially this topic for now! :-)
Gerry,
"Jason is probably more qualified to answer the technical questions."
No slight intended. It is just that Jason seems to enjoy explaining the technical side and usually is spot on.
A more appropriate statement would be: "Jason or Gerry are more qualified to answer the technical questions."