Oh, and why do you need to select the range? Is that a left over from testing? No reason to have that, except while trying to understand working with ranges (very limited exceptions to this rule. Very)
Oh, and why do you need to select the range? Is that a left over from testing? No reason to have that, except while trying to understand working with ranges (very limited exceptions to this rule. Very)
Good catch!
DoEvents was a leftover from a previous debugging attempt, but you're correct, I can remove it now, since the loop is working.
As for my wrong declaration, I truly didn't know that! Thanks for the teaching. I'll have to go through all my code to fix every other wrong declaration... :-(
As for selecting the range, well, I tried (really hard) without it, as everyone said it was not the right way to do it indeed. That part of code does work without this line indeed, but I have to say that after this part of the code, I am doing other actions on the text, and I need to play directly with the text within the document, and that's the only way I found it to work. But at least, it's only 1 extra line, while all my code has been reduced by at least 70% using the posted method (compared to the "manual" action I was using previously, which was a nightmare and awfully slow).
I would hazard a guess that you just don't quite understand working with the range object yet... 99% of the time, that's the reason why .Select is "needed" -- but the main reason to get rid of .Select is processing speed... (If word doesn't have to actually select an area of the document to work on, it doesn't have to refresh the computer screen, and verify the new selection has all the right buttons like bold/underline pressed or not pressed, etc etc).
So if you're happy with your processing speed, leave the next step to understanding range objects to another day.
The better improvement to your code, at this point, would be to encapsulate functionality.
Work on a routine that accepts a range argument (which you would pass from all your finding stuff), then build that routine to *only* work on whatever text manipulation you're going to do.
Basically, try to separate the actions of:
1. Looping through headers/footers in a document and getting each unique range
2. Finding a particular range within the range from #1 (which will really be the space between two other ranges)
3. Manipulation of the text within each range from #2
the benefit to this structure (as opposed to one big single routine which does all of the above) is that you can test the individual components without having to run the whole routine.
You can just use the immediate window to type
mytestsub Selection.Range
where mytestsub accepts one range parameter.
this may be too much for now... But down the road, keep the thought about encapsulating functionality and at some point, it'll click. Good luck!
I know you are right, but I have already come a long way from my original code (which you DO NOT want to see, you'd be really ***ed off) to the one I have now. I still need to get my mind on all this range stuff, which I certainly do not master well now, obviously. But you sound encouraging, and I like to think that way that working hard to have a clean code that is 100% optimised can gain a lot of speed in the end!
I will get there at some point, in which case I will probably need more help from you all!
If you want to give me more insight, you are more than welcome to. Here is the whole loop as it is right now. Just a few notes about it:
- I may have forgotten to include some declarations here, as I am copying/pasting them from different places, but every variable is declared in my code
- I copied only part of the code that is processing the text. I have a few other things not related to it here and there in the code
- you may see some French in the comments or variable names, as French is my first language ;-) I tried to translate quickly everything, but I may have forgotten a few things here and there
Dim oSection As Section Dim HdFt As HeaderFooter Dim r as range Dim HdFt as range Dim Tag1 as string Dim Tag2 as string With ActiveDocument For Each oSection In .Sections For Each HdFt In oSection.Headers With HdFt If .LinkToPrevious = False Or oSection.Index = 1 Then HdFt.Range.Select TempPrev = 1 'Find position of 1st Tag1 Set r = HdFt.Range Set fnd = HdFt.Range With fnd.Find .Execute (Tag1) If .Found = True Then TempStart = fnd.End End If End With While TempStart >= TempPrev + Len(Tag1) 'Find position of next tag1 Set fnd = HdFt.Range fnd.Start = TempStart With fnd.Find .Execute (Tag1) If .Found = True Then PosTag1 = fnd.Start End If End With If Tag1 <> Tag2 Then 'Find position of Tag2 Set fnd = HdFt.Range fnd.Start = TempStart With fnd.Find .Execute (Tag2) If .Found = True Then PosTag2 = fnd.Start End If End With Else 'if tag1 = tag2 PosTag2 = PosTag1 End If 'THIS IS WHERE THE .SELECT IS REQUIRED! From here, you may find ways to improve my code If PosTag2 <= PosTag1 And PosTag2 > TempStart Then 'correct order, replacing tags 'Highlighting text between tags Pos = Application.Selection.MoveLeft(, 1, wdMove) 'place cursor at the beginning of the header Pos = Application.Selection.MoveRight(, TempStart - 1, wdMove) 'move cursor after tag1 Pos = Application.Selection.MoveRight(, PosTag2 - TempStart, wdExtend) 'select text between both tags Selection.Range.HighlightColorIndex = Couleur1_Selectionnee 'highlighting text with chosen color Application.ScreenRefresh 'Deleting tags Set r = HdFt.Range Set fnd = HdFt.Range 'Replace Tag1 fnd.Start = TempStart - Len(Tag1) With fnd.Find .Forward = True .Wrap = wdFindStop .Text = Tag1 .Replacement.Text = "" .Execute Replace:=wdReplaceOne, Forward:=True, Wrap:=wdFindContinue End With 'Replace Tag2 fnd.Start = PosTag2 - Len(Tag1) With fnd.Find .Forward = True .Wrap = wdFindStop .Text = Tag2 .Replacement.Text = "" .Execute Replace:=wdReplaceOne, Forward:=True, Wrap:=wdFindContinue End With TempStart = PosTag2 - Len(Tag1) TempPrev = TempStart HdFt.Range.Select Else 'wrong order, skip case TempStart = PosTag2 TempPrev = TempStart End If 'Find next tag1 Set r = HdFt.Range Set fnd = HdFt.Range fnd.Start = TempStart With fnd.Find .Execute (Tag1) If .Found = True Then TempStart = fnd.End End If End With Wend End If End With Next HdFt Next oSection End With End If
Code never irritates me. Everyone learns at a different pace. The reason I am on this forum (and I would imagine many of the other contributors) is to help teach and also learn ourselves. I'm always learning new things, and I've been working and programming in Word for almost 20 years.
It looks to me like the reason you're using Selection is because you haven't explored the analogous .MoveStart and .MoveEnd methods available off the range object (you're using .MoveLeft and .MoveRight methods off the selection object). With minor caveats (.MoveDown and .MoveUp aren't available off a range object, whereas they are available off the Selection object), you can achieve the same thing you're doing (which is basically translating .MoveLeft with .MoveStart and .MoveRight with .MoveEnd
That said, when you're working with ranges, you have to understand that you're working with an actual number... so while .MoveLeft wdcharacter, 1 will collapse the selection (if it's not an insertion point) and move it one character to the left... you would need to use a negative number with the range, so .MoveStart wdcharacter, -1 would do the same thing. But if your range wasn't an insertion point, you might need to use the .Collapse method on the range as well.
You should play around with this stuff... you'll get the hang of it if you stick with it. And you'll end up with much faster processing (especially since you said this typically operates on long documents).
Couple other notes...
1. Try to avoid giant (and useless) With... End With blocks. The entire code snippet contained above is within a With ActiveDocument block. You only use it once, at the top of your For Each loop... just do a For Each oSection In ActiveDocument.Sections -- and you can remove one level of indenting. Same with the HdFt with chunk... you rarely use it, and it's not helpful to your coding... so just get rid of that With statement as well -- and you've removed another indented chunk for no reason.
2. You can save your .Found logic and simply use If .Execute (Tag1) = True Then... The .Execute function returns True if it's found.
3. Prefixes prefixes prefixes. Use appropriate prefixes for your variables... it's much easier to debug code that way, and easier for others to read. Tag1 should be sTag1 (since it's a string) or strTag1 (those are the two common prefixes for a string type). Range should be "rng" or "r" or "rg" .... TempStart and TempEnd should be lStart/lngStart or lEnd or lngEnd. The actual prefix doesn't matter, but consistency does...
There are lots of little rid-bits... but take them in small chunks, apply, and watch your programming get easier and easier!
Good luck!
Jason,
You and I have had a discussion about the apparent StoryRange bug before
I've gone back through my notes and found the following commented code that does (I think) a fair job of illustrating the issue and necessary fixes:
Sub SetUpExample() 'This code demonstrates peculiarities in the Word "StoryRanges" collection and how to deal with them. Dim oDoc As Word.Document Dim oRngStory As Word.Range Dim i As Long Dim lngBugExterminator As Long Set oDoc = Documents.Add Stop 'Step through code and switch between code and document to follow the process. Debug.Print oDoc.StoryRanges.Count 'Notice that a new document based on a clean normal template contains only 1 storyrange. 'Depending on the version of Word there are a total of 11 to 17 storyRanges 'Ensure formatting marks are showing. 'Notice the headers and footers will be empty (i.e., no visible paragraph mark.) 'This is because the header and footer ranges are not defined at this point. 'Add a new section. oDoc.Sections.Add Debug.Print oDoc.StoryRanges.Count 'Notice that adding a section does not effect the storyrange collection. 'Isolate section 2 header and footers from section 1. For i = 1 To 3 oDoc.Sections(2).Headers(i).LinkToPrevious = False oDoc.Sections(2).Headers(i).LinkToPrevious = False Next i 'Reference a header (do something with it, anything). oDoc.Sections(2).Headers(wdHeaderFooterPrimary).Range.Text = "Section 2 Header text" Debug.Print oDoc.StoryRanges.Count 'Notice that simply referencing a header or footer defines and expands the storyrange collection 'to include the six header and footer storyranges (6 through 11). For Each oRngStory In oDoc.StoryRanges Debug.Print oRngStory.StoryType Next oRngStory 'Every range has at least one paragraph defined. 'Accordingly, the formally (empty) section 1 header and footer will now contain a paragraph mark. 'The following code similates a user clicking in the section one header (defined with a single paragraph and no text). oDoc.ActiveWindow.View.SeekView = wdSeekCurrentPageHeader oDoc.ActiveWindow.View.SeekView = wdSeekMainDocument Debug.Print oDoc.StoryRanges.Count 'Notice how this action, which could easily be done by the user in with the UI, removes (and kills) the 'section 1 header and footer ranges. The paragraph marks are gone! 'More troublesome is that the six header and footer storyranges (6 through 11) from the storyrange collection 'have also been killed as the following illustrates!! For Each oRngStory In oDoc.StoryRanges Debug.Print oRngStory.StoryType Next oRngStory 'We know that there is a header range defined in section 2 because we can still see the text. This means that a header range does 'not necessarily mean a header storyrange. 'The following illustrates that in order to process all ranges, you must first ensure that all the storyranges are defined. On Error GoTo Err_Handler Err_ReEntry: Set oRngStory = oDoc.StoryRanges(wdPrimaryHeaderStory) Do Until oRngStory Is Nothing Debug.Print oRngStory.Text Set oRngStory = oRngStory.NextStoryRange Loop 'Since we've reference the section 1 headers and created the range, we are left with empty paragraph marks. Remove them as demostrated above. oDoc.ActiveWindow.View.SeekView = wdSeekCurrentPageHeader oDoc.ActiveWindow.View.SeekView = wdSeekMainDocument oDoc.Close wdDoNotSaveChanges Exit Sub Err_Handler: 'There is no wdPrimaryHeaderStory so the error occurred. 'It is content in Section 1 that defines the 6 header/footer storyranges. 'To ensure they are defined all you need to do is reference one of them: lngBugExterminator = oDoc.Sections(1).Headers(1).Range.StoryType Resume Err_ReEntry 'What I've learned is that lngBugExterminator and the reference to the section 1 header does "NOT" ensure that skipped headers are processed. 'It is actually section 1 that defines the storyrange collection. End Sub Sub PracticalExample() Dim oRngStory As Word.Range Dim i As Long Dim lngBugExterminator As Long 'Create reference to one of the first section header storyranges. This ensures the six header/footer storyranges are defined. lngBugExterminator = ActiveDocument.Sections(1).Headers(1).Range.StoryType For Each oRngStory In ActiveDocument.StoryRanges 'Iterate through all linked stories Do Until oRngStory Is Nothing With oRngStory Select Case .StoryType Case wdEvenPagesHeaderStory, wdFirstPageHeaderStory, wdPrimaryHeaderStory, wdEvenPagesFooterStory, wdFirstPageFooterStory, wdPrimaryFooterStory If Len(.Text) > 1 Then Debug.Print .Text End If End Select End With Set oRngStory = oRngStory.NextStoryRange Loop 'Until oRngStory Is Nothing Next End Sub
glencoe, I do wish you would actually SAY what it is you want to do, with some example of before and after. You have not done this and I strongly suspect that this is much more complicated than it needs to be. AND I still find the idea of what USERS need to do really debatable. But then, again, you do not actually state (with examples) what it is you need to happen.
oops, double post, now deleted.
Frosty,
Thank you so much for your valuable help again! You are 100% right on everything. Lots of things to improve here (and to learn, of course). As you know, fumei was the first to teach me on this thread, and you guys then hopped in and went even further. The only thing that scares me actually is that I have already done quite a bit of programming on this project, and though I always knew from the start that it would require lots of optimization, I'm afraid that what you show me here is only a small proof of what I will need to do next... Lots of work involved indeed. But I am thrilled to imagine what would be the result! Speed and performance will be so much improved! :-)
I will go through your suggestions and see how I can improve my code. I know I haven't gone through all the .range methods that would do the same, but only better. But I will now.
gmaxey,
Does your code suggest that the approach I followed here is also "bugged" because of the StoryRange method? I will have to study your code closely.
fumei,
I didn't tell you everything indeed, I only said what was strictly necessary to this thread. But you are right, this whole discussion is only part of a much bigger project. So here you go...
To start with, I am a technical translator (from English to French, though this is irrelevant here). I used to be an electronics engineer, but moved to the translation industry a couple of years ago. To make a short story, a client asked me to develop (for free, and for their internal resources) a macro that would clean French Word documents generated by different translation software. My strength here is not as much programming, but rather being able to understand the issues behind the French language and apply them in a code (though very awkwardly, I must say).
Now, what do I mean by "clean Word docs"?
As you understand, every language has grammar rules. Some of them are simple. French is rather complex. We pinpointed about 40 rules that could be automated. Rules like deleting extra (unnecessary) spaces, applying punctuation rules (like using a hard space in front of a : symbol or between specific strings of characters), replacing English quotation marks (" ") by French quotation marks (« »), applying "French" as the language of the whole document (for the spellchecker), and so on. Basically, the whole idea is to accelerate the proofreading process and deliver a clean file. Yes, we do use powerful spellcheckers (more powerful than Word's), but their process is not fully automated, so using this macro does optimize the task. And, as you probably guessed it, one of the tasks is to clean those "tags" and highlight text in-between. Now, those "tags" are not tags as such (like html coding), but only markers input by the translator in some translation software that cannot handle highlighting features. Those markers are therefore processed through the macro to highlight text. That's it. I may come up with other features later on with those markers, but right now, that's the task I am working on.
Now, because translation software are evolving and also use their own tagging system (and here, I mean tags like html coding that are used internally by the software), I didn't want to force users using specific tags (like ##text to be highlighted##), but rather let them chose them. The only restriction I am giving is to use a minimum of 2 characters per tag. Users are advised to use tags like ## ##, or #< ># or other similar strings.
Of course, the whole function is integrated within the macro and displays a real-time log of processed tasks/replacements and so forth.
As a side note, exporting files like PDFs in Word can give very unpredictable and messy results, like adding text boxes in headers/footers! So this partly answers your concern about messy text boxes in headers. I am not responsible for them| I have no choice but to process them as well...
Now, to come back to the project, I have already programmed most of it, and it works nicely (though I know speed is far from optimized due to my poor programming approach), and I will have to tackle the big task to optimize the whole thing at the end; that's what scares me most!
That's it. You know the whole story now... ;-)
gmaxey,
Does your code suggest that the approach I followed here is also "bugged" because of the StoryRange method? I will have to study your code closely.
From the example code you provided here, it does not appear that you are using StoryRanges. If you are, and if you don't create a reference to a section 1 header or footer (i.e., using lngBugExterminator) then you could have a situation where the header footer storyranges don't exist and therefor are not processed.
I'm in Gerry's camp on with his assessment of over complicated. You have told us what you are doing with tags in the headers, but you haven't shown us.
I added text like Text ##Test## and **Test** to a header and ran (after fixing compile errors) your code and nothing happens.
Can you provide an example of before and after header text?
First, the code as it is does work, even though it can probably be simplified indeed.I added text like Text ##Test## and **Test** to a header and ran (after fixing compile errors) your code and nothing happens.
Can you provide an example of before and after header text?
Tag1 and Tag2 are those ## or whatever else you want them to be.
The example you gave is correct: assuming you have the following in a header:
##text to be highlighted##
then the code I provided will replace it with
text to be highlighted
and the highlight color is selected by the user (in an interface before running the macro).
I'm not sure what else I could provide exactly...
glencoe,
Of course I won't argue if you code is working for you. Here is didn't work with the tags defined:
Tag1 = "##"
Tag2 = "**"
When I changed that to:
Tag1 = "##"
Tag2 = "##"
It still did nothing. Then I changed:
TempPrev = 1
to:
TempPrev = 0
and I had "mixed" results, but at least something happened.
Now looking at the example you have just provided, something like this works (for that example):
Sub ScratchMacro() 'A basic Word macro coded by Greg Maxey Dim oSection As Section Dim oHF As HeaderFooter Dim oRng As Range, oTextRng As Range Dim strTag As String strTag = "##" For Each oSection In ActiveDocument.Sections For Each oHF In oSection.Headers With oHF If .LinkToPrevious = False Or oSection.Index = 1 Then Set oRng = oHF.Range With oRng.Find .Text = "(" & strTag & ")(*)(" & strTag & ")" .MatchWildcards = True .Replacement.Text = "\2" While .Execute(Replace:=wdReplaceOne) oRng.HighlightColorIndex = wdBrightGreen oRng.Collapse wdCollapseEnd Wend End With End If End With Next oHF Next oSection End Sub
Which is a heck of lot simpler.
Sorry, but I still find this almost unbelievable. Are you telling me that you could - with virtually no rules apparently ("## ##, or #< ># or other similar ") - that you could have:
##text blah blah##
#<text blah blah>#
Because if there ARE rules, surely you can not have 40 DIFFERENT start and end identifiers. And even if you DO, then applying reverse rules is still going to be easier than some of the stuff that is being attempted.
And I will repeat myself for the third or fourth time...
If these tags (or identifiers) NEED to be changed, why why why why why why are the users doing ANY of this? This should be an automated process untouched by the users. Or are you saying SOME of these identifiers will stay, and some need to be changed.
If they all need to be changed, then I will flat tell you that any concerns about processing time are nothing compared to what is needed when users are involved. The time to take an input from the users will more than cover any time by processing.
gmaxey,
Your code looks so much simpler indeed! I will try it shortly. You would probably faint seeing the original code I had for the same loop (basically it was all manual search, character by character, using the same kind of ".moveright/.moveleft" procedures as I still have in the code I produced here. The whole loop must have gone from 200 lines to... less than 20! That's a heck of an optimization! :-) And I am not even talking about the effect on speed...
fumei, maybe my explanation was not clear, as I am confused at what you say. I have about 40 different tasks for the whole project (most of them being applying grammar rules). The current code of this thread is only processing ONE of the tasks, that is, processing the "commented part" added by the translator. There is no grammar rule involved here. The only "rule" concerns the markers (tags) being used to find those comments.
Does it make more sense to you?
Oh, OK. That does make sense. I will repeat what Frosty has mentioned. Break off functions into workable chunks. Trying to put everything into one routine is a BAD BAD idea. It leads to confusion and terrible debugging problems.
One confusion that still remains is why users are involved at all.
fumei,
The user (translator) adds tags in the translation software where he wants to comment text. Then he exports the translation into Word, which is the document processed through this macro, and that's where the automation starts.
As every tag is added by a human in the original document, errors are possible. Such errors are for example adding an opening tag, but forgetting to close it; as if you were opening a bracket in your sentence, but oops, you forget to close it... In such a case, I cannot process ALL tags, as one pair is not complete; my choice is to ignore tags that are not properly paired. I could possibly highlight tags that are single, to show the user something is missing.
Otherwise, if the document contains matched pairs of tags, then yes, ALL of them should be replaced (deleted and text highlighted).
Then there should be no user involvement whatsoever. Whoever has control of the files does an automated batch processing. And I have to say this is horribly sloppy. So even after all this crap, you can have instances of text with single "tags'. Obviously you do not want that (otherwise why bother trying to clear out matching tags), but you have to accept this?
And talk about time involved, the tags (identifiers of comments) are manually entered. Oh boy. And if the SAME user is adding the tags:
Then to me this is a training issue. Tell them to use ONE type of "tag".The user (translator) adds tags in the translation software where he wants to comment text. Then he exports the translation into Word, which is the document processed
Sheeech.
This reminds me of the silliness in documentation and translation when I worked for the federal government.
Oh and going back (again) WHY is this going on in the headers. Do you have code to deal with the main body. I am assuming this stuff is going on there as well.
fumei,
Please don't take it too personally! ;-)
Then there should be no user involvement whatsoever. Whoever has control of the files does an automated batch processing.
Indeed, it is automated batch processing... I didn't say otherwise.
So even after all this crap, you can have instances of text with single "tags'. Obviously you do not want that (otherwise why bother trying to clear out matching tags), but you have to accept this?
I didn't say I accept this, I only say that I cannot simply delete single tags like that. The idea, really, is to leave it so the user sees he forgot the matching tag, in which case he can add it along and run the automated task again. As I said, if I don't take human error into account, I would have a bigger mess whatsoever...
Yes, it is a training issue, but everyone makes mistakes as well, so even the most rigorous person may forget to close **some open bracket** in a sentence, and in my case, may forget to add the closing tag where it belongs... Got it?And talk about time involved, the tags (identifiers of comments) are manually entered. Oh boy. And if the SAME user is adding the tags. Then to me this is a training issue. Tell them to use ONE type of "tag".
And, as I said, I do not want to use ONE type of "tag", as it might be used as an internal code by translation software. This is a risk I don't want to take, as I do not know/use/have control on the software used to produce the Word file. Let's assume one moment that ## ## is an internal tag used by such software. What would happen to the file once it is exported into Word? I have absolutely no idea! Maybe the text would be underlined, or put in upperscript? Who knows? I'd rather let this open, so that any evolution made to translation software won't impact my code.
Don't laugh. Government might be involved behind this as well.This reminds me of the silliness in documentation and translation when I worked for the federal government.
Currently, I am working with the headers. But yes, the main body is also involved, just like any other part of the file (textboxes, end notes, etc.).Oh and going back (again) WHY is this going on in the headers. Do you have code to deal with the main body. I am assuming this stuff is going on there as well.
Last edited by glencoe; 02-08-2014 at 06:08 PM. Reason: added explanation
gmaxey, you will need to explain me a bit what you are doing in your code, as this is beyond my understanding...
I don't understand why, but the line Set oRng = oHF.Range deletes the content of my first header, while placing the mouse on "oHF.Range" says it is = "".
So I updated this line to if oHF.Range <> "" then Set oRng = oHF.Range
The setting then works.
Now, I am not very knowledgeable with wildcards, so I am not sure how your .Text works. Same for .Replacement. What is the \2 supposed to do?
Actually, I tried your code and it does work with a tag that is fixed (same opening and closing tag), but as soon as I use different opening/closing tags, the code doesn't work as expected. Of course, I would then have strTag# = "#<" and strTag2 = "#>" and the .Text would be updated to .Text = "(" & strTag1 & ")(*)(" & strTag2 & ")"
Please advise.