Solved: FindText and Capitalisation



aishwaryav
09-10-2009, 01:01 AM
Hi,

I've been attempting to convert numbers written in the Indian numerical system into the international system, within text, in Word documents. I look for '<number> cr' and replace it with '<number*10> million'. This works pretty well, but doesn't work for instances of '<number> Cr'. I have set MatchCase to False, and can't figure out what I'm doing wrong. Please help!

Here's a part of the code I'm using:

Selection.Find.ClearFormatting
With Selection.Find
    .Replacement.Text = ""
    .Forward = True
    .Wrap = wdFindContinue
    .Format = False
    .MatchCase = False
    .MatchWholeWord = False
    .MatchAllWordForms = False
    .MatchSoundsLike = False
    .MatchWildcards = True
    Do While .Execute(FindText:="[0-9]{1,}.[0-9]{1,} cr", MatchCase:=False, Wrap:=wdFindContinue, Forward:=True) = True
        With Selection 'convert decimal crores to million
            .MoveLeft Unit:=wdWord, Count:=1, Extend:=wdExtend
            .MoveLeft Unit:=wdCharacter, Count:=1, Extend:=wdExtend
            .Range.Text = Selection.Range.Text * 10
            .MoveEndUntil Cset:="c", Count:=wdForward
            .MoveRight Unit:=wdWord, Count:=1, Extend:=wdMove
            .MoveRight Unit:=wdWord, Count:=1, Extend:=wdExtend
            .Range.Text = "million "
        End With
    Loop
End With

Tinbendr
09-10-2009, 09:53 AM
Change

.MoveRight Unit:=wdWord, Count:=1, Extend:=wdMove
to

.MoveRight Unit:=wdWord, Count:=3, Extend:=wdMove

Word is counting the decimal and the number after it as a word.
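
If you want to see for yourself how Word is counting, select a sample such as "12.5 cr" and run something like the sketch below. ShowWordBoundaries is just a name I made up; it only prints each item in Selection.Words to the Immediate window, with brackets so trailing spaces are visible, and changes nothing in the document.

```vba
Sub ShowWordBoundaries()
    ' Hypothetical helper: lists each "word" in the current selection
    ' exactly as VBA sees it, wrapped in [ ] to expose trailing spaces.
    Dim w As Range
    For Each w In Selection.Words
        Debug.Print "[" & w.Text & "]"
    Next w
End Sub
```

Whatever count it shows for your text is the count the MoveRight/MoveLeft calls have to use.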

fumei
09-10-2009, 10:06 AM
Just in case someone is curious, a crore is 10 000 000 (10 million). It is 100 lakh, another ancient numbering unit.
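
In arithmetic terms: 1 crore = 10,000,000 = 10 million, and 1 lakh = 100,000 = 0.1 million. A minimal sketch of the two conversions follows; the function names are my own, not anything built into Word or VBA.

```vba
' Hypothetical helpers for the two unit conversions.
Function CroresToMillions(ByVal crores As Double) As Double
    ' 1 crore = 10 million
    CroresToMillions = crores * 10
End Function

Function LakhsToMillions(ByVal lakhs As Double) As Double
    ' 1 lakh = 0.1 million
    LakhsToMillions = lakhs / 10
End Function
```

So 2.5 crore is 25 million, and 6 lakh is 0.6 million.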

OK. You can do this. It is better to use range objects. Here is the code. You can see it in the attached document. Click "Crore Me" on the top toolbar.
Sub CroreToMillions()
    Dim r As Range
    Set r = ActiveDocument.Range
    With r.Find
        .ClearFormatting
        .MatchWildcards = True
        .Text = "[0-9]{1,}.[0-9]{1,}"
        Do While .Execute(Forward:=True) = True
            If UCase(r.Next(Unit:=wdWord, Count:=1)) = "CR" Or _
               UCase(r.Next(Unit:=wdWord, Count:=1)) = "CR " Or _
               UCase(r.Next(Unit:=wdWord, Count:=1)) = "CRORES" Then
                r.Text = r.Text * 10
                r.Next(Unit:=wdWord, Count:=1) = "million "
            End If
            r.Collapse 0
        Loop
    End With
End Sub
The key is to do the testing against whatever (cr, Cr, or even a typo like cR) by testing that as uppercase.

"cr" becomes "CR"
"Cr" also becomes "CR"

And you test against THAT.

I included "Crores".
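
A side note on why MatchCase:=False did not help in the original code: as far as I know, Word treats a search as case-sensitive whenever MatchWildcards is True, whatever MatchCase is set to. If that holds for your version, an alternative to the UCase test is to spell out both cases inside the wildcard pattern. The sketch below only lists what it finds in the Immediate window; ListCroreMatches is a made-up name, and the pattern covers just the two-letter "cr" form.

```vba
Sub ListCroreMatches()
    ' Hypothetical sketch: prints matches only, changes nothing.
    Dim r As Range
    Set r = ActiveDocument.Range
    With r.Find
        .ClearFormatting
        .MatchWildcards = True
        .Wrap = wdFindStop
        ' [Cc][Rr] accepts cr, Cr, cR and CR in a wildcard search.
        .Text = "[0-9]{1,}.[0-9]{1,} [Cc][Rr]"
        Do While .Execute(Forward:=True) = True
            Debug.Print r.Text
            r.Collapse wdCollapseEnd   ' move past this match
        Loop
    End With
End Sub
```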

NOTE! The code searches for the numeric part - "[0-9]{1,}.[0-9]{1,}" - ONLY. It then checks the Next word to see if it is CR (again, using UCASE).

If the Next word is CR (or CRORES), it does the * 10 to the range object AND changes the Next word to "million ".

Because of the way Word works, the testing has to be:

"CR" and "CR " - the space following.

"CR"<p> - "CR" followed by a paragraph mark is TWO words to VBA.

"CR " - "CR" followed by a space is ONE word.

Therefore, there is, in fact, an error in my code. The "million" works fine for all instances EXCEPT the last one, as r.Next is followed by a paragraph mark. This can easily be fixed.

But I am not going to do it. It is - or should be - a fairly easy exercise in logic.

Also note that the search string contains a dot.

"[0-9]{1,}.[0-9]{1,}"

If the numeric example - say, 456 Cr - does NOT contain a dot (which it does not), then the code will not do anything.

Can that be fixed? Yes, absolutely. It is, again, a fairly easy exercise in logic. So I am not going to do it.

If you need these refinements - and I think you do - please try to do them on your own. If you have a serious problem, post a question.

N.B. if you take the code and play with it/change it, make sure you include the r.Collapse 0 within the loop. It is critical. The code will NOT work without it.
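
For anyone who gets stuck on those two exercises, here is one possible sketch, not the only way to do it. Assumptions of mine: the pattern "[0-9.]{1,}" is used so that both 456 and 4.56 match; the Next word is compared after trimming spaces, so "Cr" before a paragraph mark and "Cr " before more text are treated alike; Val is used for the arithmetic. It does not handle thousands separators such as 1,000, and CStr writes the decimal separator of your regional settings.

```vba
Sub CroreToMillionsSketch()
    ' Hypothetical variant of CroreToMillions; test on a copy first.
    Dim r As Range, nxt As Range
    Dim unitWord As String
    Set r = ActiveDocument.Range
    With r.Find
        .ClearFormatting
        .MatchWildcards = True
        .Wrap = wdFindStop
        .Text = "[0-9.]{1,}"          ' runs of digits and dots
        Do While .Execute(Forward:=True) = True
            If r.Text Like "*#*" Then   ' skip matches that are only dots
                Set nxt = r.Next(Unit:=wdWord, Count:=1)
                If Not nxt Is Nothing Then
                    unitWord = UCase(Trim(nxt.Text))
                    If unitWord = "CR" Or unitWord = "CRORES" Then
                        ' Shrink the range so any trailing space is kept.
                        nxt.MoveEndWhile Cset:=" ", Count:=wdBackward
                        nxt.Text = "million"
                        r.Text = CStr(Val(r.Text) * 10)
                    End If
                End If
            End If
            r.Collapse wdCollapseEnd    ' still critical
        Loop
    End With
End Sub
```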

aishwaryav
09-10-2009, 10:31 PM
fumei, thanks for helping! I've figured out the <whole number> crores thingie.

I still have a small problem, though. In the documents I'm dealing with, the word crores is spelt in as many ways as you can imagine. So I have Cr/CR, Crs/CRs, Crores/crores, crore/Crore, and, sometimes, "cr."
Also, I've not pasted the whole thing here, but I use a similar method for converting lakhs to million, and with lakhs, the variations in spelling are even worse.

Is the only way to get this conversion done using OR several times, and hoping I've covered all imaginable spellings?

aishwaryav
09-10-2009, 10:34 PM
fumei, thanks for helping, I've figured out the <whole number> crore thingie.

I still have a small problem, though. I use a similar method to convert lakhs to million. And in the documents I'm working with, the word lakh appears in many different forms/spellings. Lac,Lakh,lc...Is using OR several times, the only way to cover all possible spellings?

Thanks again.

aishwaryav
09-10-2009, 10:41 PM
hey, the cursor already moves right to the beginning of the word "crore". So, I don't think what you suggest is the problem.

fumei
09-11-2009, 10:28 AM
I do not understand what you mean by: "So, I don't think what you suggest is the problem."

What problem? In my code the cursor does not move anywhere.

Regarding the possible spellings, nope, sorry, you will have to use a bunch of OR statements.

It may be possible - well, it IS possible - to do a test on the Next word and just test the first two characters. This would work for:

"So I have Cr/CR, Crs/CRs, Crores/crores, crore/Crore, and, sometimes, "cr.""

All those have cr as the first two characters.
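
A minimal sketch of that two-character test, assuming the Next word is passed in as a string. StartsWithCr is a made-up name.

```vba
Function StartsWithCr(ByVal nextWord As String) As Boolean
    ' Hypothetical helper: True when the word begins with "cr" in any case.
    StartsWithCr = (UCase(Left(LTrim(nextWord), 2)) = "CR")
End Function
```

It returns True for "Cr ", "CRs", "crores" and "cr.", but also for any other word that starts with those two letters, such as "crates".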

However, for lakh, as you point out the spelling and/or abbreviations are a lot messier.

Lac,Lakh,lc

There is nothing consistent to work with...so...a bunch of OR statements. Mind you...

You could:

1. test the first character
2. if it is "L" - using UCASE to make sure whatever character is tested as uppercase, then;
3. test the second character
4. if it is "A" (lac, lakh) - or "C" (LC), then you may assume that lakh is meant.

However, while I think that this logic would be accurate most of the time, there is a real possibility that it could not be. Just think of the logic.

Test/search for a numeric string, THEN test the first two characters of the Next word for "L" followed by "A" or "C".

Could there be anything that would NOT be meant as lakh?

Probably not, but as it stands, the logic does not ensure that.
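
The four steps above could be sketched like this. LooksLikeLakh is a made-up name, and the function takes the Next word as a plain string.

```vba
Function LooksLikeLakh(ByVal nextWord As String) As Boolean
    ' Hypothetical helper: first character L, second character A or C.
    Dim s As String
    s = UCase(Trim(nextWord))
    LooksLikeLakh = False
    If Left(s, 1) = "L" Then
        Select Case Mid(s, 2, 1)
            Case "A", "C"
                LooksLikeLakh = True
        End Select
    End If
End Function
```

It accepts "Lac", "lakh" and "lc", but it accepts every other word beginning with la or lc as well, "lane" for instance, which is exactly the gap in the logic.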

Again, be careful whenever you use "word" in Word VBA. In VBA "word" is not exactly what you may think it is.

aishwaryav
09-12-2009, 12:05 AM
fumei, the moving cursor reply was meant for Tinbendr.

Yeah, I think for crores, I'll check the first two characters and for lakhs, I'll use a bunch of OR statements. I also have "6-lane highways" and the like in these docs, so looking for the first two letters won't work.

Thank you so much.

fumei
09-14-2009, 10:19 AM
OK, I think you need to state everything you want to happen. You want to replace "6-lane highways" with something? What? How many of these replacements do you have? Is there a real reason for using VBA? Find and Replace - no VBA - can be very effective.

aishwaryav
09-14-2009, 10:20 PM
fumei,

I'm an editor and work with a bunch of very tired editors who don't want to keep changing crores/lakhs to millions manually in every document. We work on several documents in a day, of varying lengths and degrees of boredom. I don't want to replace "6 lane highways" with anything. I don't know how you concluded that. I was saying that if I look for only "<number> LA" when looking for lakhs, in order to avoid having a bunch of OR statements with "<number> LAC", "<number> LAKH", "<number> LAC " and so on, it would not work because I also have terms like "6 lane highways" and "60 laboratories" in the documents, which would also get converted.

Thanks for the help. I've resolved the issue with your help.

and, please, breathe.

fumei
09-16-2009, 09:07 AM
I am breathing fine, thank you. Why do you say that? Because I ask questions about what you write? You are not even consistent in the SAME post. Look in the one above.

"I don't want to replace "6 lane highways" with anything. I don't know how you concluded that. "

and

"because I also have terms like "6 lane highways" and "60 laboratories" in the documents, which would also get converted. "

THAT is how. Although converted to what, I have no idea.

Oh well. Yes, I am breathing.

lucas
09-16-2009, 09:30 AM
fumei,

Thanks for the help. I've resolved the issue with your help.

and, please, breathe.

And so you thank him by being insulting?

aishwaryav
09-16-2009, 08:32 PM
Right, this has become a bigger issue than I imagined it would. I didn't mean to insult fumei. I just felt that fumei jumps to conclusions, but it's probably truer that I don't communicate properly. In any case, fumei has helped me solve the problem, and I'm thankful. I was only trying to tell fumei that because "6 lane" and "60 laboratories" have the <number> <word starting with la> construction, they would also get converted to millions if I looked for only "<number> la", and I don't want that to happen. Because, then, "6 lane highway" would be "0.6 million highway", which is just wrong. I never said that I wanted them converted to anything. Right now, I can't even remember why I bothered mentioning this issue in the reply. Probably because fumei had said, "All those have cr as the first two characters", which got me thinking about how the "lakhs", "lac", "lacs", and "lakh" in my docs have "la" as the first two characters.

fumei
09-17-2009, 08:36 AM
Ah, OK, that makes sense. Misunderstanding. The answer to that is to expand the testing length beyond 2 characters.

Because, yes, of course you do not want 6 (numeric) followed by "la"ne highway (2 characters) to become 0.6 million highway.

I think you are stuck with a bunch of OR statements for lakh/lac/lacs/lakhs/lcs etc. 'Tis a brute force method, but in your circumstance, there is not much in the way of other options.

carrrnuttt
09-24-2009, 06:28 PM
I still have a small problem, though. I use a similar method to convert lakhs to million. And in the documents I'm working with, the word lakh appears in many different forms/spellings. Lac,Lakh,lc...Is using OR several times, the only way to cover all possible spellings?

Why not use a Select statement, then pass the matching value to a variable you check against in your conditional code?

Or even a Function that returns a Boolean if any combination is matched or not. No need for an Or.
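
Something along these lines, as a rough sketch. IsLakhWord is a made-up name, and the list of spellings is only what has been mentioned in this thread, so extend it as needed.

```vba
Function IsLakhWord(ByVal nextWord As String) As Boolean
    ' Hypothetical helper: True only for known spellings of lakh.
    Dim s As String
    s = UCase(Trim(nextWord))
    ' Drop one trailing full stop so "lac." is treated like "lac".
    If Right(s, 1) = "." Then s = Left(s, Len(s) - 1)
    Select Case s
        Case "LAC", "LACS", "LAKH", "LAKHS", "LC", "LCS"
            IsLakhWord = True
        Case Else
            IsLakhWord = False
    End Select
End Function
```

Because it matches whole words rather than a prefix, "lane" and "laboratories" return False. The conditional in the loop then becomes a single call such as If IsLakhWord(r.Next(Unit:=wdWord, Count:=1).Text) Then.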

fumei
09-28-2009, 12:02 PM
I agree. A Boolean Function would probably be the best route.