Help with Regular Expression



samuelimtech
04-07-2014, 05:45 AM
Hi all, the code below works, but I don't actually understand what it's doing, specifically the .Text term. It seems to be looking for something that fits the criteria, but I have no clue what that is.
My task is to edit this slightly, so understanding it is the first step.

thanks


With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Format = False
.Wrap = wdFindStop
.MatchWholeWord = True
.MatchWildcards = True
.MatchCase = True
'Find expressions between matched pairs of double quotes,
'allowing for the fact that 'smart quotes' may not be in use.
.Text = "([MID]{1})([:]{1})([0-9.a-z]{1,})([^t\)]{1,})"

SamT
04-07-2014, 05:42 PM
That is a Regular Expression.

I changed the thread title to get you more help.

Paul_Hossler
04-07-2014, 06:47 PM
.Text = "([MID]{1})([:]{1})([0-9.a-z]{1,})([^t\)]{1,})"


http://en.wikipedia.org/wiki/Regular_expression


As SamT says, that's a RegEx, but I seem to recall that Word uses a slightly non-standard version of one of the RegEx flavors


This is a good write-up on Word wildcards:

http://word.mvps.org/faqs/general/usingwildcards.htm


Taking a stab at it ....

Looks like 4 groups (usually used to return something to .Replace)


([MID]{1}) = Exactly one of either M or I or D

([:]{1}) = Exactly one colon

([0-9.a-z]{1,}) = One or more of 0 to 9, a to z, or a period. The dot is a little confusing to me, but I think Word is using it as a literal period and NOT as the RegEx metacharacter for 'Match any character'

([^t\)]{1,}) = One or more Tabs or right parens. The \ is normally the escape character to allow the paren to be a paren and not a closing token as in (........)
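
For anyone more at home with standard regex, the four groups can be sketched in Python. This is only an approximation: Word's wildcard syntax is not Python's `re`, and the translation assumes that `^t` is Word's code for a tab, that `\)` is a literal right paren, and that `{1,}` means "one or more". The name `WORD_PATTERN_AS_REGEX` and the sample text are made up for illustration.

```python
import re

# Hypothetical Python translation of the Word wildcard pattern.
# Assumptions: ^t = tab, \) = literal ")", {1,} = "+", {1} = no quantifier.
WORD_PATTERN_AS_REGEX = re.compile(r"([MID])(:)([0-9.a-z]+)([\t)]+)")

sample = "see M:123.45\t)\t here"
m = WORD_PATTERN_AS_REGEX.search(sample)
if m:
    # Print each captured group with its number, as a Replace expression would see it.
    for number, text in enumerate(m.groups(), start=1):
        print(number, repr(text))
```

Each printed line corresponds to one of the four parenthesised groups above.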

So I think it would find things like


M:now is the 1234<tab><tab><tab>


Might not be perfect, but should be close

Paul

macropod
04-07-2014, 07:14 PM
FWIW, the '([:]{1})' expression could be reduced to '(:)'; the rest is superfluous, as are the parentheses unless the Replace expression (not posted) requires them.

As for '([0-9.a-z]{1,})', that says to find any string consisting of one or more numbers, periods and/or lower-case letters.
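
The same redundancy exists in standard regex, so the reduction of the colon group can be checked outside Word. A small Python sketch follows; it is an approximation of Word's wildcard behaviour, and the names `long_form`, `short_form` and the sample strings are invented for the check.

```python
import re

# Python approximations of the colon group, before and after the reduction.
long_form = re.compile(r"([:]{1})")
short_form = re.compile(r"(:)")

for text in ["M:abc)", "no colon here", "a::b"]:
    # Collect the (start, end) position of every hit for each spelling.
    long_hits = [m.span() for m in long_form.finditer(text)]
    short_hits = [m.span() for m in short_form.finditer(text)]
    print(repr(text), long_hits == short_hits)
```

Both spellings find a colon at exactly the same positions, so the bracket and the `{1}` add nothing.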

Using "([MID]{1})([:]{1})([0-9.a-z]{1,})([^t\)]{1,})" will therefore return strings like:
M:123.45<tab>)<tab>
I:a.45)<tab>
D:abc)
M:......<tab>
etc. It will not return:
M:now is the 1234<tab><tab><tab>
as spaces are not permitted by the Find expression posted.
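
The lists above can be checked with a rough Python stand-in for the Word pattern. It is a sketch under assumptions, not Word's own engine: `^t` is taken to be a tab, `\)` a literal right paren, and `<tab>` in the examples is written as `\t`. The helper `first_match` is a made-up name.

```python
import re

# Python stand-in for the Word wildcard pattern (assumes ^t = tab, \) = ")").
# Python's re is case-sensitive by default, which mirrors .MatchCase = True.
pattern = re.compile(r"[MID]:[0-9.a-z]+[\t)]+")

def first_match(text):
    """Return the first matched substring, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None

should_match = ["M:123.45\t)\t", "I:a.45)\t", "D:abc)", "M:......\t"]
should_not_match = ["M:now is the 1234\t\t\t"]

for text in should_match + should_not_match:
    print(repr(text), "->", repr(first_match(text)))
```

The last string fails because the space after "now" is neither in the third group's character set nor a tab or right paren.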

Paul_Hossler
04-08-2014, 05:43 AM
etc. It will not return:
M:now is the 1234<tab><tab><tab>
as spaces are not permitted by the Find expression posted.


You are correct, and I was wrong :(

Paul