Solved: Regex Query



cosmarchy
07-08-2008, 04:48 AM
Hello all,

I am trying to filter through files of a specific format and have a regex query of [A-Za-z][0-9][0-9][0-9][0-9][0-9][0-9]

Unfortunately I am also picking up files with this name abc123456def but I only want c123456.

I am assuming that it is doing this because c123456 is in the middle of abc123456def...
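That guess can be checked with a minimal sketch. It is written in Python, which is an assumption, since the thread does not say which regex engine is in use; the variable names are illustrative only. An unanchored search is satisfied by a match anywhere inside the name:

```python
import re

# The original, unanchored pattern from the question.
pattern = re.compile(r"[A-Za-z][0-9][0-9][0-9][0-9][0-9][0-9]")

# search() scans the whole string, so a match buried inside a longer
# name still counts as a hit.
hit = pattern.search("abc123456def")
matched_text = hit.group(0) if hit else None
print(matched_text)
```

The search succeeds on the "c123456" portion in the middle of the longer name, which is why the unwanted file is picked up.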

How do I limit this to only pick up the exact regex expression?

Any help is appreciated.

Thanks

Simon Lloyd
07-08-2008, 05:17 AM
Have you tried: [Cc][0-9][0-9][0-9][0-9][0-9][0-9] or [C-Cc-c][0-9][0-9][0-9][0-9][0-9][0-9]

cosmarchy
07-08-2008, 06:02 AM
Thanks Simon but unfortunately it did not work!!

Simon Lloyd
07-08-2008, 06:37 AM
Maybe it's: [.C][0-9][0-9][0-9][0-9][0-9][0-9]

Simon Lloyd
07-08-2008, 06:38 AM
Look here (http://etext.virginia.edu/services/helpsheets/unix/regex.html) half way down!

RichardSchollar
07-08-2008, 09:00 AM
Hi

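^ anchors the regex at the start of the string being tested ($ anchors it at the end), so your pattern should be:

^[A-Za-z]\d{6}

If nothing at all may follow the six digits, add the end anchor as well: ^[A-Za-z]\d{6}$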
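As a rough illustration of how anchoring changes the result, here is a small sketch in Python (an assumed engine; the helper name `matches` is made up for this example). In most regex flavours ^ pins the match to the start of the string and $ pins it to the end:

```python
import re

# Anchored at the start only: the name must BEGIN with letter + 6 digits.
start_only = re.compile(r"^[A-Za-z]\d{6}")

# Anchored at both ends: the name must be exactly letter + 6 digits.
whole_name = re.compile(r"^[A-Za-z]\d{6}$")

def matches(pattern, name):
    """Return True if the compiled pattern matches the given name."""
    return pattern.search(name) is not None

for name in ["c123456", "c123456def", "abc123456def"]:
    print(name, matches(start_only, name), matches(whole_name, name))
```

With the start anchor alone, "abc123456def" is rejected but "c123456def" is still accepted; with both anchors only "c123456" passes.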

Richard

Simon Lloyd
07-08-2008, 09:08 AM
Richard, I know nothing about regex, as you can tell, but could you explain why your suggestion will only pick out the C's and why {6} will find digits 0-9 (although I understand that it probably means 6 characters after the letter(s))?

mdmackillop
07-08-2008, 09:55 AM
KB Item (http://www.vbaexpress.com/kb/getarticle.php?kb_id=68) on RegExp

RichardSchollar
07-09-2008, 12:43 AM
Hi Simon

It won't just pick out the Cs. It matches any string under test that begins with a single alphabetic character (upper or lower case doesn't matter) immediately followed by six numeric digits (as you rightly identified). Anything may come after those six digits.

Hence:

A000001.xls

Z123456RichardFile.txt

Q65432199999.xls

will all result in a match, but:

123456tyu.xls

ABC123456.xls

won't as they don't meet the specified pattern.
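The list above can be checked with a short sketch, assuming Python's `re` module and a pattern anchored at the start of the name only; the variable names are illustrative:

```python
import re

# Letter followed by six digits, anchored at the start of the name only.
pattern = re.compile(r"^[A-Za-z]\d{6}")

filenames = [
    "A000001.xls",
    "Z123456RichardFile.txt",
    "Q65432199999.xls",
    "123456tyu.xls",
    "ABC123456.xls",
]

# Map each file name to True (match) or False (no match).
results = {name: pattern.search(name) is not None for name in filenames}

for name, ok in results.items():
    print(f"{name}: {'match' if ok else 'no match'}")
```

The first three names match because each starts with one letter and six digits; the last two fail because the first starts with a digit and the second has three letters before the digits.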

\d is a shortcut expression for [0-9]. The {6} says that the preceding item must occur exactly 6 times. The same braces can also specify a minimum number of characters:

\d{2,}

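a maximum number of characters (written with an explicit lower bound of 0, because not every regex engine accepts an omitted minimum such as {,3}):

\d{0,3}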

or a specified range of characters:

\d{2,3}

(which is 'between 2 and 3 numeric characters').
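A quick way to see the three quantifier forms in action is the sketch below. It assumes Python's `re` module and uses `fullmatch` so that the whole test string has to satisfy the pattern; the helper name `fits` is invented for this example:

```python
import re

def fits(pattern, text):
    """Return True if the ENTIRE text satisfies the pattern."""
    return re.fullmatch(pattern, text) is not None

at_least_two = r"\d{2,}"    # two or more digits
at_most_three = r"\d{0,3}"  # zero to three digits
two_or_three = r"\d{2,3}"   # two or three digits

for text in ["1", "12", "123", "1234"]:
    print(text,
          fits(at_least_two, text),
          fits(at_most_three, text),
          fits(two_or_three, text))
```

Note that a quantifier on its own does not stop extra digits from appearing around the match; the limits only become strict when the pattern is anchored or tested against the whole string, as `fullmatch` does here.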

Richard