Thread: Regexp question

  1. #1
    Knowledge Base Approver
    The King of Overkill!
    VBAX Master
    Joined
    Jul 2004
    Location
    Rochester, NY
    Posts
    1,727

    Regexp question

    Hi all,

    Looking to see how to create a pattern for regular expressions to say "does not include the string ____"

    More specifically, I'm looking for a single pattern to say "starts with 'abc', ends with 'ghi', but does not contain the string 'def'"
    Something along the lines of "abc.*^(def).*ghi", which obviously won't work.

    Where "abcdefghi" will not be a match, but "abcfedghi" would

    Any ideas?
    Matt

  2. #2
    VBAX Contributor Aaron Blood
    Joined
    Sep 2004
    Location
    Palm Beach, Florida, USA
    Posts
    130
    Pretty easy to do in VBA using the LIKE operator.

    You would just have to test two conditions instead of one.

    Sub test()
        Dim txt$, result As Boolean
        txt$ = ActiveCell.Value
        If txt$ Like "abc*ghi" And Not txt$ Like "*def*" Then
            result = True
        Else
            result = False
        End If
        MsgBox result
    End Sub


    If you wanted to do it in a cell formula you could probably make it work with FIND or SEARCH in conjunction with the LEFT and RIGHT functions. Or you could wrap the LIKE operator as a UDF like so...


    Function TextLike(Text As String, Filter As String) As Boolean
         TextLike = Text Like Filter
    End Function

    Then the cell formula would be something like...

    =textlike(A1,"abc*ghi")*NOT(textlike(A1,"*def*"))
    ...which would return 1 for true and 0 for false.

  3. #3
    Knowledge Base Approver
    The King of Overkill! VBAX Master
    Thanks Aaron,
    I do understand that, but I'm trying to find a one-line regex pattern that can do this. The user needs to use regexp, and can't loop through or use Like or anything.
    The actual use is to import a webpage as a string, and find tags starting with <img and ending in > without alt anywhere in the middle. I'm just having trouble finding the exact pattern to use.

  4. #4
    VBAX Contributor Aaron Blood
    OIC... so this isn't an Excel question. Haven't used it... but I imagine there's a regexp forum somewhere you can post to.

    This site seems to have some syntax listed.
    http://www.greenend.org.uk/rjk/2002/06/regexp.html

    Good luck.

  5. #5
    Knowledge Base Approver
    The King of Overkill! VBAX Master
    Thanks. I know it's not really an Excel question, but since I'm using Excel for it, and so many people here are very smart, I thought it might be worth a shot.
    Personally, I don't think it can be done, but I'm often outsmarted by regex.
    Thanks again!

  6. #6
    VBAX Contributor Aaron Blood
    Only seen a few scattered posts on the topic... but not too much on the XL boards.

    Maybe one of the VB boards.

  7. #7
    VBAX Expert brettdj
    Joined
    May 2004
    Location
    Melbourne
    Posts
    649
    Hi Matt,

    Matching "not this string", as opposed to "not this character", is a problem with VBScript regexp. Perl offers a negative look-ahead that lets you do this. I've posted the Perl documentation below, which is likely to annoy you once you see that it is exactly what you want but VBScript won't let you do it.

    "(?!pattern)"
    A zero-width negative look-ahead assertion. For example
    "/foo(?!bar)/" matches any occurrence of "foo" that isn't
    followed by "bar".

    "(?<!pattern)"
    A zero-width negative look-behind assertion. For example
    "/(?<!bar)foo/" matches any occurrence of "foo" that does not
    follow "bar". Works only for fixed-width look-behind.
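A minimal sketch of the negative look-ahead applied to the original question and to the <img> case. The function names MatchesWithoutDef and CountImgWithoutAlt are made up for illustration. The main assumption is that the engine accepts (?!...): Microsoft's documentation for VBScript Regular Expressions 5.5 lists look-ahead (though not look-behind) as supported, so this may well run from VBA, but it is worth testing on your own install before relying on it.

```vba
' Hypothetical helper: True when s starts with abc, ends with ghi,
' and has no "def" anywhere after the leading abc (single-line strings).
Function MatchesWithoutDef(ByVal s As String) As Boolean
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    ' (?!.*def) is checked right after "abc": the match fails if "def" occurs later
    re.Pattern = "^abc(?!.*def).*ghi$"
    MatchesWithoutDef = re.Test(s)
End Function

' Hypothetical helper: counts <img ...> tags that carry no alt attribute.
Function CountImgWithoutAlt(ByVal html As String) As Long
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.IgnoreCase = True
    ' After "<img", refuse to match if " alt=" appears before the closing ">"
    re.Pattern = "<img\b(?![^>]*\salt\s*=)[^>]*>"
    CountImgWithoutAlt = re.Execute(html).Count
End Function
```

If the look-ahead is accepted, MatchesWithoutDef("abcdefghi") should come back False and MatchesWithoutDef("abcfedghi") True. The <img> pattern is only an approximation: a ">" or " alt=" inside a quoted attribute value would fool it.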

    But back to VBScript.

    If you try this
    ^(abc).+[^def].+ghi$
    the regular expression is actually looking for
    start string.....abc...any one character other than d, e or f....ghi...end string
    not
    start string.....abc...anything but def....ghi...end string

    and if you try making the string a submatch, you will find the regexp
    merely adds "(" and ")" to the set of characters not to match
    ^(abc).+[^(def)].+ghi$
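A quick way to see the character-class behaviour for yourself. This is only a sketch: the Sub name ShowCharacterClass and the test string are made up for illustration, and the output goes to the Immediate window.

```vba
Sub ShowCharacterClass()
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    ' "def" is present, yet the class is satisfied by the single character "x"
    re.Pattern = "^(abc).+[^def].+ghi$"
    Debug.Print re.Test("abcdefxyghi")

    ' The class now also excludes "(" and ")"; "def" still gets through
    re.Pattern = "^(abc).+[^(def)].+ghi$"
    Debug.Print re.Test("abcdefxyghi")
End Sub
```

Both lines should print True, which shows that neither pattern rejects a string containing "def".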

    so you could use two regexps, ie


    Sub TestTwoPatterns()
        Dim RegEx As Object
        Dim TestStr As String
        'TestStr = "abcdeftmeghi" ' invalid: contains "def"
        TestStr = "abcdeghi" ' valid
        'TestStr = "abcghi" ' invalid as there must be at least one character between abc and ghi
        Set RegEx = CreateObject("vbscript.regexp")
        With RegEx
            .Pattern = "(.+)(def)(.+)"
            .Global = False
            .MultiLine = True
            ' Test for "def" with at least one character on either side
            ' ("def" cannot start or finish a string that begins with abc and ends with ghi)
            If .Test(TestStr) = False Then
                ' No "def" was found, so this Replace leaves TestStr unchanged
                TestStr = .Replace(TestStr, "$1$3")
                ' Now test for "abc" at the front and "ghi" at the end
                .Pattern = "^(abc).+ghi$"
                MsgBox "String test is " & .Test(TestStr)
            Else
                MsgBox "String not tested as it contains ""def"" somewhere between the first and last characters"
            End If
        End With
    End Sub
    Now, gimme points

    Cheers

    Dave

  8. #8
    Knowledge Base Approver
    The King of Overkill! VBAX Master
    Thanks Dave,

    I was trying to use Excel to do this, to test different patterns for the user. Makes sense as to why I couldn't do it; too bad we can't use Perl syntax in Excel. The user is going to be using an editing program that allows different syntaxes (but doesn't allow scripting, for whatever reason), so I'm gonna go see if that will work or not. Personally I'd be glad to be rid of this whole situation, so I'm really hoping it will work!

    I've got some time tonight to play around with your code, and have been paying attention to the refedit thread. I'll see if I can't find some of the more obscure errors and possible workarounds for you

    "Now, gimme points"
    Where am I, the lounge?

  9. #9
    VBAX Expert brettdj
    Matt,

    You might want to try

    ^(abc).*ghi$
    rather than
    ^(abc).+ghi$

    if you wanted a string such as
    "abcghi"
    to be passed

    ^(abc).+ghi$
    requires
    abcXghi
    where X is one or more characters to be passed
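The difference between the two quantifiers can be checked with a short sketch like this one (the Sub name CompareQuantifiers is made up for illustration; output goes to the Immediate window):

```vba
Sub CompareQuantifiers()
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    ' .+ needs at least one character between abc and ghi
    re.Pattern = "^(abc).+ghi$"
    Debug.Print re.Test("abcghi")

    ' .* also accepts an empty middle
    re.Pattern = "^(abc).*ghi$"
    Debug.Print re.Test("abcghi")
End Sub
```

The first test should print False and the second True.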

    Thanks for looking at the code and refedit

    Cheers

    Dave
