matthewspatrick
01-24-2006, 10:01 AM
I use the following function to put a nice little wrapper on RegExp.Find:
Function RegExpFind(LookIn As String, PatternStr As String, Optional Pos)
' This function uses Regular Expressions to parse a string (LookIn), and return matches to a
' pattern (PatternStr). Use Pos to indicate which match you want:
' Pos omitted : function returns a zero-based array of all matches
' Pos = 0 : the last match
' Pos = 1 : the first match
' Pos = 2 : the second match
' Pos = <positive integer> : the Nth match
' If Pos is greater than the number of matches, is negative, or is non-numeric, the function
' returns an empty string. If no match is found, the function returns an empty string
' If you use this function in Excel, you can use range references for any of the arguments.
' If you use this in Excel and return the full array, make sure to set up the formula as an
' array formula. If you need the array formula to go down a column, use TRANSPOSE()
Dim re As Object
Dim TheMatches As Object
Dim Answer() As String
Dim Counter As Long
' Evaluate Pos. If it is there, it must be numeric and converted to Long
If Not IsMissing(Pos) Then
If Not IsNumeric(Pos) Then
RegExpFind = ""
Exit Function
Else
Pos = CLng(Pos)
End If
End If
' Create instance of RegExp object
Set re = CreateObject("VBScript.RegExp")
With re
.Pattern = PatternStr
.Global = True
End With
' Test to see if there are any matches
If re.test(LookIn) Then
' Run RegExp to get the matches, which are returned as a zero-based collection
Set TheMatches = re.Execute(LookIn)
' If Pos is missing, user wants array of all matches. Build it and assign it as the
' function's return value
If IsMissing(Pos) Then
ReDim Answer(0 To TheMatches.Count - 1) As String
For Counter = 0 To UBound(Answer)
Answer(Counter) = TheMatches(Counter)
Next
RegExpFind = Answer
' User wanted the Nth match (or last match, if Pos = 0). Get the Nth value, if possible
Else
Select Case Pos
Case 0 ' Last match
RegExpFind = TheMatches(TheMatches.Count - 1)
Case 1 To TheMatches.Count ' Nth match
RegExpFind = TheMatches(Pos - 1)
Case Else ' Invalid item number
RegExpFind = ""
End Select
End If
' If there are no matches, return empty string
Else
RegExpFind = ""
End If
' Release object variables
Set re = Nothing
Set TheMatches = Nothing
End Function
I tried to use it on a string like this; I want to grab just the www.msn.com (http://www.msn.com) part:
http://www.msn.com/home.asp
In Excel:
=RegExpFind(A1,"//[^/]*/",1)
That gets me:
//www.msn.com/
I can kill the slashes this way:
=SUBSTITUTE(RegExpFind(A1,"//[^/]*/",1),"/","")
However, I would rather do it by setting up the pattern string correctly. Can someone help? :help
Patrick
Function RegExpFind(LookIn As String, PatternStr As String, Optional Pos)
' This function uses Regular Expressions to parse a string (LookIn), and return matches to a
' pattern (PatternStr). Use Pos to indicate which match you want:
' Pos omitted : function returns a zero-based array of all matches
' Pos = 0 : the last match
' Pos = 1 : the first match
' Pos = 2 : the second match
' Pos = <positive integer> : the Nth match
' If Pos is greater than the number of matches, is negative, or is non-numeric, the function
' returns an empty string. If no match is found, the function returns an empty string
' If you use this function in Excel, you can use range references for any of the arguments.
' If you use this in Excel and return the full array, make sure to set up the formula as an
' array formula. If you need the array formula to go down a column, use TRANSPOSE()
Dim re As Object
Dim TheMatches As Object
Dim Answer() As String
Dim Counter As Long
' Evaluate Pos. If it is there, it must be numeric and converted to Long
If Not IsMissing(Pos) Then
If Not IsNumeric(Pos) Then
RegExpFind = ""
Exit Function
Else
Pos = CLng(Pos)
End If
End If
' Create instance of RegExp object
Set re = CreateObject("VBScript.RegExp")
With re
.Pattern = PatternStr
.Global = True
End With
' Test to see if there are any matches
If re.test(LookIn) Then
' Run RegExp to get the matches, which are returned as a zero-based collection
Set TheMatches = re.Execute(LookIn)
' If Pos is missing, user wants array of all matches. Build it and assign it as the
' function's return value
If IsMissing(Pos) Then
ReDim Answer(0 To TheMatches.Count - 1) As String
For Counter = 0 To UBound(Answer)
Answer(Counter) = TheMatches(Counter)
Next
RegExpFind = Answer
' User wanted the Nth match (or last match, if Pos = 0). Get the Nth value, if possible
Else
Select Case Pos
Case 0 ' Last match
RegExpFind = TheMatches(TheMatches.Count - 1)
Case 1 To TheMatches.Count ' Nth match
RegExpFind = TheMatches(Pos - 1)
Case Else ' Invalid item number
RegExpFind = ""
End Select
End If
' If there are no matches, return empty string
Else
RegExpFind = ""
End If
' Release object variables
Set re = Nothing
Set TheMatches = Nothing
End Function
I tried to use it on a string like this; I want to grab just the www.msn.com (http://www.msn.com) part:
http://www.msn.com/home.asp
In Excel:
=RegExpFind(A1,"//[^/]*/",1)
That gets me:
//www.msn.com/
I can kill the slashes this way:
=SUBSTITUTE(RegExpFind(A1,"//[^/]*/",1),"/","")
However, I would rather do it by setting up the pattern string correctly. Can someone help? :help
Patrick