View Full Version : [SOLVED:] Extract Numeric Substring From longer String
I have a rather long string, actually a short paragraph, from which I want to extract a numeric string and put it into a new string.
The length of the numeric string will vary but between 7 and 12 digits and it will be the only numeric string of this length within the longer main string.
This will be a recurring scenario that I need to regularly run.
I've been looking through various string manipulation methods but don't see a solution.
Note, although I posted this in Excel thread I want to work with the string in VBA, not interact with a worksheet to resolve the string.
Any assistance?
Paul_Hossler
04-25-2017, 06:37 PM
I'd try something like this
Option Explicit
Sub test()
Dim s As String
s = "asfadsfasdf12345678qwerqwerqwer"
MsgBox ExtractNumbers(s)
s = "12345678qwerqwerqwer"
MsgBox ExtractNumbers(s)
s = "asfadsfasdf12345678"
MsgBox ExtractNumbers(s)
s = "asfadsfasdf123qwerqwerqwer"
MsgBox ExtractNumbers(s)
s = "asfadsfasdf12345678qwer987654321qwerqwer"
MsgBox ExtractNumbers(s)
End Sub
Function ExtractNumbers(s As String) As String ' could be a Long
Dim s1 As String
Dim i As Long
s1 = s
'remove letters at left
Do While Not Left(s1, 1) Like "[0-9]" And Len(s1) > 0
s1 = Right(s1, Len(s1) - 1)
Loop
'remove letters at right
Do While Not Right(s1, 1) Like "[0-9]" And Len(s1) > 0
s1 = Left(s1, Len(s1) - 1)
Loop
If Len(s1) >= 7 And Len(s1) <= 12 And IsNumeric(s1) Then
ExtractNumbers = s1
End If
End Function
Thank you for that.
I tried an approximation of the string I will be using (below) but got a blank message box. Trying to extract the "50994121"
s = "Form 4 Formule 4 CERTIFICATE OF OWNERSHIP CERTIFICAT ENREGISTRE Land Act,S.N. 1981, c. L-1.1, s.63 Loi sur l'enregistrement foncier, L.N.B. de 1981, chap. L-11, art.33 wer 50994121 Owner Propriétaire"
mikerickson
04-25-2017, 07:33 PM
There are several numerals in the string that in post #3.
What value do you want returned from that string?
Trying to extract "50994121"
Will always want to extract the number having between 7 and 12 digits
Another approach might be to find the number between key words as they will always be as below:
"art.33 wer 50994121 Owner Propriétaire"
Or maybe Regex, I have been reading but am just not familiar with it yet.
There is probably a clever way to do it using regular expressions, but you can also parse the string as a collection of words:
Sub Test()
Dim I As Long, J As Long, WCnt As Long
Dim S As String, WordStr As String, ResultStr As String
S = "Form 4 Formule 4 CERTIFICATE OF OWNERSHIP CERTIFICAT ENREGISTRE Land Act,S.N. 1981, c. L-1.1, s.63 Loi sur l'enregistrement foncier, L.N.B. de 1981, chap. L-11, art.33 wer 50994121 Owner Propriétaire"
ResultStr = ""
WCnt = WordCount(S, " ")
For I = 1 To WCnt
WordStr = ExtractWord(S, " ", I)
J = Len(WordStr)
If J >= 7 And J <= 12 And IsNumeric(WordStr) Then
ResultStr = WordStr
End If
Next I
If ResultStr <> "" Then
MsgBox "Result = " & ResultStr, , "Test Result"
Else
MsgBox "String not found", , "Test Result"
End If
End Sub
Function ExtractWord(ByVal AnyString As String, ByVal WordDelimiters As String, ByVal WordNumber As Long) As String
Dim SA() As String
Dim ResultWord As String
NormalizeDelimiters AnyString, WordDelimiters 'reduce multiple word delimiters to a single delimiter
AnyString = CollapseString(AnyString, WordDelimiters)
SA = Split(AnyString, WordDelimiters)
If WordNumber > 0 And WordNumber <= UBound(SA) + 1 Then
ResultWord = SA(WordNumber - 1)
Else
ResultWord = vbNullString
End If
ExtractWord = ResultWord
End Function
Function WordCount(ByVal AnyString As String, ByVal WordDelimiters As String) As Long
Dim SA() As String
NormalizeDelimiters AnyString, WordDelimiters 'reduce multiple word delimiters to a single delimiter
AnyString = CollapseString(AnyString, WordDelimiters)
SA = Split(AnyString, WordDelimiters)
WordCount = UBound(SA) + 1
End Function
'Replace multiple word delmiters with one word delimiter
Sub NormalizeDelimiters(ByRef AnyString As String, ByRef WordDelimiters As String)
Dim Delimiter As String
Dim SLen As Long
Dim I As Long
SLen = Len(WordDelimiters)
If SLen > 1 Then
If InStr(WordDelimiters, Chr(30)) > 0 Then 'preference for space char
Delimiter = Chr(30)
Else
Delimiter = Left(WordDelimiters, 1)
End If
For I = 1 To SLen
AnyString = VBA.Replace(AnyString, Mid(WordDelimiters, I, 1), Delimiter)
Next I
WordDelimiters = Delimiter
End If
End Sub
'Remove multiple delimiters in any string
Function CollapseString(ByVal AnyString As String, ByVal Delimiter As String) As String
Dim MultiDelims As String
Dim DCnt As Long 'Delimiter count
Dim SPos As Long
For DCnt = 20 To 2 Step -2
MultiDelims = String(DCnt, Delimiter)
SPos = InStr(AnyString, MultiDelims)
If SPos > 0 Then
AnyString = Replace(AnyString, MultiDelims, Delimiter)
End If
Next
CollapseString = AnyString
End Function
p45cal
04-26-2017, 02:30 AM
Another approach might be to find the number between key words as they will always be as below:
"art.33 wer 50994121 Owner Propriétaire"
In that case try:
Sub test()
Dim s As String
s = "Form 4 Formule 4 CERTIFICATE OF OWNERSHIP CERTIFICAT ENREGISTRE Land Act,S.N. 1981, c. L-1.1, s.63 Loi sur l'enregistrement foncier, L.N.B. de 1981, chap. L-11, art.33 wer 50994121 Owner Propriétaire"
MsgBox GetMyNumber(s)
End Sub
Function GetMyNumber(s)
GetMyNumber = Application.Trim(Split(Split(s, " wer ")(1), " Owner ")(0))
End Function
Or maybe Regex, I have been reading but am just not familiar with it yet.…another minefield!
Really appreciate the responses - both are working.
p45cal - Your function is very concise - which is great of course, and I get the general idea of how it works. But could you briefly explain what is happening, especially with respect to the indexes (1) and (0). Thanks
At the risk of speaking for p45cal, the Split function returns an array of strings. (0) and (1) refer to specific strings in the zero-based array. Think of it like this:
String1 = Split(s, " wer ")(1)
String2 = Split(String1, " Owner ")(0)
GetMyNumber = Application.Trim(String2)
Paul_Hossler
04-26-2017, 06:34 AM
Thank you for that.
I tried an approximation of the string I will be using (below) but got a blank message box. Trying to extract the "50994121"
s = "Form 4 Formule 4 CERTIFICATE OF OWNERSHIP CERTIFICAT ENREGISTRE Land Act,S.N. 1981, c. L-1.1, s.63 Loi sur l'enregistrement foncier, L.N.B. de 1981, chap. L-11, art.33 wer 50994121 Owner Propriétaire"
You didn't say that the number string would be 'stand alone'
The example string helps
If that's always true, then something like this is simpler
Option Explicit
Sub test()
Dim s As String
s = "Form 4 Formule 4 CERTIFICATE OF OWNERSHIP CERTIFICAT ENREGISTRE Land Act,S.N. 1981, c. L-1.1, s.63 Loi sur l'enregistrement foncier, L.N.B. de 1981, chap. L-11, art.33 wer 50994121 Owner Propriétaire"
MsgBox ExtractNumbers(s)
End Sub
Function ExtractNumbers(s As String) As String ' could be a Long
Dim v As Variant
Dim i As Long
v = Split(s, " ")
For i = LBound(v) To UBound(v)
If Len(v(i)) >= 7 And Len(v(i)) <= 12 And IsNumeric(v(i)) Then
ExtractNumbers = v(i)
Exit Function
End If
Next I
ExtractNumbers = vbNullString
End Function
Thanks to everyone, solved a number of ways - good to see the options. And explained as well -- great!!
mdmackillop
04-26-2017, 02:33 PM
Never answered with a regexp before. This should work if the string is a "word"
Sub test()
Dim strPattern As String
strPattern = "\b[0-9]{7,12}\b"
MsgBox RegxFunc([A1], strPattern)
End Sub
Function RegxFunc(strInput As String, regexPattern As String) As String
Dim regEx As New RegExp
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = regexPattern
End With
If regEx.test(strInput) Then
Set matches = regEx.Execute(strInput)
RegxFunc = matches(0).Value
Else
RegxFunc = "not matched"
End If
End Function
Powered by vBulletin® Version 4.2.5 Copyright © 2025 vBulletin Solutions Inc. All rights reserved.