Consulting

Results 1 to 13 of 13

Thread: Extract Numeric Substring From longer String

  1. #1
    VBAX Regular
    Joined
    Apr 2017
    Posts
    66
    Location

    Extract Numeric Substring From longer String

    I have a rather long string, actually a short paragraph, from which I want to extract a numeric string and put it into a new string.

    The length of the numeric string will vary but between 7 and 12 digits and it will be the only numeric string of this length within the longer main string.

    This will be a recurring scenario that I need to regularly run.

    I've been looking through various string manipulation methods but don't see a solution.

    Note, although I posted this in Excel thread I want to work with the string in VBA, not interact with a worksheet to resolve the string.

    Any assistance?
    Last edited by BrI; 04-25-2017 at 06:16 PM.

  2. #2
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,724
    Location
    I'd try something like this


    Option Explicit
    
    Sub test()
        Dim s As String
        s = "asfadsfasdf12345678qwerqwerqwer"
        MsgBox ExtractNumbers(s)
    
        s = "12345678qwerqwerqwer"
        MsgBox ExtractNumbers(s)
        s = "asfadsfasdf12345678"
        MsgBox ExtractNumbers(s)
        s = "asfadsfasdf123qwerqwerqwer"
        MsgBox ExtractNumbers(s)
        s = "asfadsfasdf12345678qwer987654321qwerqwer"
        MsgBox ExtractNumbers(s)
    
    End Sub
    
    
    Function ExtractNumbers(s As String) As String  '   could be a Long
        Dim s1 As String
        Dim i As Long
        
        s1 = s
            
        'remove letters at left
        Do While Not Left(s1, 1) Like "[0-9]" And Len(s1) > 0
            s1 = Right(s1, Len(s1) - 1)
        Loop
        
        'remove letters at right
        Do While Not Right(s1, 1) Like "[0-9]" And Len(s1) > 0
            s1 = Left(s1, Len(s1) - 1)
        Loop
        
        If Len(s1) >= 7 And Len(s1) <= 12 And IsNumeric(s1) Then
            ExtractNumbers = s1
        End If
    End Function
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  3. #3
    VBAX Regular
    Joined
    Apr 2017
    Posts
    66
    Location
    Thank you for that.

    I tried an approximation of the string I will be using (below) but got a blank message box. Trying to extract the "50994121"

    s = "Form 4 Formule 4 CERTIFICATE OF OWNERSHIP CERTIFICAT ENREGISTRE Land Act,S.N. 1981, c. L-1.1, s.63 Loi sur l'enregistrement foncier, L.N.B. de 1981, chap. L-11, art.33 wer  50994121  Owner  Propriétaire"

  4. #4
    Mac Moderator VBAX Guru mikerickson's Avatar
    Joined
    May 2007
    Location
    Davis CA
    Posts
    2,778
    There are several numerals in the string that in post #3.
    What value do you want returned from that string?

  5. #5
    VBAX Regular
    Joined
    Apr 2017
    Posts
    66
    Location
    Trying to extract "50994121"
    Will always want to extract the number having between 7 and 12 digits

  6. #6
    VBAX Regular
    Joined
    Apr 2017
    Posts
    66
    Location
    Another approach might be to find the number between key words as they will always be as below:

    "art.33 wer 50994121 Owner Propriétaire"

    Or maybe Regex, I have been reading but am just not familiar with it yet.

  7. #7
    There is probably a clever way to do it using regular expressions, but you can also parse the string as a collection of words:

    Sub Test()
        Dim I As Long, J As Long, WCnt As Long
        Dim S As String, WordStr As String, ResultStr As String
    
        S = "Form 4 Formule 4 CERTIFICATE OF OWNERSHIP CERTIFICAT ENREGISTRE Land Act,S.N. 1981, c. L-1.1, s.63 Loi sur l'enregistrement foncier, L.N.B. de 1981, chap. L-11, art.33 wer  50994121  Owner  Propriétaire"
    
        ResultStr = ""
        WCnt = WordCount(S, " ")
        For I = 1 To WCnt
            WordStr = ExtractWord(S, " ", I)
            J = Len(WordStr)
            If J >= 7 And J <= 12 And IsNumeric(WordStr) Then
                ResultStr = WordStr
            End If
        Next I
    
        If ResultStr <> "" Then
            MsgBox "Result = " & ResultStr, , "Test Result"
        Else
            MsgBox "String not found", , "Test Result"
        End If
    End Sub
    
    Function ExtractWord(ByVal AnyString As String, ByVal WordDelimiters As String, ByVal WordNumber As Long) As String
        Dim SA() As String
        Dim ResultWord As String
    
        NormalizeDelimiters AnyString, WordDelimiters    'reduce multiple word delimiters to a single delimiter
        AnyString = CollapseString(AnyString, WordDelimiters)
        SA = Split(AnyString, WordDelimiters)
        If WordNumber > 0 And WordNumber <= UBound(SA) + 1 Then
            ResultWord = SA(WordNumber - 1)
        Else
            ResultWord = vbNullString
        End If
        ExtractWord = ResultWord
    End Function
    
    
    Function WordCount(ByVal AnyString As String, ByVal WordDelimiters As String) As Long
        Dim SA() As String
    
        NormalizeDelimiters AnyString, WordDelimiters    'reduce multiple word delimiters to a single delimiter
        AnyString = CollapseString(AnyString, WordDelimiters)
        SA = Split(AnyString, WordDelimiters)
        WordCount = UBound(SA) + 1
    End Function
    
    
    'Replace multiple word delmiters with one word delimiter
    Sub NormalizeDelimiters(ByRef AnyString As String, ByRef WordDelimiters As String)
        Dim Delimiter As String
        Dim SLen As Long
        Dim I As Long
    
        SLen = Len(WordDelimiters)
        If SLen > 1 Then
            If InStr(WordDelimiters, Chr(30)) > 0 Then    'preference for space char
                Delimiter = Chr(30)
            Else
                Delimiter = Left(WordDelimiters, 1)
            End If
            For I = 1 To SLen
                AnyString = VBA.Replace(AnyString, Mid(WordDelimiters, I, 1), Delimiter)
            Next I
            WordDelimiters = Delimiter
        End If
    End Sub
    
    
    'Remove multiple delimiters in any string
    Function CollapseString(ByVal AnyString As String, ByVal Delimiter As String) As String
        Dim MultiDelims As String
        Dim DCnt As Long    'Delimiter count
        Dim SPos As Long
    
        For DCnt = 20 To 2 Step -2
            MultiDelims = String(DCnt, Delimiter)
            SPos = InStr(AnyString, MultiDelims)
            If SPos > 0 Then
                AnyString = Replace(AnyString, MultiDelims, Delimiter)
            End If
        Next
        CollapseString = AnyString
    End Function

  8. #8
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,873
    Quote Originally Posted by BrI View Post
    Another approach might be to find the number between key words as they will always be as below:

    "art.33 wer 50994121 Owner Propriétaire"
    In that case try:
    Sub test()
    Dim s As String
    s = "Form 4 Formule 4 CERTIFICATE OF OWNERSHIP CERTIFICAT ENREGISTRE Land Act,S.N. 1981, c. L-1.1, s.63 Loi sur l'enregistrement foncier, L.N.B. de 1981, chap. L-11, art.33 wer  50994121  Owner  Propriétaire"
    MsgBox GetMyNumber(s)
    End Sub
    
    Function GetMyNumber(s)
    GetMyNumber = Application.Trim(Split(Split(s, " wer ")(1), " Owner ")(0))
    End Function


    Quote Originally Posted by BrI View Post
    Or maybe Regex, I have been reading but am just not familiar with it yet.
    …another minefield!
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

  9. #9
    VBAX Regular
    Joined
    Apr 2017
    Posts
    66
    Location
    Really appreciate the responses - both are working.

    p45cal - Your function is very concise - which is great of course, and I get the general idea of how it works. But could you briefly explain what is happening, especially with respect to the indexes (1) and (0). Thanks
    Last edited by BrI; 04-26-2017 at 04:50 AM.

  10. #10
    At the risk of speaking for p45cal, the Split function returns an array of strings. (0) and (1) refer to specific strings in the zero-based array. Think of it like this:

       String1 = Split(s, " wer ")(1)   
       String2 = Split(String1, " Owner ")(0)  
       GetMyNumber = Application.Trim(String2)

  11. #11
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,724
    Location
    Quote Originally Posted by BrI View Post
    Thank you for that.

    I tried an approximation of the string I will be using (below) but got a blank message box. Trying to extract the "50994121"

    s = "Form 4 Formule 4 CERTIFICATE OF OWNERSHIP CERTIFICAT ENREGISTRE Land Act,S.N. 1981, c. L-1.1, s.63 Loi sur l'enregistrement foncier, L.N.B. de 1981, chap. L-11, art.33 wer  50994121  Owner  Propriétaire"

    You didn't say that the number string would be 'stand alone'

    The example string helps

    If that's always true, then something like this is simpler


    Option Explicit
    Sub test()
        Dim s As String
        
        s = "Form 4 Formule 4 CERTIFICATE OF OWNERSHIP CERTIFICAT ENREGISTRE Land Act,S.N. 1981, c. L-1.1, s.63 Loi sur l'enregistrement foncier, L.N.B. de 1981, chap. L-11, art.33 wer  50994121  Owner  Propriétaire"
        
        MsgBox ExtractNumbers(s)
         
    End Sub
     
     
    Function ExtractNumbers(s As String) As String '   could be a Long
        Dim v As Variant
        
        Dim i As Long
         
        v = Split(s, " ")
        
        For i = LBound(v) To UBound(v)
            If Len(v(i)) >= 7 And Len(v(i)) <= 12 And IsNumeric(v(i)) Then
                ExtractNumbers = v(i)
                Exit Function
            End If
        Next I
        
        ExtractNumbers = vbNullString
    End Function
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  12. #12
    VBAX Regular
    Joined
    Apr 2017
    Posts
    66
    Location
    Thanks to everyone, solved a number of ways - good to see the options. And explained as well -- great!!

  13. #13
    Administrator
    VP-Knowledge Base
    VBAX Grand Master mdmackillop's Avatar
    Joined
    May 2004
    Location
    Scotland
    Posts
    14,489
    Location
    Never answered with a regexp before. This should work if the string is a "word"
    Sub test()
    Dim strPattern As String
    strPattern = "\b[0-9]{7,12}\b"
    MsgBox RegxFunc([A1], strPattern)
    End Sub
    
    
    Function RegxFunc(strInput As String, regexPattern As String) As String
        Dim regEx As New RegExp
        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = regexPattern
        End With
    
    
        If regEx.test(strInput) Then
            Set matches = regEx.Execute(strInput)
            RegxFunc = matches(0).Value
        Else
            RegxFunc = "not matched"
        End If
    End Function
    MVP (Excel 2008-2010)

    Post a workbook with sample data and layout if you want a quicker solution.


    To help indent your macros try Smart Indent

    Please remember to mark threads 'Solved'

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •