Consulting

Results 1 to 13 of 13

Thread: Extract numbers from texts and do substraction

  1. #1

    Extract numbers from texts and do substraction

    Hi,

    If I'm given the following lines...


    18-19 Years,
    20-24 Years,
    25-29 Years,
    30-34 Years,
    35-39 Years,
    40-44 Years,
    45-49 Years,

    I want to let VBA to read the numbers only,
    18-19, 20-24,25-29....45-49
    find the mid-point for each group,
    that's 18.5, 22, 27, ..47.

    Any suggestions?
    Thanks in advance.

  2. #2
    VBAX Expert xCav8r's Avatar
    Joined
    May 2005
    Location
    Minneapolis, MN, USA
    Posts
    912
    Location
    There are a number of ways you can approach this. I'd divide it up into two parts. First, get the numbers out. Second, perform the calculation.

    When getting the numbers out, if the format in which you'll be given this information will never change, then you can easily use left and mid to get out the years. If you want to make something that will be more adaptable, you might consider looping through the characters in the string checking where numeric characters start and end.

    For the calculation, just use the Average function.

  3. #3
    VBAX Mentor Justinlabenne's Avatar
    Joined
    Jul 2004
    Location
    Clyde, Ohio
    Posts
    408
    Location
    Not sure how you wanted these number returned and if the format is the same every time:
    IE: 18-19 Years,

    but this assumes a lot, that your text is in column A, and it can be put into columns B,C,D

    Option Explicit
    
    Sub Extract()
        Dim x As Long
        x = Cells(Rows.Count, 1).End(xlUp).Row
        Dim r As Range
        For Each r In Range("A1:A" & x)
            With r.Offset(, 1)
                .Formula = Left$(r, 2)
                .Value = .Value
            End With
            With r.Offset(, 2)
                .Formula = Mid$(r, 4, 2)
                .Value = .Value
            End With
            With r.Offset(, 3)
                .Formula = WorksheetFunction.Average(r.Offset(, 1), r.Offset(, 2))
                .Value = .Value
            End With
        Next r
    End Sub
    Justin Labenne

  4. #4
    Knowledge Base Approver VBAX Expert brettdj's Avatar
    Joined
    May 2004
    Location
    Melbourne
    Posts
    649
    Location
    If your data set was long I'd use a variant array to perform the maniupulation and then dump to the next row. For now I've stuck with a simple For loop through the range

    This code also assumes that the data starts in A1 and it dumps the average to B. If the XX-XX pattern is not fully met then no average is returned


    Option Explicit
    
    Sub gETmID()
    Dim Myrange As Range, C As Range
    Dim RegEx As Object
    Set Myrange = Range([a1], [a65536].End(xlUp))
    Set RegEx = CreateObject("vbscript.regexp")
    With RegEx
    .Pattern = "(\d+)-(\d+).*"
    .Global = True
    End With
    For Each C In Myrange
    If RegEx.test(C) = True Then C.Offset(0, 1) = Application.Average(RegEx.Replace(C, "$1"), RegEx.Replace(C, "$2"))
    Next
    Set RegEx = Nothing
    End Sub
    Cheers

    Dave

  5. #5
    Hi,

    Thanks to all!

    I hv a little problem here..
    When I try following lines with dollar value,

    500,000-550,000,
    450,000-500,000,
    400,000-450,000,
    350,000-400,000,
    300,000-350,000,
    275,000-300,000,
    250,000-275,000,
    225,000-250,000,
    200,000-225,000,

    The result comes out as:
    500275
    450250
    400225
    350200
    300175
    275150
    250137.5
    225125
    200112.5

    How to solve it if the dollar value contains "," and I can't delete it?

    Also, when I deal with following lines,
    US 500000-550000,
    US 450000-500000,
    US 400000-450000,
    US 350000-400000,
    US 300000-350000,
    US 275000-300000,
    US 250000-275000,

    It doesn't work if I change this ..

    .Pattern = "*.(\d+)-(\d+).*"
    Please advise, thank you so much...

  6. #6
    VBAX Expert xCav8r's Avatar
    Joined
    May 2005
    Location
    Minneapolis, MN, USA
    Posts
    912
    Location
    Well, ya edited your previous post, so this is based on the last values. Stick this in a standard module, and then you can within a cell do something like this:

    = GetAverageFromString(A1)
    Option Explicit
     
    Function ExtractNumbersFromString(InputString As String) As Variant
    'Returns a single dimension array with all the numbers found in a string
    'does not support decimals
    Dim avarNumbers() As Variant
    Dim strThisCharacter As String
    Dim strNextCharacter As String
    Dim intNumericStart As Integer
    Dim intNumericEnd As Integer
    Dim intCharacterCounter As Integer
    Dim intStringLength As Integer
    Dim blnNumberFound As Boolean
    Dim intArraySize As Integer
    Dim sngCurrentNumber As Single
    intStringLength = Len(InputString)
    blnNumberFound = False
    intNumericStart = 0
    intNumericEnd = 0
    intArraySize = 0
    For intCharacterCounter = 1 To intStringLength
    Select Case intCharacterCounter
    Case 1
    strThisCharacter = Mid(InputString, intCharacterCounter, 1)
    strNextCharacter = IIf(intStringLength > 1, Mid(InputString, intCharacterCounter + 1, 1), "")
    Case intStringLength
    strThisCharacter = Mid(InputString, intCharacterCounter, 1)
    strNextCharacter = ""
    Case Else
    strThisCharacter = Mid(InputString, intCharacterCounter, 1)
    strNextCharacter = Mid(InputString, intCharacterCounter + 1, 1)
    End Select
    Select Case blnNumberFound
    Case True
    If IsNumeric(strNextCharacter) = False Then intNumericEnd = intCharacterCounter
    Case False
    If IsNumeric(strThisCharacter) = True Then
    intNumericStart = intCharacterCounter
    If IsNumeric(strNextCharacter) = False Then intNumericEnd = intCharacterCounter
    End If
    End Select
    If intNumericStart <> 0 And intNumericEnd <> 0 Then
    If intArraySize = 0 Then
    ReDim Preserve avarNumbers(intArraySize)
    sngCurrentNumber = Mid(InputString, intNumericStart, intNumericEnd - intNumericStart + 1)
    avarNumbers(intArraySize) = sngCurrentNumber
    intArraySize = intArraySize + 1
    Else
    ReDim Preserve avarNumbers(intArraySize)
    sngCurrentNumber = Mid(InputString, intNumericStart, intNumericEnd - intNumericStart + 1)
    avarNumbers(intArraySize) = sngCurrentNumber
    End If
    blnNumberFound = False
    intNumericStart = 0
    intNumericEnd = 0
    ElseIf intNumericStart = 0 And intNumericEnd = 0 Then
    blnNumberFound = False
    ElseIf intNumericStart <> 0 And intNumericEnd = 0 Then
    blnNumberFound = True
    End If
    Next intCharacterCounter
    ExtractNumbersFromString = avarNumbers
    End Function
     
    Function GetAverageFromString(InputString As String) As Single
    Dim avarNumbers As Variant
    Dim intArraySize As Integer
    Dim sngSum As Single
    Dim intCounter As Integer
    InputString = Replace(InputString, ",", "", 1, -1, vbTextCompare)
    avarNumbers = ExtractNumbersFromString(InputString)
    sngSum = 0
    intArraySize = UBound(avarNumbers)
    For intCounter = 0 To intArraySize
    sngSum = sngSum + avarNumbers(intCounter)
    Next intCounter
    GetAverageFromString = sngSum / (intArraySize + 1)
    End Function
     
    Sub test()
    Debug.Print GetAverageFromString("US 500000-550000,US 450000-500000,US" _
    & "400000-450000,US 350000-400000,US 300000-350000,US 275000-300000,US 250000-275000,")
    Debug.Print GetAverageFromString("500,000-550,000,450,000-500,000," _
    & " 400,000-450,000,350,000-400,000,300,000-350,000,275,000-300,000,250,000-275,000,225,000-250,000,200,000-225,000,")
    End Sub
    Results are...

    387500
    362500

  7. #7
    VBAX Expert xCav8r's Avatar
    Joined
    May 2005
    Location
    Minneapolis, MN, USA
    Posts
    912
    Location
    Actually, that's not gonna work since I scrub out all the commas. Can you get rid of the commas on your end? It's a problem that they're used both to delimit ### and numbers themselves. Without knowing the exact formatting, I don't know how to tell when a comma is being used for which purpose. If you give me all the formatting possibilities, then hopefully I can code for it. If, however, commas will be used to delineate three digits and to separate numbers (with no spacing or other consistent formatting), then I don't know what to do.

    Can you provide exhaustive examples of the types of strings that you need to deal with?

  8. #8
    Hi,

    I can get rid of the comma at the end of the string but have to keep the dollar signs..
    There are 2 formats most often use: With years/With $ amount

    For dollar values, two types 1)US 500000-550000 & 2) 500,000-550,000 are commonly used.

  9. #9
    Distinguished Lord of VBAX VBAX Grand Master Bob Phillips's Avatar
    Joined
    Apr 2005
    Posts
    25,453
    Location
    How about a worksheet formula

    =IF(LEFT(A1,2)="US",(MID(A1,FIND(" ",A1),FIND("-",A1)-FIND(" ",A1))+RIGHT(A1,LEN(A1)-FIND("-",A1))),(LEFT(A10,FIND("-",A10)-1)+RIGHT(A10,LEN(A10)-FIND("-",A10))))/2
    ____________________________________________
    Nihil simul inventum est et perfectum

    Abusus non tollit usum

    Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
    James Thurber

  10. #10
    VBAX Expert xCav8r's Avatar
    Joined
    May 2005
    Location
    Minneapolis, MN, USA
    Posts
    912
    Location
    Quote Originally Posted by Dreamer
    Hi,

    I can get rid of the comma at the end of the string but have to keep the dollar signs..
    There are 2 formats most often use: With years/With $ amount

    For dollar values, two types 1)US 500000-550000 & 2) 500,000-550,000 are commonly used.
    What I wrote will pull every number out of a string. If you've gotten rid of the commas at the end of the numbers, then it will work without problems (so I think).

    Here's how it works: If a number is touching another number, I assume it's a digit of larger number. If not, it's a single digit number. So if you plugged in the last line you wrote, it would spit out:
    • 1
    • 500000
    • 550000
    • 2
    • 500000
    • 550000
    Then it would average these numbers. Obviously, that wouldn't work because 1 and 2 aren't supposed to be in there. I can filter things like that out, but only when I know that anything numeric that's touching a ) should be disregarded. I just use that as an example. As it stands now, I don't know all the ways in which numbers might be in your strings, so I don't know how to filter out undesired numbers.

  11. #11
    Hi,

    Thanks a lot! So impressive..
    I think it really helps me a lot... Thank you

  12. #12
    Distinguished Lord of VBAX VBAX Grand Master Bob Phillips's Avatar
    Joined
    Apr 2005
    Posts
    25,453
    Location
    Quote Originally Posted by Dreamer
    Hi,

    Thanks a lot! So impressive..
    I think it really helps me a lot... Thank you
    which one?
    ____________________________________________
    Nihil simul inventum est et perfectum

    Abusus non tollit usum

    Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
    James Thurber

  13. #13
    VBAX Expert xCav8r's Avatar
    Joined
    May 2005
    Location
    Minneapolis, MN, USA
    Posts
    912
    Location
    Good question, xld. Dreamer, for people searching this forum with a similar problem, it would be helpful to know which solution you implemented. I assume it was mine. Muhahahaa!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •