Consulting

Results 1 to 14 of 14

Thread: Looking to code a function to efficiently remove multiple commas in a string

  1. #1

    Looking to code a function to efficiently remove multiple commas in a string

    I've been experimenting with different ways to efficiently remove multiple commas from a text string. So far the best performers are these three:

    'Use VBA replace function, looping until all multiple commas are replaced
    Function CollapseCommas2(ByVal AnyString As String) As String
        Dim DDelim As String
        Dim S1 As String
        Dim Delimiter As String
    
    
        Delimiter = ","                                                       'in this case, commas
        DDelim = Delimiter & Delimiter                                        'double delimiters
        S1 = AnyString
        If InStr(AnyString, DDelim) > 0 Then
            Do
                S1 = VBA.Replace(S1, DDelim, Delimiter, 1, -1)
            Loop Until InStr(S1, DDelim) = 0
        End If
        CollapseCommas2 = S1
    End Function


    'Use excel Application.WorksheetFunction.Substitute, looping until all multiple commas are replaced
    Function CollapseCommas2b(ByVal AnyString As String) As String
        Dim DDelim As String
        Dim S1 As String
        Dim Delimiter As String
    
    
        Delimiter = ","                                                       'in this case, commas
        DDelim = Delimiter & Delimiter                                        'double delimiters
        S1 = AnyString
        If InStr(AnyString, DDelim) > 0 Then
            Do
                S1 = Application.WorksheetFunction.Substitute(S1, DDelim, Delimiter)
            Loop Until InStr(S1, DDelim) = 0
        End If
        CollapseCommas2b = S1
    End Function

    'Slight variation on Fn3, using Application.WorksheetFunction.Trim instead of Application.Trim
    Function CollapseCommas3a(ByVal AnyString As String) As String
        Dim DDelim As String
        Dim S1 As String
        Dim Delimiter As String
    
    
        Delimiter = ","                                                       'in this case, commas
        DDelim = Delimiter & Delimiter                                        'double delimiters
    
    
        S1 = AnyString
        S1 = VBA.Replace(S1, " ", Chr(248), 1, -1)                            'replace spaces with obscure character unlikely to be found in the string
        S1 = VBA.Replace(S1, Delimiter, " ", 1, -1)                           'replace commas with spaces
        S1 = Application.WorksheetFunction.Trim(S1)                           'this is fast and unlike vba.Trim, collapses internal spaces
        S1 = VBA.Replace(S1, " ", Delimiter, 1, -1)                           'restore the commas
        S1 = VBA.Replace(S1, Chr(248), " ", 1, -1)                            'restore the spaces
    
    
        CollapseCommas3a = S1
    End Function
    Is there another approach I'm overlooking? Other experiments attached.


    Thanks.
    Attached Files Attached Files

  2. #2
    VBAX Regular
    Joined
    Dec 2016
    Posts
    29
    Location
    This will work

    Sub abc()
     Const cDelimiter As String = ","
     Const cReplaceDelimiter As String = " "
     Dim cell As Range
     
     For Each cell In Range("d2", Cells(Rows.Count, "d").End(xlUp))
        Cells(cell.Row, 2).Value = Replace(WorksheetFunction.Trim(Replace(cell.Value, cDelimiter, " ")), cReplaceDelimiter, cDelimiter)
     Next
    End Sub

  3. #3
    Hi Mike. Thanks for taking a crack at it. I think my sample spreadsheet may have mislead you into thinking I wanted to loop through cells D3 to D8. Really, I was just using those cells to store different trial strings to paste into D2. Sorry for the confusion.


    Your sub takes a similar approach to the Function CollapseCommas3a I posted above, though it did eat the existing space in "thank you". When I extended it extended to preserve any existing spaces, it worked fine.
    Sub def()
        Const cDelimiter As String = ","
        Const cReplaceDelimiter As String = " "
        Dim cell As Range
    
    
        For Each cell In Range("d2", Cells(Rows.count, "d").End(xlUp))
            Cells(cell.Row, 2).Value = Replace(Replace(Application.WorksheetFunction.Trim(Replace(Replace(cell.Value, cReplaceDelimiter, Chr(248)), cDelimiter, cReplaceDelimiter)), cReplaceDelimiter, cDelimiter), Chr(248), " ")   'restore the spaces
        Next
    End Sub
    Attached Files Attached Files

  4. #4
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,875
    try:
    Function RegexpReplace1(ByVal parmString As String) As String
    '================================================
    'Use local static regexp object to remove duplicate spaces
    '================================================
    Static oRE As Object
    If oRE Is Nothing Then
      Set oRE = CreateObject("vbscript.regexp")
      oRE.Global = True
      oRE.Pattern = "(,)\1+"
    End If
    RegexpReplace1 = oRE.Replace(parmString, ",")
    End Function
    stolen and adapted from https://www.experts-exchange.com/art...te-Spaces.html
    Here, it seems to be about 4x or 5x faster than the fastest so far. Most of the time is saved by making oRE a static variable (it's 8 times slower than the fastest if the object has to be created at every run of the function), but it remains a regexp object until the project is reset for some reason, so using it as a worksheet function in multiple cells should still be quite fast.

    Early binding seems to speed it up a bit more, about 10% faster.
    Last edited by p45cal; 04-12-2017 at 01:35 PM.
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

  5. #5
    Wow that's a really speedy function. Thanks. The more repeating commas in the string, the greater the speed difference. The link is just as interesting. I guess I need to pay more attention to the regexp object. You know, I was doing some experimentation with the regexp object a couple of weeks ago as an alternate to instr for finding out if a substring existed in a string and it was really s l o w - but of course I was not declaring the object as static as you have done here. Live and learn. Thanks again.
    Attached Files Attached Files

  6. #6
    VBAX Regular
    Joined
    Dec 2016
    Posts
    29
    Location
    Quote Originally Posted by rlv View Post
    Wow that's a really speedy function. Thanks. The more repeating commas in the string, the greater the speed difference. The link is just as interesting. I guess I need to pay more attention to the regexp object. You know, I was doing some experimentation with the regexp object a couple of weeks ago as an alternate to instr for finding out if a substring existed in a string and it was really s l o w - but of course I was not declaring the object as static as you have done here. Live and learn. Thanks again.

    So what is wrong with the solution. I changed the code to only to look at D2. What an I missing?

    Sub abc()
        Const cDelimiter As String = ","
        Const cReplaceDelimiter As String = " "
        Dim cell As Range
         
        For Each cell In Range("d2", Cells(Rows.Count, "d").End(xlUp))
            Cells(cell.Row, 3).Value = Replace(WorksheetFunction.Trim(Replace(cell.Value, cDelimiter, " ")), cReplaceDelimiter, cDelimiter)
        Next
    End Sub

  7. #7
    Quote Originally Posted by mike7952 View Post
    So what is wrong with the solution.
    Nothing at all (well, except perhaps for overlooking the possibility that the input comma delimited string could have existing spaces in it which need to be preserved.). In my original post in this thread, I posted 3 functions I've tried using different approaches. The approach you took in sub abc to removing multiple commas

    1. Replace "," with " "
    2. Use WorksheetFunction.Trim to collapse multiple spaces
    3. Replace " ", with ","

    is essentially identical to my 3rd posted function, and if I repackage abc as a function
    Function abc(AnyString As String) As String
        Const cDelimiter As String = ","
        Const cReplaceDelimiter As String = " "
        'Dim cell As Range
    
        'For Each cell In Range("d2", Cells(Rows.count, "d").End(xlUp))
        abc = Replace(WorksheetFunction.Trim(Replace(AnyString, cDelimiter, " ")), cReplaceDelimiter, cDelimiter)
        'Next
    End Function
    and plug it in into my test sub, the execution speed of function abc is on par with that one. No faster and no slower.

  8. #8
    VBAX Regular
    Joined
    Dec 2016
    Posts
    29
    Location
    So do you have a working solution? If not I don't know what I'm missing as per your request.

  9. #9
    Yes I do have a solution. The function that p45cal provided is 2x to 4x faster than any of my functions.

  10. #10
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,645
    Why don't you

    Sub M_snb()
        For j = 20 To 1 Step -1
            Sheets(1).Cells.Replace String(j, ","), ","
        Next
    End Sub

  11. #11
    Seems like the RegexpReplace function still has the performance edge. I'm seeing a ~3x speed advantage (YMMV).


    M_snb Elapsed: 1.3789
    RegexpReplace1 Elapsed: 0.5117


    Sub M_snb_SpeedTest()
        Dim R As Range, T1 As Double
        Dim J As Long, RS As String
    
        Sheets(1).Range("d2").Copy
        Set R = Sheets(2).Range("A1").Resize(10000, 1)
        R.PasteSpecial (xlPasteValues)
    
        T1 = Timer
        For J = 20 To 1 Step -1
            Sheets(2).Cells.Replace String(J, ","), ","
        Next
    
        RS = "M_snb Elapsed: " & VBA.Format(Timer - T1, "0.0000")
        Debug.Print vbCr & RS
    
        Sheets(1).Range("d2").Copy
        R.PasteSpecial (xlPasteValues)
    
        T1 = Timer
        For Each R In Sheets(2).UsedRange
            R.Value = RegexpReplace1(R.Value)
        Next R
        RS = "RegexpReplace1 Elapsed: " & VBA.Format(Timer - T1, "0.0000")
        Debug.Print RS
    End Sub

  12. #12
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,645
    .8 seconds ? I woudn't bother.

  13. #13
    Quote Originally Posted by snb View Post
    .8 seconds ? I woudn't bother.
    I agree it's a judgment call. For me personally, using a sheet-level global replace has some downsides. That's not to say if it offered a huge speed advantage, I would not consider it. But to use that approach means all the strings need to be loaded into a worksheet somewhere, and that's not always feasible. It's relatively rare that I want to globally remove every double delimiter everywhere, and usually removing duplicates is a prelude to doing something else to the string - both of which tend to reduce the value of a global sheet-level replacement over a string-level replacement for me, since I'll have to loop through the cells at some point anyway. Also, a UDF is more flexible and more portable. I mainly program in the excel flavor of vba, but not exclusively, and having something that will transfer to a word or access VBA environment has value for me.

    Btw, I've been enjoying your 'VBA for smarties' web page. It's a great resource and I've found it hugely useful. Thank you for putting it together.

  14. #14
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,645
    another method

    sub M_snb()
       c00=cells(2,4)
       For j = 20 To 1 Step -1          
         c00= Replace(c00, String(j, ","), ",")
       Next
       cells(2,4)=c00
      End Sub

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •