Consulting

Results 1 to 13 of 13

Thread: Looking for fast VBA alpha sort

  1. #1

    Question Looking for fast VBA alpha sort

    I've got long, long lists of labels from which I want only the unique labels.

    Right now I'm sorting the initial list; setting duplicates to the value of 'z'; resorting; and then resetting my counter to the position of the last unique label. It's surely not the most efficient method of eliminating dups and sorting but it works very well for short lists.

    However, the 'bubble sort' I'm using (from Walkenbach's VBA book) does not work well for long, long lists. The other two faster sorts he offers handle numbers but not alpha strings.

    Anybody have a faster VBA sort for alphas? or a better way to eliminate the duplicates and sort the remaining unique labels? Thanks.


    'iLastRow is counter that represents the total number of unsorted entries
    'FYI, data starts in row 2, not row 1, hence the iLastRow - 1 for the counter
    ReDim list(0 To iLastRow - 1) As String
    'read spreadsheet column data into array
    For p = 0 To iLastRow - 1
        list(p) = ActiveCell.Offset(p, 0).value
    Next p
    'this is the first sort
    First = LBound(list)
    Last = UBound(list)
    For q = First To Last - 1
        For j = q + 1 To Last - 1
            If list(q) > list(j) Then
                Temp = list(j)
                list(j) = list(q)
                list(q) = Temp
            End If
        Next j
    Next q
    'Set duplicates to z
    For q = First To Last
        For j = q + 1 To Last - 1
            If list(q) = list(j) Then
                list(j) = "z"
            End If
        Next j
    Next q
    'Resort z's to bottom
    For q = First To Last - 1
        For j = q + 1 To Last - 1
            If list(q) > list(j) Then
                Temp = list(j)
                list(j) = list(q)
                list(q) = Temp
            End If
        Next j
    Next q
    'Reset the value of j for complete list WITHOUT DUPS
    For q = First To Last - 1
        If list(q) <> "z" Then j = q
    Next q
    'Goto the correct position to write out unique labels
    'Write out unique labels
    For q = 0 To j
        ActiveCell.Offset(q, 0) = list(q)
    Next q

  2. #2
    Site Admin
    Urban Myth
    VBAX Guru
    Joined
    May 2004
    Location
    Oregon, United States
    Posts
    4,940
    Location
    Hi Susan,

    Take a look at the VBA tags when posting. It'll really clean up your code, and make it look real pretty too! You can check them out here.

  3. #3
    Moderator VBAX Master Tommy's Avatar
    Joined
    May 2004
    Location
    Houston, TX
    Posts
    1,184
    Location
    This is a ShellSort I have translated from "FORTRAN programs for scientists and engineers" by Alan R. Miller

    This will take a array of strings (MaxLimit) and sort them smallest to largest.

    If you have any trouble let me know

    Public Sub ShellSort(MaxLimit() As String)
        Dim J As Integer
        Dim i As Integer
        Dim Jump As Integer
        Dim J2 As Integer
        Dim J3 As Integer
        Dim HldDis As String
        Jump = UBound(MaxLimit)
    10  Jump = Jump / 2
        If Jump > 0 Then
        J2 = UBound(MaxLimit) - Jump
        J = 0
        While J <= J2
            i = J
           20          J3 = i + Jump
            If MaxLimit(J3) <= MaxLimit(i) Then GoTo 30
            HldDis = MaxLimit(J3)
            MaxLimit(J3) = MaxLimit(i)
            MaxLimit(i) = HldDis
            i = i - Jump
            If i >= 0 Then GoTo 20
            30          J = J + 1
        Wend
        GoTo 10
        End If
    End Sub


    Happy Coding

  4. #4
    WOW. Thanks. Fabulous. It's perfect.

  5. #5
    Moderator VBAX Master Tommy's Avatar
    Joined
    May 2004
    Location
    Houston, TX
    Posts
    1,184
    Location
    Using the code you posted I came up with this. I tested on a sheet with 1108 rows of data. Enjoy


    ReDim List(0 To iLastRow - 1) As String
    'read spreadsheet column data into array
    For p = 0 To iLastRow - 1
        List(p) = ActiveCell.Offset(p, 0).Value
    Next p
    Last = UBound(List)
    Call ShellSort(List)
    q = 0
    J = 1
    While q <= Last
        If q = 0 Then
            ActiveCell.Offset(J, 0) = List(q)
            J = J + 1
        Else
            If List(q) <> List(q - 1) Then
                ActiveCell.Offset(J, 0) = List(q)
                J = J + 1
            End If
        End If
        q = q + 1
    Wend

  6. #6
    WOW. That's an incredibly handsome improvement on what I wrote...in this case, the sequel is far superior to the original.

    I took your ShellSort and incorporated it into my spreadsheet. I'm testing it now on a sample of 15,000 x about 10 different sort ranges. I may wait til the end of the day to rerun the bubble sort though.

  7. #7
    9,998 records x 17 different sorts ...
    9.8 BubbleSort minutes but only 8.0 ShellSort minutes, about an 18% time savings.

  8. #8
    VBAX Regular
    Joined
    May 2004
    Location
    London
    Posts
    8
    Location
    Have you tried filtering the data directly in the worksheet instead (Autofilter or Advanced filter)? Filtering out duplicates from 10,000 records takes seconds, not minutes. For example, this will remove all duplicate values from the data in column A.

    Sub main()
    Dim ws As Worksheet
    Dim rng As Range
    Dim rngToDelete As Range
    Set ws = Worksheets("Sheet1")
    'Advanced Filter requires a header row - let's add a temporary one
    ws.Rows(1).Insert
    ws.Cells(1, 1).Value = "temp header"
    Set rng = ws.Range("A1:A10000")
    rng.AdvancedFilter xlFilterInPlace, unique:=True
    Set rngToDelete = rng.SpecialCells(xlCellTypeVisible)
    ws.ShowAllData
    rngToDelete.EntireRow.Hidden = True
    rng.SpecialCells(xlCellTypeVisible).EntireRow.Delete
    rngToDelete.EntireRow.Hidden = False
    'remove the temporary row
    ws.Rows(1).Delete
    End Sub

  9. #9

    Lightbulb Autofilter is the close-to-perfect answer

    The autofilter approach works very well except....
    there's an autofilter bug known and acknowledged by microsoft for XL2000.

    http://support.microsoft.com/default...b;en-us;213886

    Autofilter is very fast but the bug messes up not only the sheet that I'm autofiltering on but other sheets as well. Not nice, that.

    I was picking up and copying whole columns to get the data I want but, if a user had been working on the data page, the columns sometimes included additional rows of "stray" equations. Going to go back and recode to copy only ranges of data, not whole columns.

    Other than the bug, autofilter was the perfect answer. Thanks.

  10. #10
    Final answer.

    The XL2000 autofilter bug is provoked for reasons that Microsoft hasn't explained in its support document.

    Copying the data rather than the column didn't correct the problem.

    Guessing based on M's support page, I'm turning Calculation to Manual, autofiltering, and then resetting Calculation to Automatic.

    So far, this seems to work but I haven't finished testing yet.

  11. #11
    BoardCoder
    Licensed Coder
    VBAX Expert mark007's Avatar
    Joined
    May 2004
    Location
    Leeds, UK
    Posts
    622
    Location
    For a discussion on sorting algorithms see:

    http://www.visualbasicforum.com/showthread.php?t=78889

    "Computers are useless. They can only give you answers." - Pablo Picasso
    Mark Rowlinson FIA | The Code Net

  12. #12
    VBAX Regular Mike_R's Avatar
    Joined
    May 2004
    Location
    Plainsboro, NJ
    Posts
    46
    Location
    Skulakowski, that page you show indicates merely that the StatusBar can be incorrect if your Filtered area has too many Worksheet Forumulas in it... (And sorting Ranges with formulas in it is not common, nor really recommended either...)

    You have found that AutoFilter can deliver incorrect results and/or mess up data on other worksheets?? This does not sound right... Anyone else ever experience anything like this?
    Try out the [VBA] tags!
    Option Explicit, don't leave home without it...


  13. #13
    Mike, Yup, it's strange.

    What actually happens is that rows get filtered out and Excel refuses to "show all" when asked. If Excel hadn't refused to "show all", I probably wouldn't have noticed. Rows go missing on other sheets, too.

    I've found that Microsoft's suggested work-around, turning off AutoRecalc and then manually undoing the filter, works so I'm preemptively doing that.

    I updated the macro to turn off the AutoRecalc, run the autofilter for unique records, count unique records and, if that is less than the total records, continue on to "show all" etc. However, if all records are unique records, I bypass the rest of the macro because there's nothing to delete and that's problematic, too.

    Susan

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •