PDA

View Full Version : [SOLVED] Looking for fast VBA alpha sort



skulakowski
06-30-2004, 10:53 AM
I've got long, long lists of labels from which I want only the unique labels.

Right now I'm sorting the initial list; setting duplicates to the value of 'z'; resorting; and then resetting my counter to the position of the last unique label. It's surely not the most efficient method of eliminating dups and sorting but it works very well for short lists.

However, the 'bubble sort' I'm using (from Walkenbach's VBA book) does not work well for long, long lists. The other two faster sorts he offers handle numbers but not alpha strings.

Anybody have a faster VBA sort for alphas? or a better way to eliminate the duplicates and sort the remaining unique labels? Thanks.



'iLastRow is counter that represents the total number of unsorted entries
'FYI, data starts in row 2, not row 1, hence the iLastRow - 1 for the counter
ReDim list(0 To iLastRow - 1) As String
'read spreadsheet column data into array
For p = 0 To iLastRow - 1
list(p) = ActiveCell.Offset(p, 0).value
Next p
'this is the first sort
First = LBound(list)
Last = UBound(list)
For q = First To Last - 1
For j = q + 1 To Last - 1
If list(q) > list(j) Then
Temp = list(j)
list(j) = list(q)
list(q) = Temp
End If
Next j
Next q
'Set duplicates to z
For q = First To Last
For j = q + 1 To Last - 1
If list(q) = list(j) Then
list(j) = "z"
End If
Next j
Next q
'Resort z's to bottom
For q = First To Last - 1
For j = q + 1 To Last - 1
If list(q) > list(j) Then
Temp = list(j)
list(j) = list(q)
list(q) = Temp
End If
Next j
Next q
'Reset the value of j for complete list WITHOUT DUPS
For q = First To Last - 1
If list(q) <> "z" Then j = q
Next q
'Goto the correct position to write out unique labels
'Write out unique labels
For q = 0 To j
ActiveCell.Offset(q, 0) = list(q)
Next q

Zack Barresse
06-30-2004, 10:57 AM
Hi Susan,

Take a look at the VBA tags when posting. It'll really clean up your code, and make it look real pretty too! :) You can check them out here (http://www.vbaexpress.com/forum/faq.php?faq=vb_read_and_post#faq_vb_special_codes).

Tommy
06-30-2004, 11:55 AM
This is a ShellSort I have translated from "FORTRAN programs for scientists and engineers" by Alan R. Miller

This will take a array of strings (MaxLimit) and sort them smallest to largest.

If you have any trouble let me know :)


Public Sub ShellSort(MaxLimit() As String)
Dim J As Integer
Dim i As Integer
Dim Jump As Integer
Dim J2 As Integer
Dim J3 As Integer
Dim HldDis As String
Jump = UBound(MaxLimit)
10 Jump = Jump / 2
If Jump > 0 Then
J2 = UBound(MaxLimit) - Jump
J = 0
While J <= J2
i = J
20 J3 = i + Jump
If MaxLimit(J3) <= MaxLimit(i) Then GoTo 30
HldDis = MaxLimit(J3)
MaxLimit(J3) = MaxLimit(i)
MaxLimit(i) = HldDis
i = i - Jump
If i >= 0 Then GoTo 20
30 J = J + 1
Wend
GoTo 10
End If
End Sub



Happy Coding :type

skulakowski
06-30-2004, 12:20 PM
WOW. Thanks. Fabulous. It's perfect.

Tommy
06-30-2004, 01:50 PM
Using the code you posted I came up with this. I tested on a sheet with 1108 rows of data. Enjoy :)



ReDim List(0 To iLastRow - 1) As String
'read spreadsheet column data into array
For p = 0 To iLastRow - 1
List(p) = ActiveCell.Offset(p, 0).Value
Next p
Last = UBound(List)
Call ShellSort(List)
q = 0
J = 1
While q <= Last
If q = 0 Then
ActiveCell.Offset(J, 0) = List(q)
J = J + 1
Else
If List(q) <> List(q - 1) Then
ActiveCell.Offset(J, 0) = List(q)
J = J + 1
End If
End If
q = q + 1
Wend

skulakowski
06-30-2004, 02:07 PM
WOW. That's an incredibly handsome improvement on what I wrote...in this case, the sequel is far superior to the original.

I took your ShellSort and incorporated it into my spreadsheet. I'm testing it now on a sample of 15,000 x about 10 different sort ranges. I may wait til the end of the day to rerun the bubble sort though.

skulakowski
06-30-2004, 02:47 PM
9,998 records x 17 different sorts ...
9.8 BubbleSort minutes but only 8.0 ShellSort minutes, about an 18% time savings.

herilane
06-30-2004, 03:07 PM
Have you tried filtering the data directly in the worksheet instead (Autofilter or Advanced filter)? Filtering out duplicates from 10,000 records takes seconds, not minutes. For example, this will remove all duplicate values from the data in column A.


Sub main()
Dim ws As Worksheet
Dim rng As Range
Dim rngToDelete As Range
Set ws = Worksheets("Sheet1")
'Advanced Filter requires a header row - let's add a temporary one
ws.Rows(1).Insert
ws.Cells(1, 1).Value = "temp header"
Set rng = ws.Range("A1:A10000")
rng.AdvancedFilter xlFilterInPlace, unique:=True
Set rngToDelete = rng.SpecialCells(xlCellTypeVisible)
ws.ShowAllData
rngToDelete.EntireRow.Hidden = True
rng.SpecialCells(xlCellTypeVisible).EntireRow.Delete
rngToDelete.EntireRow.Hidden = False
'remove the temporary row
ws.Rows(1).Delete
End Sub

skulakowski
07-22-2004, 10:28 AM
The autofilter approach works very well except....
there's an autofilter bug known and acknowledged by microsoft for XL2000.

http://support.microsoft.com/default.aspx?scid=kb;en-us;213886

Autofilter is very fast but the bug messes up not only the sheet that I'm autofiltering on but other sheets as well. Not nice, that.

I was picking up and copying whole columns to get the data I want but, if a user had been working on the data page, the columns sometimes included additional rows of "stray" equations. Going to go back and recode to copy only ranges of data, not whole columns.

Other than the bug, autofilter was the perfect answer. Thanks.

skulakowski
07-22-2004, 03:14 PM
Final answer.

The XL2000 autofilter bug is provoked for reasons that Microsoft hasn't explained in its support document.

Copying the data rather than the column didn't correct the problem.

Guessing based on M's support page, I'm turning Calculation to Manual, autofiltering, and then resetting Calculation to Automatic.

So far, this seems to work but I haven't finished testing yet.

mark007
07-22-2004, 05:43 PM
For a discussion on sorting algorithms see:

http://www.visualbasicforum.com/showthread.php?t=78889

:)

Mike_R
07-23-2004, 11:37 AM
Skulakowski, that page you show indicates merely that the StatusBar can be incorrect if your Filtered area has too many Worksheet Forumulas in it... (And sorting Ranges with formulas in it is not common, nor really recommended either...)

You have found that AutoFilter can deliver incorrect results and/or mess up data on other worksheets?? This does not sound right... Anyone else ever experience anything like this?

skulakowski
07-23-2004, 02:04 PM
Mike, Yup, it's strange.

What actually happens is that rows get filtered out and Excel refuses to "show all" when asked. If Excel hadn't refused to "show all", I probably wouldn't have noticed. Rows go missing on other sheets, too.

I've found that Microsoft's suggested work-around, turning off AutoRecalc and then manually undoing the filter, works so I'm preemptively doing that.

I updated the macro to turn off the AutoRecalc, run the autofilter for unique records, count unique records and, if that is less than the total records, continue on to "show all" etc. However, if all records are unique records, I bypass the rest of the macro because there's nothing to delete and that's problematic, too.

Susan