PDA

View Full Version : [SOLVED:] Optimizing VBA code so that it can execute on a workbook of 50,000 rows w/o crashing



ravl13
07-18-2014, 09:37 AM
I have VBA code that I want to run on an excel spreadhsheet of about 50,000 rows. It crashes with a generic application error message when I run it on a sheet with 20,000 or more rows, but if I run it on the same sheet with only 10,000 rows (with the other rows deleted), then it runs sucessfully. So, I'm guessing I need to optimize the code in order to get it to run successfully on a larger worksheet.

The purpose of this macro is to go through every row, and delete all rows that do not have a government email. I'm sure there is a better way to identify the End of File than my patchwork loop condition, but I don't know the proper way to determine the end of a file.

Any critique on how this code could be made more efficient would be appreciated. As you can probably tell, from my code, I am extremely amateur.


'Note that Column K must be the Email column for this macro to work
'Additionally, Column E should be City, and Column B should be either FirstName or LastName, for the End of File to be properly detected


'Code for the CombineIntoHouseholds Excel Macro
Sub DeleteNonGovtPeople()




'Specify which sheet we are working with in the workbook
Sheets("Sheet1").Select

'Clean all cells of excess whitespace
For Each cell In ActiveSheet.UsedRange
cell.Value = Trim(cell)
Next cell






'Select the first entry of the "K" (Email) column. K1 is the column header, so K2 would be the first entry
Range("K2").Select

'This loop will continue until the selected cell (which would be in the email column) and the
'column E (which would be in the City column) and Column B (FirstName) are all blank.
'Additionally, the row after that must also be blank in those fields in order for the loop to stop
Do Until Selection.Value = "" And Selection.Offset(0, -6).Value = "" And Selection.Offset(0, -9).Value = "" And Selection.Offset(1, 0).Value = "" And Selection.Offset(1, -6).Value = ""

'Identify whether it's a government email
If Not Selection.Value Like "*.gov" And Not Selection.Value Like "*.mil" Then

'Delete the entire Row if the Person has a govt email
Selection.EntireRow.Delete


Else

'Move the currently selected cell down one row
Selection.Offset(1, 0).Select


End If

Loop




End Sub


And here is a screenshot of my sample spreadsheet that I am running the code on for coding purposes, for context:

11983

p45cal
07-18-2014, 11:24 AM
just as an experiment, see if the following code selects the right rows to delete - more or less (we still have to get the extent of the autofilter right). It works on the active sheet.
Sub blah()
With Range("K1:K50000") 'this is the line we'll have to tweak to get the right range for the autofilter.
.AutoFilter Field:=1, Criteria1:="<>*.mil", Operator:=xlAnd, Criteria2:="<>*.gov"
.Resize(.Rows.Count - 1).Offset(1).SpecialCells(xlCellTypeVisible).EntireRow.Select
.AutoFilter
End With
End Sub
If there's nothing below the table it could be:
Sub blah2()
lr = Cells(Rows.Count, "A").End(xlUp).Row' last row with something in column A
With Range("K1:K" & lr)
.AutoFilter Field:=1, Criteria1:="<>*.mil", Operator:=xlAnd, Criteria2:="<>*.gov"
.Resize(.Rows.Count - 1).Offset(1).SpecialCells(xlCellTypeVisible).EntireRow.Select 'replace .Select with .Delete if you're happy that the selected rows can be deleted.
.AutoFilter
End With
End Sub

and should 50000 rows be too heavy for .SpecialCells try (beware, this one deletes):
Sub blah3()
lr = Cells(Rows.Count, "A").End(xlUp).Row
With Range("K1:K" & lr)
.AutoFilter Field:=1, Criteria1:="<>*.mil", Operator:=xlAnd, Criteria2:="<>*.gov"
.Resize(.Rows.Count - 1).Offset(1).EntireRow.Delete
.AutoFilter
End With
End Sub



Your:
For Each cell In ActiveSheet.UsedRange
cell.Value = Trim(cell)
Next cellmight be replaceable with:

ActiveSheet.UsedRange.Value = Application.Trim(ActiveSheet.UsedRange.Value)or:
ActiveSheet.UsedRange.Value = Trim(ActiveSheet.UsedRange.Value)Application.Trim removes multiple adjacent spaces between words as well as trailing and leading spaces whereas missing out Application removes only leading and trailing spaces.

ravl13
07-19-2014, 03:26 PM
Thank you very much p45cal!

FYI, this code worked:

ActiveSheet.UsedRange.Value = Application.Trim(ActiveSheet.UsedRange.Value)

but your example with "Application" taken out gave a Type Mismatch Error when I tried to run it. The Application.Trim works fine though, so no real need to troubleshoot the error

Your first email filter code worked for the most part as well (I'll explain the minor problem in a minute), so I didn't try the others:

Sub blah()
With Range("K1:K50000") 'this is the line we'll have to tweak to get the right range for the autofilter.
.AutoFilter Field:=1, Criteria1:="<>*.mil", Operator:=xlAnd, Criteria2:="<>*.gov"
.Resize(.Rows.Count - 1).Offset(1).SpecialCells(xlCellTypeVisible).EntireRow.Select
.AutoFilter
End With
End Sub

So with p45cal's help, I have this little routine which can complete running in a couple minutes:


Sub DeleteNonGovtPeople()


ActiveSheet.UsedRange.Value = Application.Trim(ActiveSheet.UsedRange.Value)


With Range("K1:K50000") 'this is the line we'll have to tweak to get the right range for the autofilter.
.AutoFilter Field:=1, Criteria1:="<>*.mil", Operator:=xlAnd, Criteria2:="<>*.gov"
.Resize(.Rows.Count - 1).Offset(1).SpecialCells(xlCellTypeVisible).EntireRow.Select
.AutoFilter
End With


Selection.EntireRow.Delete




End Sub

The only problem with this function vs my original function is that this new function is no longer case-insensitive, whereas my original function using a LIKE comparison was case-insensitive. So, if there is a row that has an email ending with ".MIL" or ".Mil", this routine will delete that row, which I do not want to happen. '

Is there a way to make the filter criteria case-insensitive?

p45cal
07-19-2014, 04:31 PM
I've tried some of the routines here, including yours, and it is case insensitive - it keeps the .Mil, .MIL, .Gov, .gov etc.
What version of Excel are you using?
Do you have any Option statements at the top of the module?
What kind of module is the code in? Sheet code-module, ThisWorkbook code-module or a standard code-module?

Try starting with an empty virgin workbook, with no other workbook open, put the data and code in that, and see if it's still case sensitive.

jolivanes
07-19-2014, 11:09 PM
Put the following right at the top of your code window.
Option Compare Text.

ravl13
07-20-2014, 09:20 AM
I've tried some of the routines here, including yours, and it is case insensitive - it keeps the .Mil, .MIL, .Gov, .gov etc.
What version of Excel are you using?
Do you have any Option statements at the top of the module?
What kind of module is the code in? Sheet code-module, ThisWorkbook code-module or a standard code-module?

Try starting with an empty virgin workbook, with no other workbook open, put the data and code in that, and see if it's still case sensitive.

Oh Geez, I'm sorry I had it backwards. I reran the optimized version and compared it to my original code results, and yes the optimized code is case insensitive, while my original code was unfortunately case sensitive. My bad!

So, the routine does what it's supposed to now, yay. Thanks everyone

ravl13
07-20-2014, 09:21 AM
Thanks for the suggestion Joli, it was user error on my part

p45cal
07-20-2014, 10:10 AM
so I have this little routine which can complete running in a couple minutes:A couple of minutes still!? Try the blah3 from msg#2, I suspect it might be quicker (no selecting). I've also added a couple of lines to prevent it updating the screen while it's operating.
Sub blah3()
On Error GoTo there
Application.ScreenUpdating = False
lr = Cells(Rows.Count, "A").End(xlUp).Row
With Range("K1:K" & lr)
.AutoFilter Field:=1, Criteria1:="<>*.mil", Operator:=xlAnd, Criteria2:="<>*.gov"
.Resize(.Rows.Count - 1).Offset(1).EntireRow.Delete
.AutoFilter
End With
there:
Application.ScreenUpdating = True
End Sub

mancubus
07-20-2014, 01:11 PM
trim function consumes a lot of time.

so limiting the range to those cells that contain only text may speed up the code.



For Each Cll In ActiveSheet.UsedRange.SpecialCells(xlCellTypeConstants, xlTextValues)
Cll.Value = Trim(Cll.Value)
Next Cll



and... as p45cal recommended, i would also use filter/delete to delete rows based on condition.

ravl13
07-20-2014, 08:09 PM
trim function consumes a lot of time.

so limiting the range to those cells that contain only text may speed up the code.



For Each Cll In ActiveSheet.UsedRange.SpecialCells(xlCellTypeConstants, xlTextValues)
Cll.Value = Trim(Cll.Value)
Next Cll



and... as p45cal recommended, i would also use filter/delete to delete rows based on condition.

That code block causes excel to freeze for me, but I'm fine with the original trim function - it does what I need.

ravl13
07-20-2014, 08:13 PM
A couple of minutes still!? Try the blah3 from msg#2, I suspect it might be quicker (no selecting). I've also added a couple of lines to prevent it updating the screen while it's operating.
Sub blah3()
On Error GoTo there
Application.ScreenUpdating = False
lr = Cells(Rows.Count, "A").End(xlUp).Row
With Range("K1:K" & lr)
.AutoFilter Field:=1, Criteria1:="<>*.mil", Operator:=xlAnd, Criteria2:="<>*.gov"
.Resize(.Rows.Count - 1).Offset(1).EntireRow.Delete
.AutoFilter
End With
there:
Application.ScreenUpdating = True
End Sub


That performs about the same speed if I also include the Trimming part. Honestly though, I am absolutely content with the Sub I have - it's still FAR faster at under 2 minutes, compared to taking about 20 minutes when I finally got the original Sub to run on another computer for the full 50000 file.

(Also keep in mind I am using rather weak hardware - these are definitely not powerhouse machines)

Thanks everyone!

mancubus
07-21-2014, 12:38 AM
@ravl13

you dont need to quote the messages. :)

i tested the code against 220K rows X 10 cols of data. it took 13 minutes to trim all range.

but below code, which i lifted from ozgrid, completed the same task in 0,55 minutes (33 seconds)



Sub TrimRange2()
Dim calc As Long, t As Double

With Application
.ScreenUpdating = False
.DisplayAlerts = False
.EnableEvents = False
calc = .Calculation
.Calculation = xlCalculationManual
End With

t = Timer

'http://www.ozgrid.com/forum/showthread.php?t=171841
'Sub Jindon
With ActiveSheet.UsedRange
.Value = Evaluate("If(Row(" & .Address & "),Clean(Trim(" & .Address & ")))")
End With
'End Sub

MsgBox (Timer - t) / 60

With Application
.EnableEvents = True
.Calculation = calc
End With
End Sub