PDA

View Full Version : Solved: Consolidating & Cleaning Data - Is there a better way to Do it?



Elvis
07-13-2009, 07:35 AM
I'm using the macros below to consolidate raw data into a new workbook and delete any rows where any columns have empty cells.

It's a two stage process:

1) The first macro copys and pastes the relevant data into a new workbook. Currently the data extends from row 4 to 8755, each named range represents a column. This macro runs relatively quickly.

2) Another macro is then calls another macro that deletes rows with empty cells in any given column. This seems to take a lot of time to run.

Am wondering if there is a more efficient faster way of doing this?

Thanks,

Elvis
----------------------------------------



Sub Consolidate_index_Data() 'collates raw data
Application.ScreenUpdating = False
Workbooks.Add
ActiveWorkbook.SaveAs Filename:= _
"C:\Documents and Settings\Test\Risk.xls", FileFormat:=xlNormal, _
Password:="", WriteResPassword:="", ReadOnlyRecommended:=False, _
CreateBackup:=False
'Dates
ThisWorkbook.Sheets("Table").Activate 'use activate when working between workbooks
Range("A1:a10000").Copy
Workbooks("riskindex.xls").Sheets("sheet1").Activate
Range("a1").Select
ActiveSheet.Paste

'next data series
ThisWorkbook.Sheets("Table").Activate
Range("swissfranc").Select
Selection.Copy
Workbooks("riskindex.xls").Sheets("Sheet1").Activate
Range("B1").Select
Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
:=False, Transpose:=False

'next
ThisWorkbook.Sheets("Table").Activate
Range("gold").Select
Selection.Copy
Workbooks("riskindex.xls").Sheets("Sheet1").Activate
Range("C1").Select
Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
:=False, Transpose:=False

'Next
ThisWorkbook.Sheets("Table").Activate
Range("DAAA").Select
Selection.Copy
Workbooks("riskindex.xls").Sheets("Sheet1").Activate
Range("D1").Select
Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
:=False, Transpose:=False

'ditto above for another 12 series pasted to column P, not shown here to save space
call delete_rows
End Sub



2nd Macro

Sub Delete_rows()
Dim calcmode As Long
Dim ViewMode As Long
Dim myStrings As Variant
Dim FoundCell As Range
Dim i As Long
Dim myRng As Range
Dim sh As Worksheet
With Application
calcmode = .Calculation
.Calculation = xlCalculationManual
.ScreenUpdating = False
End With
'We use the ActiveSheet but you can also use Sheets("MySheet")
Set sh = ActiveSheet
'We search in column A in this example
Set myRng = sh.Range("A:J")
'Add more search strings if you need
myStrings = Array("#N/A", "")

With sh
'We select the sheet so we can change the window view
.Select
'If you are in Page Break Preview Or Page Layout view go
'back to normal view, we do this for speed
ViewMode = ActiveWindow.View
ActiveWindow.View = xlNormalView
'Turn off Page Breaks, we do this for speed
.DisplayPageBreaks = False
'We will search the values in MyRng in this example
With myRng
For i = LBound(myStrings) To UBound(myStrings)
Do
Set FoundCell = myRng.Find(What:=myStrings(i), _
After:=.Cells(.Cells.Count), _
LookIn:=xlFormulas, _
LookAt:=xlWhole, _
SearchOrder:=xlByRows, _
SearchDirection:=xlNext, _
MatchCase:=False)
'Use xlPart If you want to search in a part of the FoundCell
'If you use LookIn:=xlValues it will also delete rows with a
'formula that evaluates to "Ron"
If FoundCell Is Nothing Then
Exit Do
Else
FoundCell.EntireRow.Delete
End If
Loop
Next i
End With
End With
ActiveWindow.View = ViewMode
With Application
.ScreenUpdating = True
.Calculation = calcmode
End With
End Sub

JimmyTheHand
07-13-2009, 10:22 AM
For the loop part of Macro2, I think this will do:

Dim col As Range

On Error Resume Next
For Each col In Range("A:J").Columns
col.Replace what:="#N/A", Replacement:="", LookAt:=xlWhole
col.SpecialCells(xlCellTypeBlanks).EntireRow.Delete
Next

JimmyTheHand
07-13-2009, 10:25 AM
For the loop part of Macro2, I think this will do:

Dim col As Range

On Error Resume Next
For Each col In Range("A:J").Columns
col.Replace what:="#N/A", Replacement:="", LookAt:=xlWhole
col.SpecialCells(xlCellTypeBlanks).EntireRow.Delete
Next

Elvis
07-13-2009, 11:06 AM
Thanks Jimmy,

That small bit of code works great for the deleting part. The code I was using I found on the internet which I adapted for my requirements. Just goes to show there's a lot rubbish inefficent code out there, better to utilise the knowledge of experts on a board such as this!

Regarding the first bit, currently I'm copying the entire column/all data but now it occurs to me that maybe I don't need all of it, just the column headings and say the last populated 2000 rows. But there's a couple of things.

a) Would copying just a subset of the data neccesarily result in faster execution? If not then I could just use a simple macro to delete rows that I don't need in the consolidated sheet.

b) My named ranges refer to the whole columns as opposed to a range in the column. Is it possible in VBA to reference only a subset of a named range (e.g for the last 2000 populated rows in column A bring back the equivalent 2000 rows of the named range/column X?)

Depending on the answers to A and B, it might just be easier to copy the entire columns and delete rows in the new workbook.

Hope I'm making sense!

Thanks all,

Elvis

JimmyTheHand
07-14-2009, 01:25 AM
Just goes to show there's a lot rubbish inefficent code out there
That's probably true, but I wouldn't, on first sight, disparage the code you posted. Finding empty cells is easy because of the SpecialCells method. However, finding other values is not so simple. In many cases you can't spare looping.


a) Would copying just a subset of the data neccesarily result in faster execution? I'm not 100% sure. There may be a slight difference in speed between copying one column and two columns at a time, but I think the difference is so small that it won't noticeably influence the overall speed.

Reducing the number of Copy operations is quite another matter. You can achieve great speed improvement if you're able, for example, to do the copying in one single step instead of looping.


b) My named ranges refer to the whole columns as opposed to a range in the column. Is it possible in VBA to reference only a subset of a named range [...]?Yes.
You can do a lot of things with range objects, they are extremely versatile. To demonstrate what I mean:
Imagine that I have a big data table. For some reason I want to select the first 10 cells in the 3rd row from the bottom of the data. The code would be:
Range("A" & Rows.Count).End(xlUp).Offset(-2).Resize(, 10).Select I recommend studying the following methods and properties:
Range.End
Range.Offset
Range.Resize
Intersect
Union


Depending on the answers to A and B, it might just be easier to copy the entire columns and delete rows in the new workbook. That's for you to decide. I can't tell you, because I don't know the structure of the source data. One thing I'm sure of: there's a lot of selecting in Macro1. Selecting ranges before copying or pasting makes sense only if you want to visually follow the code's execution. Otherwise, it's an absolute waste of time and resources.
Try the following code with and without the red line.
Sub test()
Dim c As Range, rng As Range, s As String
Set rng = Range("A1:A10000")
For Each c In rng.Cells
c.Select
s = c.Value
Next
End Sub

mdmackillop
07-14-2009, 10:43 AM
Hi Elvis,
When working between workbooks, create a variable for each and use these variables to qualify/refer to the ranges and sheets. That way you don't have to activate to copy paste/data or get values.

The start of your first code should be something like this (Always use Option Explicit)


Option Explicit
Sub Consolidate_index_Data() 'collates raw data
Dim wbNEW As Workbook
Dim wbEXISTS As Workbook

Application.ScreenUpdating = False
Set wbEXISTS = ActiveWorkbook
Set wbNEW = Workbooks.Add

wbNEW.SaveAs FileName:= _
"C:\Documents and Settings\Test\Risk.xls", FileFormat:=xlNormal, _
Password:="", WriteResPassword:="", ReadOnlyRecommended:=False, _
CreateBackup:=False
'Dates
wbEXISTS.Sheets("Table").Range("A1:a10000").Copy wbNEW.Sheets("sheet1").Range("a1")

Krishna Kumar
07-15-2009, 09:44 AM
Hi,

First create a backup of your file.

Not tested.

Option Compare Text
Sub Consolidate_index_Data() 'collates raw data
Dim aWB As Workbook
Dim aWS As Worksheet
Dim dWB As Workbook
Dim dWS As Worksheet
Dim i As Long, Nms

Application.ScreenUpdating = False

Nms = Array("swissfranc", "gold", "daaa")
Set aWB = ThisWorkbook
Set aWS = aWB.Sheets("Table")

Set dWB = Workbooks.Add

dWB.SaveAs Filename:= _
"C:\Documents and Settings\Test\Risk.xls", FileFormat:=xlNormal, _
Password:="", WriteResPassword:="", ReadOnlyRecommended:=False, _
CreateBackup:=False

Set dWS = dWB.Sheets(1)

aWS.Range("a1:a10000").Copy
dWS.Range("a1").PasteSpecial xlPasteValues
For i = 0 To UBound(Nms)
aWS.Range(Nms(i)).Copy
dWS.Cells(1, i + 2).PasteSpecial xlPasteValues
Next
On Error Resume Next
With dWS.Columns("a:j")
.SpecialCells(xlCellTypeConstants, 16).EntireRow.Delete
.SpecialCells(xlCellTypeFormulas, 16).EntireRow.Delete
.SpecialCells(xlCellTypeBlanks).EntireRow.Delete
End With
On Error GoTo 0
End Sub
HTH

Elvis
07-22-2009, 03:38 AM
Thanks for your suggestions.

FYI - Below is what I developed using the union method before I saw Krishna's and mdmackillop code. It keeps only a subset of the data that it copies over.

Of course Krishna's code which I have tested and works fine, is much more efficient!

Thanks all,

Elvis

------------------------------------

Sub Consolidate_index_Data() 'collates raw data
Application.ScreenUpdating = False
Workbooks.Add
ActiveWorkbook.SaveAs Filename:= _
"C:\Documents and Settings\Test\RiskIndex.xls", FileFormat:=xlNormal, _
Password:="", WriteResPassword:="", ReadOnlyRecommended:=False, _
CreateBackup:=False


'Create Copy of Key Variables
ThisWorkbook.Sheets("Table").Activate
Union(Range("dates"), Range("swissfranc"), Range("gold"), Range("DAAA"), Range("DBAA")).Copy


Workbooks("riskindex.xls").Sheets("sheet1").Activate
Range("A1").PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
:=False, Transpose:=False

'LAST 500 Values Only
x = Range("A20000").End(xlUp).Offset(-501, 0).Select
Range(ActiveCell, ActiveCell.End(xlUp)).Select
Selection.EntireRow.Delete
'call delete cells macro
End Sub