Consulting

Results 1 to 8 of 8

Thread: Solved: Consolidating & Cleaning Data - Is there a better way to Do it?

  1. #1

    Solved: Consolidating & Cleaning Data - Is there a better way to Do it?

    I'm using the macros below to consolidate raw data into a new workbook and delete any rows where any columns have empty cells.

    It's a two stage process:

    1) The first macro copys and pastes the relevant data into a new workbook. Currently the data extends from row 4 to 8755, each named range represents a column. This macro runs relatively quickly.

    2) Another macro is then calls another macro that deletes rows with empty cells in any given column. This seems to take a lot of time to run.

    Am wondering if there is a more efficient faster way of doing this?

    Thanks,

    Elvis
    ----------------------------------------

    [VBA]

    Sub Consolidate_index_Data() 'collates raw data
    Application.ScreenUpdating = False
    Workbooks.Add
    ActiveWorkbook.SaveAs Filename:= _
    "C:\Documents and Settings\Test\Risk.xls", FileFormat:=xlNormal, _
    Password:="", WriteResPassword:="", ReadOnlyRecommended:=False, _
    CreateBackup:=False
    'Dates
    ThisWorkbook.Sheets("Table").Activate 'use activate when working between workbooks
    Range("A1:a10000").Copy
    Workbooks("riskindex.xls").Sheets("sheet1").Activate
    Range("a1").Select
    ActiveSheet.Paste

    'next data series
    ThisWorkbook.Sheets("Table").Activate
    Range("swissfranc").Select
    Selection.Copy
    Workbooks("riskindex.xls").Sheets("Sheet1").Activate
    Range("B1").Select
    Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
    :=False, Transpose:=False

    'next
    ThisWorkbook.Sheets("Table").Activate
    Range("gold").Select
    Selection.Copy
    Workbooks("riskindex.xls").Sheets("Sheet1").Activate
    Range("C1").Select
    Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
    :=False, Transpose:=False

    'Next
    ThisWorkbook.Sheets("Table").Activate
    Range("DAAA").Select
    Selection.Copy
    Workbooks("riskindex.xls").Sheets("Sheet1").Activate
    Range("D1").Select
    Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
    :=False, Transpose:=False

    'ditto above for another 12 series pasted to column P, not shown here to save space
    call delete_rows
    End Sub
    [/VBA]


    2nd Macro
    [VBA]
    Sub Delete_rows()
    Dim calcmode As Long
    Dim ViewMode As Long
    Dim myStrings As Variant
    Dim FoundCell As Range
    Dim i As Long
    Dim myRng As Range
    Dim sh As Worksheet
    With Application
    calcmode = .Calculation
    .Calculation = xlCalculationManual
    .ScreenUpdating = False
    End With
    'We use the ActiveSheet but you can also use Sheets("MySheet")
    Set sh = ActiveSheet
    'We search in column A in this example
    Set myRng = sh.Range("A:J")
    'Add more search strings if you need
    myStrings = Array("#N/A", "")

    With sh
    'We select the sheet so we can change the window view
    .Select
    'If you are in Page Break Preview Or Page Layout view go
    'back to normal view, we do this for speed
    ViewMode = ActiveWindow.View
    ActiveWindow.View = xlNormalView
    'Turn off Page Breaks, we do this for speed
    .DisplayPageBreaks = False
    'We will search the values in MyRng in this example
    With myRng
    For i = LBound(myStrings) To UBound(myStrings)
    Do
    Set FoundCell = myRng.Find(What:=myStrings(i), _
    After:=.Cells(.Cells.Count), _
    LookIn:=xlFormulas, _
    LookAt:=xlWhole, _
    SearchOrder:=xlByRows, _
    SearchDirection:=xlNext, _
    MatchCase:=False)
    'Use xlPart If you want to search in a part of the FoundCell
    'If you use LookIn:=xlValues it will also delete rows with a
    'formula that evaluates to "Ron"
    If FoundCell Is Nothing Then
    Exit Do
    Else
    FoundCell.EntireRow.Delete
    End If
    Loop
    Next i
    End With
    End With
    ActiveWindow.View = ViewMode
    With Application
    .ScreenUpdating = True
    .Calculation = calcmode
    End With
    End Sub

    [/VBA]

  2. #2
    For the loop part of Macro2, I think this will do:

    [vba] Dim col As Range

    On Error Resume Next
    For Each col In Range("A:J").Columns
    col.Replace what:="#N/A", Replacement:="", LookAt:=xlWhole
    col.SpecialCells(xlCellTypeBlanks).EntireRow.Delete
    Next[/vba]
    -------------------------------------------------
    The more details you give, the easier it is to understand your question. Don't save the effort, tell us twice rather than not at all. The amount of info you give strongly influences the quality of answer, and also how fast you get it.

  3. #3
    For the loop part of Macro2, I think this will do:

    [vba] Dim col As Range

    On Error Resume Next
    For Each col In Range("A:J").Columns
    col.Replace what:="#N/A", Replacement:="", LookAt:=xlWhole
    col.SpecialCells(xlCellTypeBlanks).EntireRow.Delete
    Next[/vba]
    -------------------------------------------------
    The more details you give, the easier it is to understand your question. Don't save the effort, tell us twice rather than not at all. The amount of info you give strongly influences the quality of answer, and also how fast you get it.

  4. #4
    Thanks Jimmy,

    That small bit of code works great for the deleting part. The code I was using I found on the internet which I adapted for my requirements. Just goes to show there's a lot rubbish inefficent code out there, better to utilise the knowledge of experts on a board such as this!

    Regarding the first bit, currently I'm copying the entire column/all data but now it occurs to me that maybe I don't need all of it, just the column headings and say the last populated 2000 rows. But there's a couple of things.

    a) Would copying just a subset of the data neccesarily result in faster execution? If not then I could just use a simple macro to delete rows that I don't need in the consolidated sheet.

    b) My named ranges refer to the whole columns as opposed to a range in the column. Is it possible in VBA to reference only a subset of a named range (e.g for the last 2000 populated rows in column A bring back the equivalent 2000 rows of the named range/column X?)

    Depending on the answers to A and B, it might just be easier to copy the entire columns and delete rows in the new workbook.

    Hope I'm making sense!

    Thanks all,

    Elvis

  5. #5
    Just goes to show there's a lot rubbish inefficent code out there
    That's probably true, but I wouldn't, on first sight, disparage the code you posted. Finding empty cells is easy because of the SpecialCells method. However, finding other values is not so simple. In many cases you can't spare looping.

    a) Would copying just a subset of the data neccesarily result in faster execution?
    I'm not 100% sure. There may be a slight difference in speed between copying one column and two columns at a time, but I think the difference is so small that it won't noticeably influence the overall speed.

    Reducing the number of Copy operations is quite another matter. You can achieve great speed improvement if you're able, for example, to do the copying in one single step instead of looping.

    b) My named ranges refer to the whole columns as opposed to a range in the column. Is it possible in VBA to reference only a subset of a named range [...]?
    Yes.
    You can do a lot of things with range objects, they are extremely versatile. To demonstrate what I mean:
    Imagine that I have a big data table. For some reason I want to select the first 10 cells in the 3rd row from the bottom of the data. The code would be:
    [vba]Range("A" & Rows.Count).End(xlUp).Offset(-2).Resize(, 10).Select[/vba] I recommend studying the following methods and properties:
    Range.End
    Range.Offset
    Range.Resize
    Intersect
    Union


    Depending on the answers to A and B, it might just be easier to copy the entire columns and delete rows in the new workbook.
    That's for you to decide. I can't tell you, because I don't know the structure of the source data. One thing I'm sure of: there's a lot of selecting in Macro1. Selecting ranges before copying or pasting makes sense only if you want to visually follow the code's execution. Otherwise, it's an absolute waste of time and resources.
    Try the following code with and without the red line.
    [vba]Sub test()
    Dim c As Range, rng As Range, s As String
    Set rng = Range("A1:A10000")
    For Each c In rng.Cells
    c.Select
    s = c.Value
    Next
    End Sub[/vba]
    Last edited by JimmyTheHand; 07-14-2009 at 03:37 AM.
    -------------------------------------------------
    The more details you give, the easier it is to understand your question. Don't save the effort, tell us twice rather than not at all. The amount of info you give strongly influences the quality of answer, and also how fast you get it.

  6. #6
    Administrator
    VP-Knowledge Base
    VBAX Grand Master mdmackillop's Avatar
    Joined
    May 2004
    Location
    Scotland
    Posts
    14,489
    Location
    Hi Elvis,
    When working between workbooks, create a variable for each and use these variables to qualify/refer to the ranges and sheets. That way you don't have to activate to copy paste/data or get values.

    The start of your first code should be something like this (Always use Option Explicit)

    [vba]
    Option Explicit
    Sub Consolidate_index_Data() 'collates raw data
    Dim wbNEW As Workbook
    Dim wbEXISTS As Workbook

    Application.ScreenUpdating = False
    Set wbEXISTS = ActiveWorkbook
    Set wbNEW = Workbooks.Add

    wbNEW.SaveAs FileName:= _
    "C:\Documents and Settings\Test\Risk.xls", FileFormat:=xlNormal, _
    Password:="", WriteResPassword:="", ReadOnlyRecommended:=False, _
    CreateBackup:=False
    'Dates
    wbEXISTS.Sheets("Table").Range("A1:a10000").Copy wbNEW.Sheets("sheet1").Range("a1")

    [/vba]
    MVP (Excel 2008-2010)

    Post a workbook with sample data and layout if you want a quicker solution.


    To help indent your macros try Smart Indent

    Please remember to mark threads 'Solved'

  7. #7
    VBAX Contributor
    Joined
    Jul 2004
    Location
    Gurgaon, India
    Posts
    148
    Location
    Hi,

    First create a backup of your file.

    Not tested.

    [vba]Option Compare Text
    Sub Consolidate_index_Data() 'collates raw data
    Dim aWB As Workbook
    Dim aWS As Worksheet
    Dim dWB As Workbook
    Dim dWS As Worksheet
    Dim i As Long, Nms

    Application.ScreenUpdating = False

    Nms = Array("swissfranc", "gold", "daaa")
    Set aWB = ThisWorkbook
    Set aWS = aWB.Sheets("Table")

    Set dWB = Workbooks.Add

    dWB.SaveAs Filename:= _
    "C:\Documents and Settings\Test\Risk.xls", FileFormat:=xlNormal, _
    Password:="", WriteResPassword:="", ReadOnlyRecommended:=False, _
    CreateBackup:=False

    Set dWS = dWB.Sheets(1)

    aWS.Range("a1:a10000").Copy
    dWS.Range("a1").PasteSpecial xlPasteValues
    For i = 0 To UBound(Nms)
    aWS.Range(Nms(i)).Copy
    dWS.Cells(1, i + 2).PasteSpecial xlPasteValues
    Next
    On Error Resume Next
    With dWS.Columns("a:j")
    .SpecialCells(xlCellTypeConstants, 16).EntireRow.Delete
    .SpecialCells(xlCellTypeFormulas, 16).EntireRow.Delete
    .SpecialCells(xlCellTypeBlanks).EntireRow.Delete
    End With
    On Error GoTo 0
    End Sub[/vba]
    HTH
    Last edited by Krishna Kumar; 07-15-2009 at 10:29 AM.

  8. #8
    Thanks for your suggestions.

    FYI - Below is what I developed using the union method before I saw Krishna's and mdmackillop code. It keeps only a subset of the data that it copies over.

    Of course Krishna's code which I have tested and works fine, is much more efficient!

    Thanks all,

    Elvis

    ------------------------------------
    [VBA]
    Sub Consolidate_index_Data() 'collates raw data
    Application.ScreenUpdating = False
    Workbooks.Add
    ActiveWorkbook.SaveAs Filename:= _
    "C:\Documents and Settings\Test\RiskIndex.xls", FileFormat:=xlNormal, _
    Password:="", WriteResPassword:="", ReadOnlyRecommended:=False, _
    CreateBackup:=False


    'Create Copy of Key Variables
    ThisWorkbook.Sheets("Table").Activate
    Union(Range("dates"), Range("swissfranc"), Range("gold"), Range("DAAA"), Range("DBAA")).Copy


    Workbooks("riskindex.xls").Sheets("sheet1").Activate
    Range("A1").PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
    :=False, Transpose:=False

    'LAST 500 Values Only
    x = Range("A20000").End(xlUp).Offset(-501, 0).Select
    Range(ActiveCell, ActiveCell.End(xlUp)).Select
    Selection.EntireRow.Delete
    'call delete cells macro
    End Sub
    [/VBA]

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •