Consulting

Results 1 to 12 of 12

Thread: Solved: Duplicates removals!

  1. #1
    VBAX Regular
    Joined
    Dec 2006
    Posts
    20
    Location

    Question Solved: Duplicates removals!

    Hiya,

    Not even sure if this is possible, Basically I have a spreadsheet with 48 columns, and over 30000 rows.

    The rows of data do have duplicates in them but some of the duplicate rows have more information then the others. I want to get rid of the rows with the least information.

    So basically is it possible to make a macro that finds duplicates and compares them, deleting the row with the least information?
    I have included a small sample spreadsheet same heading etc but no real data (and only 5 rows). The column that it needs to check for similarities is "Q" named "REG NR"

    Please let me know if this is possible and how I would go about doing something like this?

    Thanks guys.

  2. #2
    Moderator VBAX Guru Ken Puls's Avatar
    Joined
    Aug 2004
    Location
    Nanaimo, BC, Canada
    Posts
    4,001
    Location
    I have not looked at your data, but...

    What I tend to do when I need to delete duplicates is use a variation of this routine from my site.

    It works like this:

    • A column of index numbers is created after the last column of information in the spreadsheet. This is to preserve the original order.
    • The data is then sorted based on the specified column, with the secondary sort key being the original order. (This ensures that the first instance of a record will be maintained.)
    • A column of formulas is inserted at the end of the sheet, comparing the Target Column's data for that row against the previous row. If it does not match, it is a new record, if it does match, then the record is a duplicate.
    • Using the autofilter, all data that has been flagged as a duplicate is deleted.
    • The data is sorted back into the original order and the "helper columns" are deleted
    The key for you would be to make that second formula robust enough to check if the row is a duplicate, and if so, is is the shortest of the bunch. (And then, of course, implement that formula in the VBE.)

    Does that make sense?
    Ken Puls, CMA - Microsoft MVP (Excel)
    I hate it when my computer does what I tell it to, and not what I want it to.

    Learn how to use our KB tags! -||- Ken's Excel Website -||- Ken's Excel Forums -||- My Blog -||- Excel Training Calendar

    This is a shameless plug for my new book "RibbonX - Customizing the Office 2007 Ribbon". Find out more about it here!

    Help keep VBAX clean! Use the 'Thread Tools' menu to mark your own threads solved!





  3. #3
    Administrator
    VP-Knowledge Base
    VBAX Grand Master mdmackillop's Avatar
    Joined
    May 2004
    Location
    Scotland
    Posts
    14,489
    Location
    Hi James
    Here's some code to try. Tried to minimise looping 30000 rows, but couldn't avoid it completely!
    [vba]Sub DelDups()
    Dim Rw As Long, Col As Range
    Application.ScreenUpdating = False
    Rw = Cells(Rows.Count, 1).End(xlUp).Row
    Set Col = Range("AX2:AX" & Rw)

    'Titles
    [AX1] = "Index"
    [AY1] = "Registration"
    [AZ1] = "Count"


    'Create index
    Col.Formula = "=ROW()"
    Col.Copy
    Col.PasteSpecial xlPasteValues

    'Copy registration
    Set Col = Col.Offset(, 1)
    Col.FormulaR1C1 = "=RC[-34]"
    Col.Copy
    Col.PasteSpecial xlPasteValues

    'Count data cells
    Set Col = Col.Offset(, 1)
    Col.FormulaR1C1 = "=COUNTA(RC[-51]:RC[-4])"
    Col.Copy
    Col.PasteSpecial xlPasteValues

    'Sort Data
    Columns("A:AZ").Select
    Selection.Sort Key1:=Range("AY2"), Order1:=xlAscending, Key2:=Range("AZ2" _
    ), Order2:=xlDescending, Header:=xlYes, OrderCustom:=1, MatchCase:=False _
    , Orientation:=xlTopToBottom, DataOption1:=xlSortNormal, DataOption2:= _
    xlSortNormal
    [AX1].Select

    'Clear dup. count cells
    For i = 2 To Rw
    If Range("AY" & i) = Range("AY" & i - 1) Then Range("AZ" & i).ClearContents
    Next

    'Filter for blanks
    With Columns("A:AZ")
    .AutoFilter Field:=52, Criteria1:="="
    Rows("2:" & Rw).ClearContents
    .AutoFilter
    End With
    'Restore order
    Columns("A:AZ").Sort Key1:=Range("AX2"), Order1:=xlAscending, Header:=xlYes, _
    OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, _
    DataOption1:=xlSortNormal

    [AX1].Select
    Application.ScreenUpdating = True
    End Sub
    [/vba]
    MVP (Excel 2008-2010)

    Post a workbook with sample data and layout if you want a quicker solution.


    To help indent your macros try Smart Indent

    Please remember to mark threads 'Solved'

  4. #4
    VBAX Regular
    Joined
    Dec 2006
    Posts
    20
    Location
    Stops with 1004 error

    "aplication-defined or object-defined error" debug highlights this code:

    [VBA] Selection.Sort Key1:=Range("AY2"), Order1:=xlAscending, Key2:=Range("AZ2" _
    ), Order2:=xlDescending, Header:=xlYes, OrderCustom:=1, MatchCase:=False _
    , Orientation:=xlTopToBottom, DataOption1:=xlSortNormal, DataOption2:= _
    xlSortNormal[/VBA]

    Does this on both the one i downloaded from your post and the real sheet.

  5. #5
    Moderator VBAX Guru Ken Puls's Avatar
    Joined
    Aug 2004
    Location
    Nanaimo, BC, Canada
    Posts
    4,001
    Location
    HI James,

    What version of Excel are you running? Maybe try trimming out some of the (most likely) unnecessary arguments:
    [vba]
    Selection.Sort Key1:=Range("AY2"), Order1:=xlAscending, Key2:=Range("AZ2"), _
    Order2:=xlDescending, Header:=xlYes, Orientation:=xlTopToBottom[/vba]
    Ken Puls, CMA - Microsoft MVP (Excel)
    I hate it when my computer does what I tell it to, and not what I want it to.

    Learn how to use our KB tags! -||- Ken's Excel Website -||- Ken's Excel Forums -||- My Blog -||- Excel Training Calendar

    This is a shameless plug for my new book "RibbonX - Customizing the Office 2007 Ribbon". Find out more about it here!

    Help keep VBAX clean! Use the 'Thread Tools' menu to mark your own threads solved!





  6. #6
    Administrator
    VP-Knowledge Base VBAX Grand Master mdmackillop's Avatar
    Joined
    May 2004
    Location
    Scotland
    Posts
    14,489
    Location
    Sorry James,
    I usually remember to delete these options which cause problems in pre 2003 Excel
    MVP (Excel 2008-2010)

    Post a workbook with sample data and layout if you want a quicker solution.


    To help indent your macros try Smart Indent

    Please remember to mark threads 'Solved'

  7. #7
    VBAX Regular
    Joined
    Dec 2006
    Posts
    20
    Location
    yer im using 2000 bit rubish i know!

    Can you repost the code with the bits taken out. Think i may be trying to cut out too much as not work at all now! :S

  8. #8
    Administrator
    VP-Knowledge Base VBAX Grand Master mdmackillop's Avatar
    Joined
    May 2004
    Location
    Scotland
    Posts
    14,489
    Location
    Modified for Office 2000
    [vba]Sub DelDups()
    Dim Rw As Long, Col As Range
    Application.ScreenUpdating = False
    Rw = Cells(Rows.Count, 1).End(xlUp).Row
    Set Col = Range("AX2:AX" & Rw)

    'Titles
    [AX1] = "Index"
    [AY1] = "Registration"
    [AZ1] = "Count"


    'Create index
    Col.Formula = "=ROW()"
    Col.Copy
    Col.PasteSpecial xlPasteValues

    'Copy registration
    Set Col = Col.Offset(, 1)
    Col.FormulaR1C1 = "=RC[-34]"
    Col.Copy
    Col.PasteSpecial xlPasteValues

    'Count data cells
    Set Col = Col.Offset(, 1)
    Col.FormulaR1C1 = "=COUNTA(RC[-51]:RC[-4])"
    Col.Copy
    Col.PasteSpecial xlPasteValues

    'Sort Data
    Columns("A:AZ").Select
    Selection.Sort Key1:=Range("AY2"), Order1:=xlAscending, Key2:=Range("AZ2" _
    ), Order2:=xlDescending, Header:=xlYes, OrderCustom:=1, MatchCase:=False _
    , Orientation:=xlTopToBottom
    [AX1].Select

    'Clear dup. count cells
    For i = 2 To Rw
    If Range("AY" & i) = Range("AY" & i - 1) Then Range("AZ" & i).ClearContents
    Next

    'Filter for blanks
    With Columns("A:AZ")
    .AutoFilter Field:=52, Criteria1:="="
    Rows("2:" & Rw).ClearContents
    .AutoFilter
    End With
    'Restore order
    Columns("A:AZ").Sort Key1:=Range("AX2"), Order1:=xlAscending, Header:=xlYes, _
    OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom

    [AX1].Select
    Application.ScreenUpdating = True
    End Sub[/vba]
    MVP (Excel 2008-2010)

    Post a workbook with sample data and layout if you want a quicker solution.


    To help indent your macros try Smart Indent

    Please remember to mark threads 'Solved'

  9. #9
    VBAX Regular
    Joined
    Dec 2006
    Posts
    20
    Location
    Thanks guys, Awsome

  10. #10
    Administrator
    VP-Knowledge Base VBAX Grand Master mdmackillop's Avatar
    Joined
    May 2004
    Location
    Scotland
    Posts
    14,489
    Location
    How long did it run?
    MVP (Excel 2008-2010)

    Post a workbook with sample data and layout if you want a quicker solution.


    To help indent your macros try Smart Indent

    Please remember to mark threads 'Solved'

  11. #11
    VBAX Regular
    Joined
    Dec 2006
    Posts
    20
    Location
    took 60-70seconds! very good considering the age of the laptop work have given me.

  12. #12
    Moderator VBAX Guru Ken Puls's Avatar
    Joined
    Aug 2004
    Location
    Nanaimo, BC, Canada
    Posts
    4,001
    Location
    For 30,000 rows? Wow, that's pretty good, actually. Nice work, Malcolm!
    Ken Puls, CMA - Microsoft MVP (Excel)
    I hate it when my computer does what I tell it to, and not what I want it to.

    Learn how to use our KB tags! -||- Ken's Excel Website -||- Ken's Excel Forums -||- My Blog -||- Excel Training Calendar

    This is a shameless plug for my new book "RibbonX - Customizing the Office 2007 Ribbon". Find out more about it here!

    Help keep VBAX clean! Use the 'Thread Tools' menu to mark your own threads solved!





Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •