PDA

View Full Version : Solved: Delete Unique Records



InJustice
11-07-2006, 10:52 AM
If given a few days im sure i could figure this out. Thats why i need your help.

I have this worksheet that has about 15,000 records. Each record is represented in a separate row.

I want to delete every unique record which defined as NO MATCH with any other record's column A, AND column B, AND column C.

only records with at least one duplicate should remiain.

After that I need to count the number of duplicated records. for example if we have:

John Doe
John Doe
John Doe
jane Doe
jane Doe
John Doe

The count would result in 2 becuase i only want to count John Doe once regardless of how many times he is duplicated.

i have attached part of the sheet that i am working with.

Thanks so much for any help you can provide!
- Mike

lucas
11-07-2006, 11:18 AM
Try this to get you started:

Option Explicit
Sub main()

Dim ws As Worksheet
Dim rng As Range
Dim rngToDelete As Range

Set ws = Worksheets("Sheet1")
'Advanced Filter requires a header row - let's add a temporary one
ws.Rows(1).Insert
ws.Cells(1, 1).Value = "temp header"

Set rng = ws.Range("A1:c10000")
rng.AdvancedFilter xlFilterInPlace, unique:=True
Set rngToDelete = rng.SpecialCells(xlCellTypeVisible)
ws.ShowAllData
rngToDelete.EntireRow.Hidden = True
rng.SpecialCells(xlCellTypeVisible).EntireRow.Delete
rngToDelete.EntireRow.Hidden = False

'remove the temporary row
ws.Rows(1).Delete

End Sub

InJustice
11-07-2006, 11:30 AM
Lucas --

Thanks for your reply. I tried the code you posted and it deleted duplicate records leaving one representative unique record for all duplicates and uniques.

it might be very important to note here that only columns A, B, and C should be compared. There is always going to be unique data in the rest of the row. perhpas thats why i got the result i did

i needed to have the records that do not match any other record's colum a, b, and c, to be removed. in other words .... any records that are not duplicated should be removed. afterwards i should have a sheet that has groupings of duplicate records.

I know it sounds like a strange thing to want.

-- Mike

Bob Phillips
11-07-2006, 11:44 AM
Sub Test()
Dim iLastRow As Long
Dim i As Long, j As Long

iLastRow = Cells(Rows.Count, "A").End(xlUp).row
For i = 1 To iLastRow
If CountIf(Columns(1), Cells(i, "A").Value) = 1 And _
CountIf(Columns(2), Cells(i, "B").Value) = 1 And _
CountIf(Columns(3), Cells(i, "C").Value) = 1 Then
Rows(i).Delete
End If
Next i

End Sub

InJustice
11-07-2006, 01:28 PM
it gets to the first "CountIf" and then i get the error "Compile Error - Sub or Function not defined"

should it it be defined?

Bob Phillips
11-07-2006, 01:47 PM
My error, should be



Sub Test()
Dim iLastRow As Long
Dim i As Long, j As Long

iLastRow = Cells(Rows.Count, "A").End(xlUp).Row
For i = 1 To iLastRow
If Application.CountIf(Columns(1), Cells(i, "A").Value) = 1 And _
Application.CountIf(Columns(2), Cells(i, "B").Value) = 1 And _
Application.CountIf(Columns(3), Cells(i, "C").Value) = 1 Then
Rows(i).Delete
End If
Next i

End Sub

Zack Barresse
11-07-2006, 01:56 PM
Also, check out brettdj's Duplicate Master...

http://members.iinet.net.au/~brettdj/

HTH

InJustice
11-07-2006, 04:35 PM
xld --

well... now it runs. but it doesnt seem to do anything to the data.

Firefytr --
its very close to what i need. if only it would select all the duplicates in a grouping rather than leaving the first unselected.

-- Mike

Bob Phillips
11-07-2006, 04:39 PM
That's because all the data you showed share a date with another row.

InJustice
11-07-2006, 05:04 PM
AH!

i understand now.

i was looking to do the compare using the fields in column A B and C as a set, all three would have to match another record in the worksheet to be retained. if the unit of a, b, and c, did not match another records a, b, and c, then it would be discarded.

sorry i may not have defined the problem well enough.

thanks again for all your effort.

SamT
11-07-2006, 08:05 PM
This edited edition LOOKs like it should work. at least to this beginner.

Option Explicit
Sub main()
'Original by Lucas
'Edited by SamT

Dim ws As Worksheet
Dim rng As Range
Dim rngToDelete As Range

Set ws = Worksheets("Sheet1")
'Advanced Filter requires a header row - let's add a temporary one
ws.Rows(1).Insert
ws.Cells(1, 1).Value = "temp header"

Set rng = ws.Range("A1:c10000")
rng.AdvancedFilter xlFilterInPlace, unique:=True
Set rngToDelete = rng.SpecialCells(xlCellTypeVisible)
ws.ShowAllData
rngToDelete.EntireRow.Delete 'Added by SamT

'rngToDelete.EntireRow.Hidden = True
'rng.SpecialCells(xlCellTypeVisible).EntireRow.Delete
'rngToDelete.EntireRow.Hidden = False

'remove the temporary row
ws.Rows(1).Delete

End Sub



SamT

InJustice
11-07-2006, 09:16 PM
SamT --

I think we are getting closer. Attached is the result after running that macro on the data. Interestingly it dropped off the first grouping of duplicates and then dropped off the first row of each of the remaining groupings of duplicates. i.e. if there were three john smiths before, there would be only two john smiths remaining ....

Note that the original "sample" file is included with my first post.

would it help if i posted a file with the correct result in it?

- Mike

InJustice
11-07-2006, 09:21 PM
attached to this post is the sample file with what the result should look like after running the code on the sample file attached to the first post in the thread.

Again... thanks guys for all your help.

-- Mike

Bob Phillips
11-08-2006, 02:46 AM
Couple of errors in the final results.



Sub DeleteRows()
Dim iLastRow As Long
Dim i As Long
Dim rng As Range

iLastRow = Cells(Rows.Count, "A").End(xlUp).Row
Columns("A:B").Insert
Range("A2").Resize(iLastRow - 1).Formula = "=C2&D2&E2"
Range("B2").Resize(iLastRow - 1).Formula = "=COUNTIF(A:A,C2&D2&E2)=1"
For i = 2 To iLastRow
If Cells(i, "B") Then
If rng Is Nothing Then
Set rng = Rows(i)
Else
Set rng = Union(rng, Rows(i))
End If
End If
Next i

If Not rng Is Nothing Then rng.Delete
Columns("A:B").Delete

End Sub

InJustice
11-08-2006, 08:55 AM
xld --

it worked perfectly! it took a minute to go through 15k of records but the results were right on the money. even if there there are a few imperfections in the results (i did not notice any) it would not matter for my puposes.

Again, thanks so much for your help. You saved me a great deal of pain.

-- Mike

Bob Phillips
11-08-2006, 09:19 AM
even if there there are a few imperfections in the results (i did not notice any) it would not matter for my puposes.

I was referring to your example files, you cut out a couple that should have remained. No big deal, it was obvious when comparing my achieved results with your predicteds.

brettdj
12-02-2006, 12:07 AM
xld --

well... now it runs. but it doesnt seem to do anything to the data.

Firefytr --
its very close to what i need. if only it would select all the duplicates in a grouping rather than leaving the first unselected.

-- Mike Mike, there is an option to select the first cell as a "duplicate" as well.

Cheers

Dave