PDA

View Full Version : Remove all Duplicate rows in Excel based on multiple colum



k_v_deepu
11-18-2011, 05:46 AM
Hi All,

I would require help in solving my duplicate rows problem.

I have 10 columns(A to H) and around 100 rows in my excel, i would like to delete all the rows which is duplicating from colum D to H(columns A, B & C will always be unique so i dont wont to check this 3 columns).

Ex:
Col1 Col2 Col3 Col4
1 b c d - dup1
2 a b c
3 b c d - dup1
4 a b d - dup2
5 a b d - dup2
6 a c d

Output should be:

Col1 Col2 Col3 Col4
2 a b c
6 a c d


Thanks in advance for the support

Deepu

mancubus
11-18-2011, 12:00 PM
hi deepu.

this may help:
http://www.vbaexpress.com/forum/showthread.php?t=10600

Paul_Hossler
11-19-2011, 08:20 PM
Excel 2007/2010 has Delete Duplicates on the Data tab

Paul

k_v_deepu
11-21-2011, 12:11 AM
Hi Mancubus, Hi Paul,

Thanks for your help,

I had checked both the possiblities but in both the cases one entry of the duplicate record will remain.

In my case i want to delete all the duplicate records, and i want only the records which occur once in the file.

Hope you understand my requirement.

Regards
Deepak

k_v_deepu
11-21-2011, 01:58 AM
Hi Mancubus, Hi Paul,

Please see below my requirement.

Ex:
Col1, Col2, Col3, Col4
1, b, c, d - dup1
2, a, b, c
3, b, c, d - dup1
4, a, b, d - dup2
5, a, b, d - dup2
6, a, c, d

I want Output similar to below (This are the records which are appearing only once)

Col1, Col2, Col3, Col4
2, a, b, c
6, a, c, d

Hope you understand my requirement.

Regards
Deepak

nilem
11-21-2011, 04:45 AM
Hi k_v_deepu
The link in post #2 given excellent macro. In your case something like this:
Sub ProcessData3()
Columns("A:B").Insert
With Range("A1:A" & Cells(Rows.Count, "C").End(xlUp).Row)
.Formula = "=F1&G1&H1&I1&J1"
.Offset(, 1).Formula = "=COUNTIF($A$1:$A$100,A1)"
With .Resize(, 6)
.AutoFilter Field:=2, Criteria1:=">1"
.Offset(1).SpecialCells(12).EntireRow.Delete
.AutoFilter
End With
End With
Columns("A:B").Delete
End Sub

mancubus
11-21-2011, 05:19 AM
and using xld's sumproduct formula with a helper column


Public Sub ProcessData()

Dim i As Long, LastRow As Long, calc As Long
Dim rng As Range

With Application
.ScreenUpdating = False
.EnableEvents = False
calc = .Calculation
.Calculation = xlCalculationManual
End With

With ActiveSheet
LastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
For i = 2 To LastRow
Cells(i, 5).Formula = "=SUMPRODUCT(--(B2:B" & LastRow & "=B" & i & ")," & _
"--(C2:C" & LastRow & "=C" & i & ")," & _
"--(D2:D" & LastRow & "=D" & i & "))"
Next
.AutoFilterMode = False
With .Range("A1:F" & LastRow)
.AutoFilter Field:=5, Criteria1:=">1"
.Offset(1, 0).SpecialCells(xlCellTypeVisible).EntireRow.Delete
End With
.AutoFilterMode = False
.Range("F1:F" & LastRow).Clear
End With

With Application
.Calculation = calc
.EnableEvents = True
.ScreenUpdating = True
End With

End Sub

Paul_Hossler
11-21-2011, 06:42 AM
In my case i want to delete all the duplicate records, and i want only the records which occur once in the file.

Hope you understand my requirement.


Sorry -- I didn't understand your requirement accurately


You're correct -- 'Delete Dups' will leave one

Paul

mancubus
11-21-2011, 06:58 AM
Sorry -- I didn't understand your requirement accurately


You're correct -- 'Delete Dups' will leave one

Paul

deepak wants to delete all repeating rows based on col B, C, D and keep olny single rows

k_v_deepu
11-21-2011, 08:52 AM
Hi All,

Thanks for you support,
I have one more doubt, when i concatenate all the 10 columns the number of characters will be more than 256. which is more than excels per cell limit. If i use the sumproduct formula or if i use normal concatenate formula i think formula will truncate the characters which are more than 256 characters.

Please correct me if i am wrong. Anyways i will check the codes given by Mancubus, Paul and Nilem and get back to you guys.

Once again thanks a ton.

Regards
Deepak

Paul_Hossler
11-21-2011, 01:37 PM
Now that I understand a little better, maybe something like this

It's a little brute force, and uses a temporary 'marker' column


Option Explicit
Sub OnlyLeaveUnique()
Dim r As Long, c As Long, n As Long, x As Long

Dim rData As Range

Application.ScreenUpdating = False


n = ActiveSheet.Cells(1, 1).CurrentRegion.Columns.Count + 1

ActiveSheet.Cells(1, n).Value = "TEMP"

For r = 2 To ActiveSheet.Cells(1, 1).CurrentRegion.Rows.Count
ActiveSheet.Cells(r, n).Value = r
Next r

Set rData = ActiveSheet.Cells(1, 1).CurrentRegion

With ActiveSheet.Sort
.SortFields.Clear

For c = 4 To 10
.SortFields.Add Key:=rData.Cells(1, c).Resize(rData.Rows.Count - 1, 1)
Next c

.SetRange rData
.Header = xlYes
.MatchCase = False
.Orientation = xlTopToBottom
.SortMethod = xlPinYin
.Apply
End With

With rData
For r = 2 To .Rows.Count

x = 0

For c = 4 To 10
If .Cells(r, c).Value <> .Cells(r + 1, c).Value Then
x = x + 1
Exit For
End If

Next c

If x = 0 Then
.Cells(r, n).Value = True
.Cells(r + 1, n).Value = True
End If

Next r
End With

With ActiveSheet.Sort
.SortFields.Clear
.SortFields.Add Key:=rData.Cells(1, n).Resize(rData.Rows.Count - 1, 1)
.SetRange rData
.Header = xlYes
.MatchCase = False
.Orientation = xlTopToBottom
.SortMethod = xlPinYin
.Apply
End With

On Error Resume Next
rData.Columns(n).SpecialCells(xlCellTypeConstants, xlLogical).EntireRow.Delete
On Error GoTo 0

rData.Columns(n).EntireColumn.Delete
Application.ScreenUpdating = True

End Sub


Paul

k_v_deepu
11-22-2011, 12:13 AM
Hi Paul,

this is what i wanted, thanks a lot for you support it has saved almost a days work for me.

Hi Mancubus, Hi Nilem,

Thanks for your support also.

This site Rocks:beerchug:

Regards
Deepak

k_v_deepu
11-23-2011, 12:05 AM
Hi Paul,

This code is working fine in Excel 2007, but in 2002 it is throwing error at the line Activesheet.Sort. COuld you please help me again.

Regards
Deepak

k_v_deepu
11-24-2011, 02:36 AM
Can any body help me on this.

Deepak

Paul_Hossler
11-24-2011, 09:08 AM
This code is working fine in Excel 2007, but in 2002 it is throwing error at the line Activesheet.Sort. COuld you please help me again.


The Excel VBA Sort was changed in 2007, and I don't have access to older versions anymore

If you use the 2003 macro recorder to record the Sorts, I'll take a shot as including it (unless some one else jumps in first)

I think conditional compilation would allow the same macro to be used in both versions. That's not too hard to do, since VBXers have explained it to me .... several times, since I'm a slow learner :think:

It'll take a while though, since we're moving to a holiday period here, AND my eldest daughter is getting married in a couple of days

Paul

k_v_deepu
11-25-2011, 03:04 AM
Hi Paul,

No Prob, i will give a shot. If i will get some anwer i will post it or else please let me know once you are back. Convey my hearty wishes to your daugther.

Regards
Deepak