PDA

View Full Version : Remove duplicate rows and retain the first duplicate row



mvandhu
10-17-2011, 08:42 AM
Hi All,

Need a help..
The below mentioned vba code is to remove the duplicate rows and retain the last duplicate row.(Duplicates are removed based on 2 columns)..

My question is what changes have to be made in this code so that i can retain my first duplicate row?


Public Sub DelDupRows()
Const TEST_COLUMN As String = "A" ' <=== change to suit
Dim i As Long
Dim iLastRow As Long
Dim rng As Range
With ActiveSheet
iLastRow = .Cells(.Rows.Count, TEST_COLUMN).End(xlUp).Row
For i = 1 To iLastRow
If .Evaluate("SUMPRODUCT(--(A" & i & ":A" & iLastRow & "=A" & i & ")," & _
"--(B" & i & ":B" & iLastRow & "=B" & i & "))") > 1 Then
If rng Is Nothing Then
Set rng = .Cells(i, "A").Resize(, 3)
Else
Set rng = Union(rng, .Cells(i, "A").Resize(, 3))
End If
End If
Next i
If Not rng Is Nothing Then rng.Delete
End With
End Sub

shrivallabha
10-17-2011, 09:01 AM
Untested but should work:
Public Sub DelDupRows()
Const TEST_COLUMN As String = "A" ' <=== Change to suit
Dim i As Long, lCounter As Long ' <=== Added Variable to Separate Row1
Dim iLastRow As Long
Dim rng As Range
With ActiveSheet
iLastRow = .Cells(.Rows.Count, TEST_COLUMN).End(xlUp).Row
For i = 1 To iLastRow
If .Evaluate("SUMPRODUCT(--(A" & i & ":A" & iLastRow & "=A" & i & ")," & _
"--(B" & i & ":B" & iLastRow & "=B" & i & "))") >= 1 Then ' <== Changed to eliminate last row
lCounter = lCounter + 1
If lCounter = 1 Then ' <== Added Loop
'Do Nothing
Else 'Start Adding from the second instance
If rng Is Nothing Then
Set rng = .Cells(i, "A").Resize(, 3)
Else
Set rng = Union(rng, .Cells(i, "A").Resize(, 3))
End If
End If
End If
Next i
If Not rng Is Nothing Then rng.Delete
End With
End Sub

mvandhu
10-18-2011, 08:13 AM
Hi Shrivallabha,

Thanks for your response.
But unfortunately the code is not giving the expected results.
It deletes all the rows.

shrivallabha
10-18-2011, 09:48 AM
Sorry. Should have looked at it more carefully. Try this:
Public Sub DelDupRows2()
Const TEST_COLUMN As String = "A" ' <=== change to suit
Dim i As Long
Dim iLastRow As Long, iStarRow As Long
Dim rng As Range
With ActiveSheet
iLastRow = .Cells(.Rows.Count, TEST_COLUMN).End(xlUp).Row
iStarRow = 1 ' <=== change to suit
For i = iStarRow To iLastRow
If .Evaluate("SUMPRODUCT(--(A" & iStarRow & ":A" & i & "=A" & i & ")," & _
"--(B" & iStarRow & ":B" & i & "=B" & i & "))") > 1 Then
If rng Is Nothing Then
Set rng = .Cells(i, "A").Resize(, 3)
Else
Set rng = Union(rng, .Cells(i, "A").Resize(, 3))
End If
End If
Next i
If Not rng Is Nothing Then rng.Delete
End With
End Sub

mvandhu
10-19-2011, 08:41 AM
Hey Shrivallabha,
Thanks for the code... its working good....
Another help..
Can u explian the contents in for loop alone...?
and also this statement
If Not rng Is Nothing Then rng.Delete

shrivallabha
10-19-2011, 09:54 AM
First the formula part

If .Evaluate("SUMPRODUCT(--(A" & iStarRow & ":A" & i & "=A" & i & ")," & _
"--(B" & iStarRow & ":B" & i & "=B" & i & "))") > 1 Then
This is the main argument which finds duplicates. It uses SUMPRODUCT formula which is explained nicely by Bob here:
http://www.vbaexpress.com/forum/forumdisplay.php?f=98
The Evaluate part literally evaluates the outcome of the formula as if it was written in Excel. If there's only one entry then the above formula will return 1. In all other cases, it will return a value greater than 1.

There are few ways in which a For Loop is written
1. For Next
2. For Each
You can find some explanation here:
http://www.ozgrid.com/VBA/VBALoops.htm

Now coming to the construct used in this code.
a. 'For Next' loops through the complete range in the specified column.
b. The SUMPRODUCT formula evaluates the result for each cell.
c. Whenever the SUMPRODUCT evaluation is greater than 1 then the inside loop is executed.
If rng Is Nothing Then
Set rng = .Cells(i, "A").Resize(, 3)
Else
Set rng = Union(rng, .Cells(i, "A").Resize(, 3))
End If
The variable 'rng' is declared as Range. So VBA assigns a special value Nothing when it doesn't hold / refer to a range. So at first instance rng will be nothing, and then VBA will execute the 'If' condition. Then on it will keep on executing 'Else' part. Union method, as the name suggests, keeps adding to the 'rng'.
d. When For Loop ends then there are two possibilities: either 'rng' will have something or nothing. If it is nothing then the
rng.Delete
will raise error. So to avoid this error VBA checks if rng is Not Nothing. And if it's affirmative then executes rng.delete.