PDA

View Full Version : Solved: Compare and delete duplicated data



parscon
11-19-2012, 03:48 PM
I have 2 column A and B , Now I want to column A compare with column B and delete the duplicate date on column B .

Thank you very much .

GTO
11-19-2012, 06:35 PM
As long as speed is not of the essence, an easy way is to use COUNTIF().

In a Standard Module:

Option Explicit

Sub example()
Dim rngLastCell As Range
Dim rngColA As Range
Dim rngColB As Range
Dim n As Long

With Sheet1 '<--Using worksheet's CodeName, or, using tab name-->ThisWorkbook.Worksheets ("Sheet1")
'// Find the last cell in each column, setting a reference to each column's range//
'// that contains data. //
Set rngLastCell = RangeFound(.Columns(1), , .Cells(1, 1))
If Not rngLastCell Is Nothing Then Set rngColA = .Range(.Cells(1), rngLastCell)
Set rngLastCell = RangeFound(.Columns(2), , .Cells(1, 2))
If Not rngLastCell Is Nothing Then Set rngColB = .Range(.Cells(1, 2), rngLastCell)

'// In case either column was empty, provide a bailout point. //
If rngColA Is Nothing Or rngColB Is Nothing Then
MsgBox "No data"
Exit Sub
End If

'// Loop upwards from the bottom, deleting cells containing duplicates. //
For n = rngColB.Cells(rngColB.Cells.Count).Row To 1 Step -1
If CBool(Evaluate("COUNTIF(" & rngColA.Address & "," & rngColB.Cells(n) & ")")) Then
rngColB.Cells(n).Delete xlUp
End If
Next
End With
End Sub

Function RangeFound(SearchRange As Range, _
Optional ByVal FindWhat As String = "*", _
Optional StartingAfter As Range, _
Optional LookAtTextOrFormula As XlFindLookIn = xlValues, _
Optional LookAtWholeOrPart As XlLookAt = xlPart, _
Optional SearchRowCol As XlSearchOrder = xlByRows, _
Optional SearchUpDn As XlSearchDirection = xlPrevious, _
Optional bMatchCase As Boolean = False) As Range

If StartingAfter Is Nothing Then
Set StartingAfter = SearchRange(1)
End If

Set RangeFound = SearchRange.Find(What:=FindWhat, _
After:=StartingAfter, _
LookIn:=LookAtTextOrFormula, _
LookAt:=LookAtWholeOrPart, _
SearchOrder:=SearchRowCol, _
SearchDirection:=SearchUpDn, _
MatchCase:=bMatchCase)
End Function
Hope that helps,

Mark

GarysStudent
11-19-2012, 06:56 PM
Here is an example.

parscon
11-20-2012, 12:15 AM
Thank you very much . really it was the big help .

GTO
11-20-2012, 01:35 AM
Oops :-(

Hopefully you used GarysStudent's, as mine had an error in a line, which should have been:
If CBool(Evaluate("COUNTIF(" & rngColA.Address(, , , True) & "," & rngColB.Cells(n).Address(, , , True) & ")")) Then

If you have a lot of data, this could be quicker:

Option Explicit

Sub example2()
Dim rngLastCell As Range
Dim rngColA As Range
Dim rngColB As Range
Dim n As Long, j As Long
Dim DIC As Object ' Scripting.Dictionary
Dim aryColB As Variant
Dim aryColA As Variant
Dim aryOutput As Variant

With Sheet1 '<--Using worksheet's CodeName, or, using tab name-->ThisWorkbook.Worksheets ("Sheet1")
'// Find the last cell in each column, setting a reference to each column's range//
'// that contains data. //
Set rngLastCell = RangeFound(.Columns(1), , .Cells(1, 1))
If Not rngLastCell Is Nothing Then Set rngColA = .Range(.Cells(1), rngLastCell)
Set rngLastCell = RangeFound(.Columns(2), , .Cells(1, 2))
If Not rngLastCell Is Nothing Then Set rngColB = .Range(.Cells(1, 2), rngLastCell)

'// In case either column was empty, provide a bailout point. //
If rngColA Is Nothing Or rngColB Is Nothing Then
MsgBox "No data"
Exit Sub
End If

Set DIC = CreateObject("Scripting.Dictionary")
aryColA = rngColA.Value
'// fill the keys with unique values from Column A //
For n = 1 To UBound(aryColA, 1)
DIC.Item(CStr(aryColA(n, 1))) = Empty
Next

aryColB = rngColB.Value
'// Size an output array to the current size of data in Column B, so we can just//
'// overwrite the present values. //
ReDim aryOutput(1 To UBound(aryColB, 1), 1 To 1)

'// Loop through the current values, adding just the values we don't find in //
'// the dictionary to out output array. //
For n = 1 To UBound(aryColB)
If Not DIC.Exists(CStr(aryColB(n, 1))) Then
j = j + 1
aryOutput(j, 1) = aryColB(n, 1)
End If
Next

'// Kaplunk. //
rngColB.Value = aryOutput

Set DIC = Nothing
Erase aryColA
Erase aryColB
Erase aryOutput
End With
End Sub

Function RangeFound(SearchRange As Range, _
Optional ByVal FindWhat As String = "*", _
Optional StartingAfter As Range, _
Optional LookAtTextOrFormula As XlFindLookIn = xlValues, _
Optional LookAtWholeOrPart As XlLookAt = xlPart, _
Optional SearchRowCol As XlSearchOrder = xlByRows, _
Optional SearchUpDn As XlSearchDirection = xlPrevious, _
Optional bMatchCase As Boolean = False) As Range

If StartingAfter Is Nothing Then
Set StartingAfter = SearchRange(1)
End If

Set RangeFound = SearchRange.Find(What:=FindWhat, _
After:=StartingAfter, _
LookIn:=LookAtTextOrFormula, _
LookAt:=LookAtWholeOrPart, _
SearchOrder:=SearchRowCol, _
SearchDirection:=SearchUpDn, _
MatchCase:=bMatchCase)
End Function

Sorry about that bad line:blush

Mark

parscon
11-20-2012, 01:57 AM
Thank you but your code does not work when run your code show me this error :

Compile Error :
Variable not defined

I use Office 2013 .

Thank you .

GTO
11-20-2012, 05:10 PM
Greetings parscon,

When you run the code and receive the error, what variable is highlighted? I would guess at 'Sheet1'. If this is correct, then you would change this to the CodeName of the sheet you wanted the code to run against.

Mark

mpeterson
04-19-2013, 11:21 PM
I have found GTO's masterpiece script that compares between two columns and delete duplicated values in the second column. It is a masterpiece.

I was wondering if this script can be modified to suit my current data manipulation needs. I have 27 columns of sorted domain names by letter a-z and one extra column (27th) for domain names that start with a number or a hyphen. There's always new domain names to be added to the current worksheet "main_lists" which has about 3.5 million domain names.

If I am using GTO's script I would copy data column by column from "main_lists" worksheet to "Sheet1" where the new list of domain names exists in ColB having to do copy and past and run the script 27 times to delete duplicate data in the new data column B.

Can GTO's script be modified to compare data between 27 columns in "main_lists" worksheet, and "Sheet1" ColB new data sheet? If this is possible that would be a dream that came true to me.

Highly appreciating any assistance with this need.

Mark Peterson