PDA

View Full Version : [SOLVED] Merge two files and remove duplicates



YasserKhalil
06-17-2017, 04:11 PM
Hello everyone
I have two files 1.xlsx and 2.xlsx and need to merge data in both files relying on ID field
Sometimes some IDs are unique in the files and sometimes IDs are common (existing in both files)

I have uploaded Output file so as to see the desired output (the results are sorted by ID)
I need a macro to be put in Output file and deal with data from 1 and 2 files when they are closed
Thanks advanced for help

mana
06-17-2017, 10:41 PM
Option Explicit


Sub test()
Dim dic As Object
Dim p As String
Dim i As Long
Dim s, it, n As Long

Set dic = CreateObject("scripting.dictionary")

p = ThisWorkbook.Path

With Workbooks.Open(p & "\1.xlsx").Sheets(1)
With .Cells(1).CurrentRegion
For i = 1 To .Rows.Count
s = .Cells(i, 1).Value
If Not dic.exists(s) Then
Set dic(s) = CreateObject("system.collections.arraylist")
dic(s).Add s
dic(s).Add .Cells(i, 2).Value
dic(s).Add .Cells(i, 3).Value
dic(s).Add Empty
End If
Next
End With
.Parent.Close False
End With

With Workbooks.Open(p & "\2.xlsx").Sheets(1)
With .Cells(1).CurrentRegion
For i = 1 To .Rows.Count
s = .Cells(i, 1).Value
If Not dic.exists(s) Then
Set dic(s) = CreateObject("system.collections.arraylist")
dic(s).Add s
dic(s).Add Empty
dic(s).Add Empty
Else
dic(s).removeat 3
End If
dic(s).Add .Cells(i, 3).Value
Next
End With
.Parent.Close False
End With

With Cells(1)
.CurrentRegion.ClearContents
For Each it In dic.items
.Offset(n).Resize(, 4).Value = it.toarray
n = n + 1
Next
.Sort .Columns(1), Header:=xlYes
End With

End Sub

YasserKhalil
06-18-2017, 01:41 AM
Thank you very much Mr. Mana for this great code ..
I have noticed some incorrect results
Here's a snapshot of the differences between expected and code results
19536

mana
06-18-2017, 02:57 AM
Option Explicit


Sub test2()
Dim dic As Object
Dim p As String
Dim i As Long
Dim s, it, n As Long

Set dic = CreateObject("scripting.dictionary")

p = ThisWorkbook.Path

With Workbooks.Open(p & "\1.xlsx").Sheets(1)
With .Cells(1).CurrentRegion
For i = 1 To .Rows.Count
s = .Cells(i, 1).Value
If Not dic.exists(s) Then
Set dic(s) = CreateObject("system.collections.arraylist")
dic(s).Add s
dic(s).Add .Cells(i, 2).Value
dic(s).Add .Cells(i, 3).Value
dic(s).Add Empty
End If
Next
End With
.Parent.Close False
End With

With Workbooks.Open(p & "\2.xlsx").Sheets(1)
With .Cells(1).CurrentRegion
For i = 1 To .Rows.Count
s = .Cells(i, 1).Value
If Not dic.exists(s) Then
Set dic(s) = CreateObject("system.collections.arraylist")
dic(s).Add s
dic(s).Add .Cells(i, 2).Value
dic(s).Add Empty
Else
If dic(s)(1) = "" Then
dic(s).removeat 1
dic(s).Insert 2, .Cells(i, 2).Value
End If
dic(s).removeat 3
End If
dic(s).Add .Cells(i, 3).Value
Next
End With
.Parent.Close False
End With

With Cells(1)
.CurrentRegion.ClearContents
For Each it In dic.items
.Offset(n).Resize(, 4).Value = it.toarray
n = n + 1
Next
.Sort .Columns(1), Header:=xlYes
End With

End Sub

YasserKhalil
06-18-2017, 03:15 AM
That's amazing and fascinating Mr. Mana
Thank you very very much for this great solution
Best Regards