PDA

View Full Version : [SOLVED] Match first column sheet A with first column sheet B



marcel1410
03-27-2015, 08:24 AM
Hi all,

I am trying to learn VBA, so I'm a real beginner.

For a project I want to do the following with VBA:

Suppose we have only two sheets, sheet A and sheet B. We need for the first column in sheet A to have values that are also in the first column of sheet B (and vice versa). If we find a value in the first column of sheet B which doesn't occur in the first column of sheet A, we need to delete that entire row of sheet B (and vice versa).

So now i've sketched the project, I present below the code that i've written (which is not working).


Sub test()
'
'
'

' Declare Variables
Dim i As Integer
Dim j As Integer
Dim max_row As Integer

Dim c As Double
ReDim arr(0) 'Dynamic array with first only one entry

Dim A As String
Dim B As String

A = "A"
B = "B"
max_row = 5000


For j = 1 To max_row
c = 0
For i = 1 To max_row
If Worksheets(A).Cells(i, 1).Value = Worksheets(B).Cells(j, 1).Value Then c = c + 1
If IsEmpty(Worksheets(B).Cells(j, 1).Value) Then c = 1 'Allow for empty cells, but is redundant
Next i

If c = 0 Then arr(UBound(arr)) = j
If c = 0 Then ReDim Preserve arr(UBound(arr) + 1)
Next j
If IsEmpty(arr(UBound(arr))) Then ReDim Preserve arr(UBound(arr) - 1) 'Delete last empty entry of the array
Worksheets(B).Select
ActiveSheet.Range("A" & Join(arr, ",A")).EntireRow.Delete


End Sub

I hope somebody here can help me find out why this code doesn't work. By running it i do not get an error, excell will just move to the status 'nonresponding' and i have to exit excell.

Yongle
03-27-2015, 01:08 PM
Not looked at your code, but here is a different way of approaching the problem.
How it works
Using the Match function, the code deletes row in sheetA of any item not found in sheetB, before deleting row in sheetB of any item not found in sheetA

NOTE - that when you delete rows always start from the bottom of the range and work backwards - to avoid problems


Sub Delete_Row_If_No_Match_In_ColA()
'declare variables etc
Dim i As Integer, LastRowA As Long, LastRowB As Long
Dim TableA As Range, TableB As Range
'determine last rows and set ranges
LastRowA = Worksheets("SheetA").Range("A1048576").End(xlUp).Row
LastRowB = Worksheets("SheetB").Range("A1048576").End(xlUp).Row
Set TableA = Worksheets("SheetA").Range("A2:A" & LastRowA)
Set TableB = Worksheets("SheetB").Range("A2:A" & LastRowB)
'now delete anything in SheetA not in sheet B
For i = LastRowA To 1 Step -1 ' Loop backwards if deleting rows
If IsError(Application.Match(TableA(i).Value, TableB, 0)) Then
TableA(i).EntireRow.Delete
End If
Next i
'now delete anything in SheetB not in sheet A
For i = LastRowB To 1 Step -1 'Loop backwards if deleting rows
If IsError(Application.Match(TableB(i).Value, TableA, 0)) Then
TableB(i).EntireRow.Delete
End If
Next i
End Sub

(you will need to substitute A65536 forA1048576 if using older versions of Excel etc)

Yongle
03-29-2015, 12:03 PM
@marcel1410 - did this solve your problem?
If it did, please click on "Thread Tools" at top of thread and mark the thread as "Solved"
thanks

marcel1410
03-30-2015, 03:58 AM
Hi Yongle,

Thanks for your code and sorry for my late response. I tested your code and it does exactly what i was aiming for.

However, I have some questions:
-Your code is working slowly if i apply it on say 10.000 rows. I guess it's working slowly since the rows are deleted one by one.
I think it would be more efficient to first store the row numbers which has to be deleted.
And when all these row numbers are known, delete them together at once. I will try to code this myself today.
However, if i not succeed, can i ask you again for help? (I tried this already with my code, hence the dynamic array).
Of course i'm curious for your ideas for making the code faster.

-I'm still wondering why my code fails. Would you take a quick look at it?

Yongle
03-30-2015, 06:52 AM
One comment from very quick look at your code:
EVERY If statement MUST finish with End If
So if you have 5 X If statements, you need 5 X End if
This is probably why your code is hanging
Have a look at your code and post it again if you cannot fix it.



If CONDITION Then
do something...

Else (if you want something doing if CONDITION not met, otherwise leave out)
do something if condition not met

End If - every IF must have a matching End If

Yongle
03-30-2015, 10:00 AM
Here is my code adjusted to delete all the rows together. Please use the whole code because I have added variables etc. Hopefully this will run much faster. It will be interesting to find out.

How the code collects all the rows
Rather than use an array, I have instead made use of Union which allows you to join ranges together.

Set Del_RngA = Union(Del_RngA, TableA(i))

and then deleted the range comprising all the rows with

Del_RngA.EntireRow.Delete


In case you are wondering, the first "Set" of Del_RngA etc is the last row in Excel because I had to use something that could be safely deleted later. The Union would not work without the range being Set earlier in the code!

The whole code is now

Sub Delete_Row_If_No_Match_In_ColA()
'declare variables etc
Dim i As Integer, LastRowA As Long, LastRowB As Long
Dim TableA As Range, TableB As Range, Del_RngA As Range, Del_RngB As Range
'determine last rows and set ranges
LastRowA = Worksheets("SheetA").Range("A1048576").End(xlUp).Row
LastRowB = Worksheets("SheetB").Range("A1048576").End(xlUp).Row
Set TableA = Worksheets("SheetA").Range("A2:A" & LastRowA)
Set TableB = Worksheets("SheetB").Range("A2:A" & LastRowB)
Set Del_RngA = Worksheets("SheetA").Range("A1048576")
Set Del_RngB = Worksheets("SheetB").Range("A1048576")


'now delete anything in SheetA not in sheet B
For i = LastRowA To 1 Step -1 ' Loop backwards if deleting rows
If IsError(Application.Match(TableA(i).Value, TableB, 0)) Then
Set Del_RngA = Union(Del_RngA, TableA(i))
End If
Next i
Del_RngA.EntireRow.Delete

'now delete anything in SheetB not in sheet A
For i = LastRowB To 1 Step -1 'Loop backwards if deleting rows
If IsError(Application.Match(TableB(i).Value, TableA, 0)) Then
Set Del_RngB = Union(Del_RngB, TableB(i))
End If
Next i
Del_RngB.EntireRow.Delete



End Sub

marcel1410
04-01-2015, 01:56 AM
Thanks for your code. It works now, also for data sets with 10.000 rows. However, it still takes like 15 minutes.
But maybe it is just excell which is not comfortable with such large sheets?

btw, I finished every IF statement with ENDIF in my code, but the code is still not running correctly. Any suggestions?

If you don't have suggestions to speed things up, i will mark this thread as "solved" since you provided me with a working code.
Again, thank you for your working code.

Marcel

Yongle
04-01-2015, 08:49 AM
Hi Marcel
BUY A FASTER COMPUTER!! That is where your real problem is. You are asking the processor to do a lot of work - comparing 10000 lines against 10000 lines in 2 directions. The first comparison results in 50,000,000 calculations, the second one slightly fewer.

Having said that:
- on your PC it took 15 minutes
- on my PC it took 7 seconds

Can you try something for me - please sort the data in both sheets before running the vba, sort both tables based on column A and make sure you sort them in the same way. Then run the code. And see if it is faster.

I will look again and try a few different method to compare times.

And I will also have a look at your code this evening.

Yongle
04-01-2015, 10:03 AM
Hi Marcel

Had another very quick look at your code, and I think it would be much slower to run.
What I did was to disable several lines (below) and ran the code with a maximum of 1000 rows.

' If c = 0 Then arr(UBound(arr)) = j
' If c = 0 Then ReDim Preserve arr(UBound(arr) + 1)

'If IsEmpty(arr(UBound(arr))) Then ReDim Preserve arr(UBound(arr) - 1) 'Delete last empty entry of the array
'Worksheets(B).Select
'ActiveSheet.Range("A" & Join(arr, ",A")).EntireRow.Delete


With 1000 rows it took about 25 seconds on my PC (50 minutes on your PC) and the speed will slow significantly as the number of rows goes up, and the code dealing with the array is not being excecuted.
So with 10,000 lines in each of 2 tables and the array being filled, 3 additional IF statements and the deletion of the rows, I think you could go on holiday whilst the code is running.
Your machine will crash even if you fix the code. The code also crashed my PC when it ran this line

ActiveSheet.Range("A" & Join(arr, ",A")).EntireRow.Delete
I do not feel inclined to debug something that you could never use.

Yongle
04-01-2015, 11:18 AM
Hi Marcel

It may be possible reduce the time significantly using one method. So that I can test this:
1)How many columns are there in SheetA?
2)How many columns are there in SheetB?
3)Are all the columns in sheet A the same as all the columns in SheetB?

thank you

Yongle
04-02-2015, 01:13 AM
Hi Marcel

A different approach (and faster) using the "Remove Duplicates" feature built into Excel

How it works
There are 3 sheets (A,B & C)
SheetA with data
SheetB with data
Data from SheetA is copied to SheetC
Data from SheetB is copied to SheetC under data from SheetA
"Remove Duplicates" applied in SheetC using ColumnA as the column for checking for duplication.

What you need to do
- open the attached workbook
- copy your data into SheetA and SheetB (headings in row1, data must start in row2 in both sheets)
- run the macro
(It will detect columns and ranges to copy without any further input)

Results
SheetC will contain only items that are common to both sheets in column A

You have not told us which other columns are in your data, but this will leave you with a table that looks like Sheet1, so if Sheet2 has different columns after Column A, then VBA will need some code to lookup SheetB columns.




Sub Dup()
'declare variables
Dim wsA As Worksheet, wsB As Worksheet, wsC As Worksheet
Dim LastRowA As Long, lastRowB As Long, LastRowC As Long
Dim LastColA As Long, LastColB As Long, LastColC As Long
Dim DataA As Range, DataB As Range, DataC As Range
'set ranges
Set wsA = Worksheets("SheetA")
Set wsB = Worksheets("SheetB")
Set wsC = Worksheets("SheetC")
LastRowA = wsA.Range("A20000").End(xlUp).Row
lastRowB = wsB.Range("A20000").End(xlUp).Row
LastColA = wsA.Range("A1").End(xlToRight).Column
LastColB = wsB.Range("A1").End(xlToRight).Column
If LastColA >= LastColB Then
LastColC = LastColA
Else
LastColC = LastColB
End If
Set DataA = wsA.Cells(2, 1).Resize(LastRowA - 1, LastColA)
Set DataB = wsB.Cells(2, 1).Resize(lastRowB - 1, LastColB)
'copy the data to SheetC
DataA.Copy
wsC.Range("A2").PasteSpecial xlAll
DataB.Copy
wsC.Range("A20000").End(xlUp).Offset(1, 0).PasteSpecial xlAll
'remove duplicates in SheetC
LastRowC = wsC.Range("A40000").End(xlUp).Row
Set DataC = wsC.Cells(2, 1).Resize(LastRowC - 1, LastColC)
DataC.RemoveDuplicates Columns:=1, Header:=xlNo
End Sub

marcel1410
04-03-2015, 02:30 AM
Hi,

Sorry again for my late response.

The number of columns in sheetA and sheet B are equal, contain the same TYPE of data, but it has ofcourse different data in it.
But i guess my computer is too slow, and i will work with your code sent on 30 March.

I will mark the post as solved.

Thanks very much for your ideas and work

Marcel