Consulting

Results 1 to 12 of 12

Thread: Match first column sheet A with first column sheet B

  1. #1

    Match first column sheet A with first column sheet B

    Hi all,

    I am trying to learn VBA, so I'm a real beginner.

    For a project I want to do the following with VBA:

    Suppose we have only two sheets, sheet A and sheet B. We need for the first column in sheet A to have values that are also in the first column of sheet B (and vice versa). If we find a value in the first column of sheet B which doesn't occur in the first column of sheet A, we need to delete that entire row of sheet B (and vice versa).

    So now i've sketched the project, I present below the code that i've written (which is not working).

    Sub test()
    '
    '
    '
    
    '   Declare Variables
        Dim i As Integer
        Dim j As Integer
        Dim max_row As Integer
        
        Dim c As Double
        ReDim arr(0)                        'Dynamic array with first only one entry
        
        Dim A As String
        Dim B As String
        
        A = "A"
        B = "B"
        max_row = 5000
    
        
        For j = 1 To max_row
            c = 0
            For i = 1 To max_row
               If Worksheets(A).Cells(i, 1).Value = Worksheets(B).Cells(j, 1).Value Then c = c + 1
               If IsEmpty(Worksheets(B).Cells(j, 1).Value) Then c = 1   'Allow for empty cells, but is redundant
            Next i
            
            If c = 0 Then arr(UBound(arr)) = j
            If c = 0 Then ReDim Preserve arr(UBound(arr) + 1)
        Next j
            If IsEmpty(arr(UBound(arr))) Then ReDim Preserve arr(UBound(arr) - 1) 'Delete last empty entry of the array
            Worksheets(B).Select
            ActiveSheet.Range("A" & Join(arr, ",A")).EntireRow.Delete
            
            
    End Sub
    I hope somebody here can help me find out why this code doesn't work. By running it i do not get an error, excell will just move to the status 'nonresponding' and i have to exit excell.

  2. #2
    VBAX Mentor
    Joined
    Feb 2015
    Posts
    395
    Location
    Not looked at your code, but here is a different way of approaching the problem.
    How it works
    Using the Match function, the code deletes row in sheetA of any item not found in sheetB, before deleting row in sheetB of any item not found in sheetA

    NOTE - that when you delete rows always start from the bottom of the range and work backwards - to avoid problems

    Sub Delete_Row_If_No_Match_In_ColA()
    'declare variables etc
        Dim i As Integer, LastRowA As Long, LastRowB As Long
        Dim TableA As Range, TableB As Range
    'determine last rows and set ranges
        LastRowA = Worksheets("SheetA").Range("A1048576").End(xlUp).Row
        LastRowB = Worksheets("SheetB").Range("A1048576").End(xlUp).Row
        Set TableA = Worksheets("SheetA").Range("A2:A" & LastRowA)
        Set TableB = Worksheets("SheetB").Range("A2:A" & LastRowB)
    'now delete anything in SheetA not in sheet B
            For i = LastRowA To 1 Step -1   ' Loop backwards if deleting rows
                If IsError(Application.Match(TableA(i).Value, TableB, 0)) Then
                    TableA(i).EntireRow.Delete
                End If
            Next i
    'now delete anything in SheetB not in sheet A
            For i = LastRowB To 1 Step -1   'Loop backwards if deleting rows
                If IsError(Application.Match(TableB(i).Value, TableA, 0)) Then
                    TableB(i).EntireRow.Delete
                End If
            Next i
    End Sub
    (you will need to substitute A65536 forA1048576 if using older versions of Excel etc)

  3. #3
    VBAX Mentor
    Joined
    Feb 2015
    Posts
    395
    Location
    @marcel1410 - did this solve your problem?
    If it did, please click on "Thread Tools" at top of thread and mark the thread as "Solved"
    thanks

  4. #4
    Hi Yongle,

    Thanks for your code and sorry for my late response. I tested your code and it does exactly what i was aiming for.

    However, I have some questions:
    -Your code is working slowly if i apply it on say 10.000 rows. I guess it's working slowly since the rows are deleted one by one.
    I think it would be more efficient to first store the row numbers which has to be deleted.
    And when all these row numbers are known, delete them together at once. I will try to code this myself today.
    However, if i not succeed, can i ask you again for help? (I tried this already with my code, hence the dynamic array).
    Of course i'm curious for your ideas for making the code faster.

    -I'm still wondering why my code fails. Would you take a quick look at it?

  5. #5
    VBAX Mentor
    Joined
    Feb 2015
    Posts
    395
    Location
    One comment from very quick look at your code:
    EVERY If statement MUST finish with End If
    So if you have 5 X If statements, you need 5 X End if
    This is probably why your code is hanging
    Have a look at your code and post it again if you cannot fix it.


    If  CONDITION   Then
    do something...
    
    Else (if you want something doing if CONDITION not met, otherwise leave out)
    do something if condition not met
    
    End If    - every IF must have a matching End If

  6. #6
    VBAX Mentor
    Joined
    Feb 2015
    Posts
    395
    Location
    Here is my code adjusted to delete all the rows together. Please use the whole code because I have added variables etc. Hopefully this will run much faster. It will be interesting to find out.

    How the code collects all the rows
    Rather than use an array, I have instead made use of Union which allows you to join ranges together.
    Set Del_RngA = Union(Del_RngA, TableA(i))
    and then deleted the range comprising all the rows with
    Del_RngA.EntireRow.Delete
    In case you are wondering, the first "Set" of Del_RngA etc is the last row in Excel because I had to use something that could be safely deleted later. The Union would not work without the range being Set earlier in the code!

    The whole code is now
    Sub Delete_Row_If_No_Match_In_ColA()
    'declare variables etc
        Dim i As Integer, LastRowA As Long, LastRowB As Long
        Dim TableA As Range, TableB As Range, Del_RngA As Range, Del_RngB As Range
    'determine last rows and set ranges
        LastRowA = Worksheets("SheetA").Range("A1048576").End(xlUp).Row
        LastRowB = Worksheets("SheetB").Range("A1048576").End(xlUp).Row
        Set TableA = Worksheets("SheetA").Range("A2:A" & LastRowA)
        Set TableB = Worksheets("SheetB").Range("A2:A" & LastRowB)
        Set Del_RngA = Worksheets("SheetA").Range("A1048576")
        Set Del_RngB = Worksheets("SheetB").Range("A1048576")
    
    
    'now delete anything in SheetA not in sheet B
            For i = LastRowA To 1 Step -1   ' Loop backwards if deleting rows
                If IsError(Application.Match(TableA(i).Value, TableB, 0)) Then
                    Set Del_RngA = Union(Del_RngA, TableA(i))
                End If
              Next i
              Del_RngA.EntireRow.Delete
                         
    'now delete anything in SheetB not in sheet A
            For i = LastRowB To 1 Step -1   'Loop backwards if deleting rows
                If IsError(Application.Match(TableB(i).Value, TableA, 0)) Then
                    Set Del_RngB = Union(Del_RngB, TableB(i))
                End If
            Next i
            Del_RngB.EntireRow.Delete
        
    
    
    End Sub

  7. #7
    Thanks for your code. It works now, also for data sets with 10.000 rows. However, it still takes like 15 minutes.
    But maybe it is just excell which is not comfortable with such large sheets?

    btw, I finished every IF statement with ENDIF in my code, but the code is still not running correctly. Any suggestions?

    If you don't have suggestions to speed things up, i will mark this thread as "solved" since you provided me with a working code.
    Again, thank you for your working code.

    Marcel

  8. #8
    VBAX Mentor
    Joined
    Feb 2015
    Posts
    395
    Location
    Hi Marcel
    BUY A FASTER COMPUTER!! That is where your real problem is. You are asking the processor to do a lot of work - comparing 10000 lines against 10000 lines in 2 directions. The first comparison results in 50,000,000 calculations, the second one slightly fewer.

    Having said that:
    - on your PC it took 15 minutes
    - on my PC it took 7 seconds

    Can you try something for me - please sort the data in both sheets before running the vba, sort both tables based on column A and make sure you sort them in the same way. Then run the code. And see if it is faster.

    I will look again and try a few different method to compare times.

    And I will also have a look at your code this evening.

  9. #9
    VBAX Mentor
    Joined
    Feb 2015
    Posts
    395
    Location
    Hi Marcel

    Had another very quick look at your code, and I think it would be much slower to run.
    What I did was to disable several lines (below) and ran the code with a maximum of 1000 rows.
    ' If c = 0 Then arr(UBound(arr)) = j
           ' If c = 0 Then ReDim Preserve arr(UBound(arr) + 1)
     
            'If IsEmpty(arr(UBound(arr))) Then ReDim Preserve arr(UBound(arr) - 1) 'Delete last empty entry of the array
            'Worksheets(B).Select
            'ActiveSheet.Range("A" & Join(arr, ",A")).EntireRow.Delete
    With 1000 rows it took about 25 seconds on my PC (50 minutes on your PC) and the speed will slow significantly as the number of rows goes up, and the code dealing with the array is not being excecuted.
    So with 10,000 lines in each of 2 tables and the array being filled, 3 additional IF statements and the deletion of the rows, I think you could go on holiday whilst the code is running.
    Your machine will crash even if you fix the code. The code also crashed my PC when it ran this line
    ActiveSheet.Range("A" & Join(arr, ",A")).EntireRow.Delete
    I do not feel inclined to debug something that you could never use.

  10. #10
    VBAX Mentor
    Joined
    Feb 2015
    Posts
    395
    Location
    Hi Marcel

    It may be possible reduce the time significantly using one method. So that I can test this:
    1)How many columns are there in SheetA?
    2)How many columns are there in SheetB?
    3)Are all the columns in sheet A the same as all the columns in SheetB?

    thank you

  11. #11
    VBAX Mentor
    Joined
    Feb 2015
    Posts
    395
    Location

    Much Faster

    Hi Marcel

    A different approach (and faster) using the "Remove Duplicates" feature built into Excel

    How it works
    There are 3 sheets (A,B & C)
    SheetA with data
    SheetB with data
    Data from SheetA is copied to SheetC
    Data from SheetB is copied to SheetC under data from SheetA
    "Remove Duplicates" applied in SheetC using ColumnA as the column for checking for duplication.

    What you need to do
    - open the attached workbook
    - copy your data into SheetA and SheetB (headings in row1, data must start in row2 in both sheets)
    - run the macro
    (It will detect columns and ranges to copy without any further input)

    Results
    SheetC will contain only items that are common to both sheets in column A

    You have not told us which other columns are in your data, but this will leave you with a table that looks like Sheet1, so if Sheet2 has different columns after Column A, then VBA will need some code to lookup SheetB columns.


    Sub Dup()
    'declare variables
        Dim wsA As Worksheet, wsB As Worksheet, wsC As Worksheet
        Dim LastRowA As Long, lastRowB As Long, LastRowC As Long
        Dim LastColA As Long, LastColB As Long, LastColC As Long
        Dim DataA As Range, DataB As Range, DataC As Range
    'set ranges
        Set wsA = Worksheets("SheetA")
        Set wsB = Worksheets("SheetB")
        Set wsC = Worksheets("SheetC")
        LastRowA = wsA.Range("A20000").End(xlUp).Row
        lastRowB = wsB.Range("A20000").End(xlUp).Row
        LastColA = wsA.Range("A1").End(xlToRight).Column
        LastColB = wsB.Range("A1").End(xlToRight).Column
            If LastColA >= LastColB Then
                LastColC = LastColA
            Else
                LastColC = LastColB
            End If
        Set DataA = wsA.Cells(2, 1).Resize(LastRowA - 1, LastColA)
        Set DataB = wsB.Cells(2, 1).Resize(lastRowB - 1, LastColB)
    'copy the data to SheetC
        DataA.Copy
        wsC.Range("A2").PasteSpecial xlAll
        DataB.Copy
        wsC.Range("A20000").End(xlUp).Offset(1, 0).PasteSpecial xlAll
    'remove duplicates in SheetC
        LastRowC = wsC.Range("A40000").End(xlUp).Row
        Set DataC = wsC.Cells(2, 1).Resize(LastRowC - 1, LastColC)
        DataC.RemoveDuplicates Columns:=1, Header:=xlNo
    End Sub
    Attached Files Attached Files

  12. #12
    Hi,

    Sorry again for my late response.

    The number of columns in sheetA and sheet B are equal, contain the same TYPE of data, but it has ofcourse different data in it.
    But i guess my computer is too slow, and i will work with your code sent on 30 March.

    I will mark the post as solved.

    Thanks very much for your ideas and work

    Marcel

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •