PDA

View Full Version : Solved: Comparing columns A and B. Would like to eliminate #NA results.



frank_m
11-09-2011, 03:07 AM
Edit: After testing the code again I see that its not reporting differences correctly.
It list many items as being in A but not B, when that is not true.

So at this point I need either the code corrected, or other code - Thank you much
---------------------------------------
I'm comparing columns A and B

The code below lists in Column D, items that are in Column A but not B
and Column E is used to list items that are in B but not A

Many thousand cells from the top down, of the no match cells in Column E report #N/A and many near the bottom are empty

I'd like to know what causes the #N/A and how to eliminate it.

Also I was wondering if one of you guys has some prewritten code for comparing two columns so I can double check the results.

A sample workbook is attached. -- Thanks

Option Explicit
Sub test2()
'I aquired this code here:
'http://www.excelforum.com/excel-general/650226-comparing-two-columns-whats-wrong-with-this.html
Dim a, i As Long, b(), n As Long, x, dic As Object
a = Range("a1").CurrentRegion.Resize(, 2).Value
Set dic = CreateObject("Scripting.Dictionary")
With dic
.CompareMode = vbTextCompare
For i = 1 To UBound(a, 1)
If (Not IsEmpty(a(i, 1))) * (Not .exists(a(i, 1))) Then .Add a(i, 1), Nothing
Next
ReDim b(1 To UBound(a, 1), 1 To 1)
For i = 1 To UBound(a, 1)
If Not IsEmpty(a(i, 2)) Then
If Not .exists(a(i, 2)) Then
n = n + 1: b(n, 1) = a(i, 2)
Else
.Remove a(i, 2)
End If
End If
Next
x = .keys
End With

With Range("d1")
.CurrentRegion.ClearContents
.Resize(, 2).Value = [{"Not in A", "Not in B"}]
With .Offset(1)
If n > 0 Then .Resize(n).Value = b
End With
On Error Resume Next
''.Offset(1, 1).Resize(dic.Count).Value = Application.Transpose(x)
' Command above misses some items that should be listed in Column E
'replaced with line below .
'as suggested in the last post at this link:
'http://www.excelforum.com/excel-general/650226-comparing-two-columns-whats-wrong-with-this.html
'Edit: results are incorrect, as many items are reported as not being in B, that are in A,
'when they are actually there.
.Offset(1, 1).Resize(n).Value = Application.Transpose(x)
End With
End Sub

GTO
11-09-2011, 04:47 AM
Hi Frank,

In the attached wb, there are a couple of thousand rows of "hidden" data in column B. Mostly "Emerson" repeated ad infinitum. Does the real wb have that much 'junk' with it?

Mark

frank_m
11-09-2011, 06:50 AM
Hi Mark,

That is the actual Data, but I will work on it soon to eliminate the duplicates and post a new sample.

Might be about an hour or so.

I copy pasted the customer column from an invoicing sheet and comparing it to an older version to determine what new customer names have been added and also which names have had spelling changes. The results should only be a few. Less than 10 results I expect.

After I remove the duplicates, I could also sort it if that will help.

Thanks,

Frank

shrivallabha
11-09-2011, 07:46 AM
Hi Frank,

Run the code. Check the red colored cells (found in the other one and to be deleted). The uncolored cells are the one which are not found in the other column.

Here's the code, which uncommented but I don't think you will need commenting.
Private Sub TestMe()
Dim rFind As Range
Dim lLastRowA As Long, lLastRowB As Long
lLastRowA = Range("A" & Rows.Count).End(xlUp).Row
lLastRowB = Range("B" & Rows.Count).End(xlUp).Row
Range("A2:A" & lLastRowA).AdvancedFilter Action:=xlFilterCopy, _
CopyToRange:=Range("D2"), Unique:=True
lLastRowA = Range("D" & Rows.Count).End(xlUp).Row
Range("B2:B" & lLastRowB).AdvancedFilter Action:=xlFilterCopy, _
CopyToRange:=Range("E2"), Unique:=True
lLastRowB = Range("E" & Rows.Count).End(xlUp).Row
For i = lLastRowA To 2 Step -1
If Range("D" & i).Value = "" Then
Range("D" & i).Delete Shift:=xlUp
Else
If Application.IsNumber(Application.Match(Range("D" & i), Range("E2:E" & lLastRowB), 0)) _
Then Range("D" & i).Interior.ColorIndex = 3
End If
Next i
For i = lLastRowB To 2 Step -1
If Range("E" & i).Value = "" Then
Range("E" & i).Delete Shift:=xlUp
Else
If Application.IsNumber(Application.Match(Range("E" & i), Range("D2:D" & lLastRowA), 0)) _
Then Range("E" & i).Interior.ColorIndex = 3
End If
Next i


If the result is satisfactory then small code for deleting red colored cells can be cranked up.

I am attaching the revised workbook.

frank_m
11-09-2011, 07:56 AM
Thanks Mark.. I'll test that now.


I've attached a new workbook sample with the dupliucates removed

After removing the duplicates the code I have seems to work,
but having two versions will be great so I can verfy the result's :yes


I'll report back probably within a few minutes.

frank_m
11-09-2011, 08:18 AM
Edit #2: See next post as I mistakenly thought the code was supplied by Mark (http://www.vbaexpress.com/forum/member.php?u=27076)
Hi Mark (Correction--> shrivallabha (http://www.vbaexpress.com/forum/member.php?u=27076))


Your code works a treat

I've attached a workbook with side by side results from your code and previous code results.

I haven't fully digested the implications yet, but in your version it does seem that the shaded red items may be handy to have.

Edit: Awe, I understand better now.. No need to write more code to delete the red cells. This shows me all I need to know quite well

Thank you very much ---- Great Job :thumb

frank_m
11-09-2011, 08:35 AM
shrivallabha (http://www.vbaexpress.com/forum/member.php?u=27076)

Many applogies, I thought that was Mark

Thanks so much, that works really well. No need to delete the red cells

Great Job buddy :thumb

shrivallabha
11-09-2011, 08:43 AM
Hi Frank,

Just looking at the data F39 & F40 both are HYATT. Their length differs due to space (the one in F40 contains space). So maybe we can add a small routine to trim spaces. So the revised code would call another sub. I have added application.screenupdating part for speeding up.
Private Sub TestMe()
Dim lLastRowA As Long, lLastRowB As Long
Application.ScreenUpdating = False
Call Cleans 'Here the sub is called for trimming
lLastRowA = Range("A" & Rows.Count).End(xlUp).Row
lLastRowB = Range("B" & Rows.Count).End(xlUp).Row
Range("A2:A" & lLastRowA).AdvancedFilter Action:=xlFilterCopy, _
CopyToRange:=Range("D2"), Unique:=True
lLastRowA = Range("D" & Rows.Count).End(xlUp).Row
Range("B2:B" & lLastRowB).AdvancedFilter Action:=xlFilterCopy, _
CopyToRange:=Range("E2"), Unique:=True
lLastRowB = Range("E" & Rows.Count).End(xlUp).Row
For i = lLastRowA To 2 Step -1
If Range("D" & i).Value = "" Then
Range("D" & i).Delete Shift:=xlUp
Else
If Application.IsNumber(Application.Match(Range("D" & i), Range("E2:E" & lLastRowB), 0)) _
Then Range("D" & i).Interior.ColorIndex = 3
End If
Next i
For i = lLastRowB To 2 Step -1
If Range("E" & i).Value = "" Then
Range("E" & i).Delete Shift:=xlUp
Else
If Application.IsNumber(Application.Match(Range("E" & i), Range("D2:D" & lLastRowA), 0)) _
Then Range("E" & i).Interior.ColorIndex = 3
End If
Next i
Application.ScreenUpdating = True
End Sub

And here's the sub for trimming data:
Private Sub Cleans()
Dim r As Range
For Each r In Sheets(1).UsedRange
r.Value = Trim(r.Value)
Next r
End Sub

Needless to say, all code shall go in Sheet1 module.

frank_m
11-09-2011, 08:52 AM
HI shrivallabha,

Thanks for the ammended code. Good thinking

Have a great day :friends:

shrivallabha
11-09-2011, 09:01 AM
shrivallabha (http://www.vbaexpress.com/forum/member.php?u=27076)

Many applogies, I thought that was Mark

Thanks so much, that works really well. No need to delete the red cells

Great Job buddy :thumb
I guess, you were li'l off the Mark ;) which can happen with anyone!

Glad that it worked out.

frank_m
11-09-2011, 09:07 AM
I guess, you were li'l off the Mark ;) which can happen with anyone!


Hoping you're right that I'm not the only imperfect soul around, so that I can hang on to my self esteem -- Kidding
(nice pun there with "Mark" - He's a good guy, I'm sure he'll share in our smile's about it)

Thanks again for your work

shrivallabha
11-09-2011, 09:16 AM
I am with you...in the same boat...!

I still remember Mark's one liner, it kind of stuck with me.
I normally travel between two states, Arizona and State of Confusion... pun'chy' wot say:bow:

frank_m
11-09-2011, 02:25 PM
I guess I didn't stay in bed long enough. Now when I run the latest version, or even the version before that, I get an error:

The Extract Range has an illegal or missing field name.

Sample workbook attached

Edit: Since posting I added Option Explicit which revealed that the variable ( i ) was not dimmension. Unfortunately the error is not related to that.

Note: This is only a matter of curiousity at this point, so no one need put much time into it. I'm perfectly ok removing the duplicates by manualy using the advanced filter, which allows my original code to work properly and likely will work with this latest code as well if I remove the filtering part.

frank_m
11-09-2011, 04:08 PM
Did some further playing with this.

I found that it's not a coding mistake, as I get the same error when trying to use the advanced filter manualy.

I also removed all the hidden junk and blank cells and cell formating, but to no avail.

Also shortened the lists to 4,000 rows, from 18,000.... no help

Filtering manulay if I filter inplace instead of copying to another column, that does suceed, so my best guess is that it's related to the fact that there are thousands of duplicates and Excel simply has a bug in the advanced filter when copying the results to another column under those circumstances. (I did some googling and found others that experienced this error. The cirscumstances were a little different, but similar in some ways and in some cases the issue was never solved. - Also I found documented issues of this error by Microsoft, although they stated that it was corrected in Excel 2003)

I am able to get rid of the duplicates with some manual manipliation combined with filtering inplace,
so as previously noted, I do not need anyone to spend time on this unless they are curious as I am.

Revised Sample file attached that contains data with the junk, empty and misc cell formating all removed.
Plus I reduced the test data to 4,000 rows

Thanks

GTO
11-09-2011, 07:51 PM
Hi guys,

:giggle Well, hopefully I'm in my home state at the moment...

Here's what I came up with so far. Based on the first attachment at post#1 I believe. It misses 'untrimmed' vals so far, but wanted to see if this is close.

In a Standard Module:

Option Explicit

Sub WhatsMissing()

Dim DIC_A As Object ' Dictionary
Dim DIC_B As Object ' Dictionary
Dim DIC_Tmp As Object ' Dictionary
Dim wksData As Worksheet
Dim rngData As Range
Dim i As Long
Dim aryData As Variant
Dim aryA As Variant
Dim aryB As Variant


Set wksData = ActiveSheet '<---Change to tab or ditch and use codename
'// Attempt to set a reference to the last row in columns 1 and 2 that contains any //
'// data. //
Set rngData = wksData.Range("A:B").Find(What:="*", _
After:=wksData.Cells(1), _
LookIn:=xlValues, _
LookAt:=xlPart, _
SearchOrder:=xlByRows, _
SearchDirection:=xlPrevious)
'// If .Find does not find anything, OR, //
If rngData Is Nothing Then Exit Sub
'// if it only finds a header, bail out. //
If rngData.Row < 2 Then Exit Sub

'// Set a reference to the range containing data, and... //
Set rngData = Range(wksData.Cells(2, 1), wksData.Cells(rngData.Row, 2))

'// ...flip the data into an array. //
aryData = rngData.Value

'// Possibly goofy, but three dictionaries. One for ea column, and an extra to //
'// preserve the unique vals we found in Col 2 for testing (.Exists). //
Set DIC_A = CreateObject("Scripting.Dictionary")
Set DIC_B = CreateObject("Scripting.Dictionary")
Set DIC_Tmp = CreateObject("Scripting.Dictionary")

'// Just use keys, as this eliminates IF test for .Exists and .Add //
For i = 1 To UBound(aryData, 1)
DIC_A.Item(aryData(i, 1)) = Empty
Next
'// rather than testing for empty cells in ea loop above, just ditch the empty key //
'// (if one exists) at end. //
If DIC_A.Exists(vbNullString) Then DIC_A.Remove vbNullString
'// SAA //
For i = 1 To UBound(aryData, 1)
DIC_B.Item(aryData(i, 2)) = Empty
DIC_Tmp.Item(aryData(i, 2)) = Empty
Next

If DIC_B.Exists(vbNullString) Then
DIC_B.Remove vbNullString
DIC_Tmp.Remove vbNullString
End If

'// To enable late binding, dump Keys into arrays. //
aryA = DIC_A.Keys
aryB = DIC_B.Keys

'// Now we use our arrays from .Keys, where we check ea element and see if it exists//
'// in the opposite dictionary; leaving us the missing vals //
For i = 0 To DIC_B.Count - 1
If DIC_A.Exists(aryB(i)) Then DIC_B.Remove (aryB(i))
Next
'// Since Dic_B is no longer good for checking, we use its duplicate dictionary. //
For i = 0 To DIC_A.Count - 1
If DIC_Tmp.Exists(aryA(i)) Then DIC_A.Remove (aryA(i))
Next

'// Re-use aryData, resizing it to fit our output. //
ReDim aryData(0 To Application.Max(DIC_A.Count, DIC_B.Count) - 1, 0 To 1)
'// Likewise re-use, plunking remaining Keys in. //
aryA = DIC_A.Keys
aryB = DIC_B.Keys

'// Fill output array, transposing. //
For i = 0 To DIC_A.Count - 1
aryData(i, 1) = aryA(i)
Next
For i = 0 To DIC_B.Count - 1
aryData(i, 0) = aryB(i)
Next

'// Clear any previous in Cols C & D and plunk in output array. //
With rngData.Offset(, 3)
.Resize(.Parent.Rows.Count - 1).Columns.Clear
'.EntireColumn.Clear
.Rows(1).Resize(UBound(aryData, 1) + 1).Value = aryData
End With

'// All are declared locally, so not "necessary", but for clarity and explicitness, //
'// dump memory use //
Set DIC_A = Nothing
Set DIC_B = Nothing
Set DIC_Tmp = Nothing
aryA = Empty
aryB = Empty
aryData = Empty
End Sub


I did not manually go thru and see if the results are accurate, but it sounds as if you may already have that to compare to. I did test against a much shorter bit of sample data and seems to work.

Mark

frank_m
11-09-2011, 09:17 PM
Hi Mark,

The results are perfect. I can add the trim part to that, from the code shrivallabha so kindly wrote for me, unless you really feel like adding something yourself.

Your code using the Dictionary object runs super fast, which’s helps me realize better why the gentleman at the excel forum used it. (I didn't use a timer, but it seemed like a blink of an eye. (I'd guess 3/4 of a second for 1,8000 rows and many thousands of duplicates and other junk.

Thank you very much. You definitely are in your home state today and a state above. :bow:

Edit: I will need to run this every few months, so it really does save me a lot of time in the long run , as using your version requires less prep work.

GTO
11-09-2011, 09:41 PM
Glad that worked :-)

Compliments of Aflatoon in an ongoing thread, let us try:
'// ...flip the data into an array. //
'aryData = rngData.Value
aryData = Evaluate("INDEX(TRIM(" & rngData.Address(0, 0, , -1) & "),0,0)")


Mark

frank_m
11-09-2011, 09:51 PM
Nice one. - Trimming that way works very well, and FAST. :thumb

Thanks Mark

and Thanks Aflatoon

Edit: and Thanks shrivallabha (http://www.vbaexpress.com/forum/member.php?u=27076), I havn't forgotten you. I very much appreciate your time and code. I certainly learned things that I can put to use latter.

GTO
11-09-2011, 10:00 PM
You are most welcome :-)

frank_m
11-11-2011, 03:01 AM
Edit: Sorry I accidentally posted this question here when I meant to start a new thread.
Could I get a Moderator to move it. _ Thank you

I pieced the code below together by using member Aflatoon's method for trimming a range. However I'm applying it to the range instead of an array as I saw it in his example, and my range is small only columns 1 thru 24 of the selected row.

It seems to be working correctly with limited testing, but I would appreciate advice from a more intelligent and experienced programer here to tell me if anything is wrong with what I have.

Thanks

Private Sub Worksheet_SelectionChange(ByVal Target As Excel.Range)
Dim rng As Range

Set rng = Target.EntireRow.Cells(1).Resize(, 24)
rng = Evaluate("INDEX(TRIM(" & rng.Address(0, 0, , -1) & "),0,0)")

End Sub