PDA

View Full Version : Solved: finding missing entries



vzachin
02-28-2008, 12:18 PM
Hi,

i have data in Col A to Col F.
each column may contain about 20000 entries.
each column within itself will have unique entries and each entry may or may not appear in each column.
i want to identify what's missing in each column and place the data into Col J to Col O.
is this doable?

if the total entries for the 6 columns is less than the maximum allowed (i'm using excel 2003), i can put every item into one column and then do an "Advanced Filter -->unique records only" and then do a V-lookup.

any ideas?

thanks
zach

tstav
02-28-2008, 04:25 PM
Hi zach,
I hope this is what you need

Sub MissingEntries()
'Assumptions:
'1.Your data Sheet is Worksheets(1)
'2.Row 1 contains titles
'3.Entries start from row 2
Dim mySheet As Worksheet
Dim lngRow, lngCol As Long
Dim R, cell As Range
Dim oDict As Object
Dim varKey As Variant
Set mySheet = ActiveWorkbook.Worksheets(1)
'Set the range that contains all the entries
With mySheet
Set R = Intersect(.Columns("A:F"), _
.Range(.Cells(2, 1), .Cells.SpecialCells(xlCellTypeLastCell)))
End With
'Add all unique entries of this Range to a Dictionary Object
'p.s. mdmackillop gave me this idea a few days ago
Set oDict = CreateObject("Scripting.Dictionary")
On Error Resume Next
For Each cell In R.Cells
'Add only non-blank cell values
If cell.Value <> "" Then
oDict.Add cell.Value, cell.Value 'Dict contains keys and items
End If
Next 'cell
If Err Then Err.Clear
On Error GoTo 0
'Write all entries to H column starting from row 2
'If number of entries exceeds Excel2003 rows.count,
'continue on next column
lngRow = 2
lngCol = 8
For Each varKey In oDict.Keys
mySheet.Cells(lngRow, lngCol).Value = oDict.Item(varKey)
lngRow = lngRow + 1
If lngRow = mySheet.Rows.Count Then
lngRow = 2
lngCol = lngCol + 1
End If
Next
'Dispose of the Dictionary object
Set oDict = Nothing
End Sub

vzachin
02-29-2008, 04:30 AM
hi tstav,

that is very interesting coding. it's very close to what i'm after.
i was hoping to avoiding doing v-lookups. is there a way to identify what's missing in each column and place that in columns J to O?

thanks again
zach

tstav
02-29-2008, 06:54 AM
is there a way to identify what's missing in each column and place that in columns J to O?

Here's the whole stuff
Sub MissingEntries2()
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Assumptions:
'1.Your data Sheet is Worksheets(1)
'2.Row 1 contains titles
'3.Entries start from row 2
'Method:
'We will use 2 Dictionary Objects
'One to gother all the unique entries from all data
'One to gother the unique entries from each column of data
'We will cross-check the Dictionary items for the missing entries
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim mySheet As Worksheet
Dim intCol, intCol_ As Integer
Dim lngRow, lngI As Long
Dim R, cell As Range
Dim oDict1, oDict2 As Object
Dim varKey1, varKey2, myArray() As Variant
Set mySheet = ActiveWorkbook.Worksheets(1)
'Set the range that contains all the entries
With mySheet
Set R = Intersect(.Columns("A:F"), _
.Range(.Cells(2, 1), .Cells.SpecialCells(xlCellTypeLastCell)))
End With
'Add all unique entries of this Range to a Dictionary Object
Set oDict1 = CreateObject("Scripting.Dictionary")
On Error Resume Next
For Each cell In R.Cells
'Add only non-blank cell values
If cell.Value <> "" Then
oDict1.Add cell.Value, cell.Value 'Dict contains keys and items
End If
Next 'cell
If Err Then Err.Clear
'Dimension the array according to Dictionary's Items.Count
ReDim myArray(oDict1.Count - 1)
'This is the list of all the unique entries
'just in case you want to do a manual cross-check
myArray = oDict1.Items
lngRow = 2
intCol_ = 8
For lngI = LBound(myArray) To UBound(myArray)
mySheet.Cells(lngRow, intCol_).Value = myArray(lngI)
lngRow = lngRow + 1
Next 'lngI
On Error GoTo 0
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'For each column A through F, store its unique entries to oDict2,
'test them against the entries of myArray,
'and write the missing ones to columns J through O
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Set oDict2 = CreateObject("Scripting.Dictionary")
For intCol = 1 To 6
'Set the range that contains all the entries of this column
With mySheet
Set R = .Range(.Cells(2, intCol), _
.Cells(.Cells(.Rows.Count, intCol).End(xlUp).Row, intCol))
End With
'Add all unique entries of this Range to a Dictionary Object
oDict2.RemoveAll 'Clear the Dictionary
On Error Resume Next
For Each cell In R.Cells
'Add only non-blank cell values
If cell.Value <> "" Then
oDict2.Add cell.Value, cell.Value 'Dict contains keys and items
End If
Next 'cell
If Err Then Err.Clear
On Error GoTo 0
'Populate myArray from original oDict1
myArray = oDict1.Items
'Check each item of oDict2 against items in myArray
'If item is found, change it to blank in myArray
For Each varKey2 In oDict2.Keys
For lngI = LBound(myArray) To UBound(myArray)
If oDict2.Item(varKey2) = myArray(lngI) Then
myArray(lngI) = ""
Exit For
End If
Next 'lngI
Next 'varKey2
'Write all the non-blank items of myArray to column J (through O)
lngRow = 2
For lngI = LBound(myArray) To UBound(myArray)
If myArray(lngI) <> "" Then
mySheet.Cells(lngRow, intCol).Offset(0, 9).Value = myArray(lngI)
lngRow = lngRow + 1
End If
Next 'lngI
Next 'intCol
'Dispose of the Dictionary objects
Set oDict1 = Nothing
Set oDict2 = Nothing
End Sub

tstav
02-29-2008, 07:05 AM
I would've liked to have the code broken into several smaller subs but time is really pressing...

vzachin
02-29-2008, 08:03 PM
hi tstav,

very nice! thanks so much!

zach