PDA

View Full Version : Hightlighting duplicates by comparing multiple columns



dub
06-06-2011, 02:23 PM
Hello all! I have a data sheet with 30000+ rows of data and columns A through AZ. Under the "data" ribbon the "delete duplicate" button is very good and I like that it lets me choose the option of "my data has headers" then I am able to choose the columns that define a row as a duplicate. I need column A B C D E F G and I, or something like that, to match from one row to another to be considered a duplicate. The problem is that I need to only highlight the duplicates at this time and not delete them. If some one has code that will allow the same options as the "delete duplicates" button but will only hightlight them that would be awesome. Or any help on where to start would be good, I am experienced with VBA coding but not a pro. Also if one cell blank but the rest of the columns match, that would be a duplicate.
Thanks!

ntrauger
06-06-2011, 07:46 PM
Sounds like an easy job for the dictionary object. You'll need to check "Microsoft Scripting Runtime" under Tools > References. A possible example: Dim dics() As New Scripting.Dictionary
Dim c As Range
For Each c in Columns("A").UsedRange.Cells
If dics.Exists c.Value Then
c.Interior.ColorIndex = 3
Else
dics.Add c.Value, 1
End If
NextNow, this is untested and is not your solution, but for example purposes, it should highlight in red each cell in column "A" that is a duplicate. It sounds like it will easily meet your needs if you expand it to fit your situation.

There is good information in the help files on the dictionary object, but the skinny on it is that it stores keys and items in pairs, and the keys must be unique. The unique keys part makes it perfect for dealing with duplicates.

On a side note, it also automatically sorts the keys...which may be useful in some situations.

dub
06-07-2011, 09:33 AM
Thanks Nate, I have not dealt with scripting before, but will give it a try. Would I just need to add more if conditions for each column that I want to check?

Chabu
06-07-2011, 12:54 PM
The dictionary is just another class, you can use it to replace the Collection class.

here is a link with info on how to use it http://www.techbookreport.com/tutorials/vba_dictionary.html

Greetings

Chabu
06-07-2011, 12:57 PM
But to be honest, I would use a database. Can't you store your data in Access and pull it back in Excel when you need it?

dub
06-07-2011, 01:18 PM
I have never used Access but my company does have it. Why would you suggest using Access?

Chabu
06-07-2011, 01:49 PM
Because that's what a database does best, managing a large number of records.
A query to retrieve duplicates is trivial in Access and you can integrate it seemlessly using your VBA skills.

I hate to reinvent the wheel (except if just for the sake of it) and prefer to use tools where they are best at.

Greetings

dub
06-07-2011, 02:06 PM
Cool, I will look into that and give it a try,
thanks!