Delete row based on Data in two Columns [Archive]

View Full Version : Delete row based on Data in two Columns

stanbottorff

07-30-2007, 04:23 PM

I need help writing a macro that will delete all duplicate rows, based on data in column A and column B. I have (I am using Windows XP and Excel 2002)

Column A = Bowler One (Bowler One and Bowler Two make one doubles Team)
Column B = Bowler Two
Column C = Total Score

A Team can only cash once with the same two Bowlers. I usually have between 500 and 800 Teams and one Team (same two bowlers) may have bowled 10 or more times.

It can look like this

Column A Column B Column C

John Smith Mary Smith 1240
Mike Jones Mary Jones 1238
John Smith Mary Jones 1230
John Smith Mary Smith 1229 (Duplicates can be like this)
Mary Smith John Smith 1220 (Or They can be like this)
Phil Johnson Mike Miller

malik641

07-30-2007, 04:46 PM

Hi stanbottorf, welcome to VBAX!

A quick question for you. How will you know which score to keep? The lowest? The highest? Or does it not matter?

mikerickson

07-30-2007, 05:31 PM

Is the first line (John Smith Mary Smith 1240) a duplicate of the hightlited lines 4 and 5?

GreenTree

07-30-2007, 09:01 PM

Three steps come to mind. First, on each row, check if the name in Col A is > the name in Col B (i.e. alphabetically after). If so, swap the names. This way, every instance of John Smith / Mary Smith will show up with John's name first.

Second step, sort the whole thing on Col A first, Col B second, and Col C descending last. This will put all rows with Abbey Anderson as the first bowler at the top, followed by rows where Adam Smith is first, followed by Allen Whoever, etc. (note that you don't actually care that the alpha order is on the first name; what matters is that all games with the same pair of bowlers will show up one after the other, in highest to lowes score order).

Third step: step through the rows to delete lower-scoring duplicates, something like this:
For I = LastRow to (FirstRow - 1) step - 1
If Cells(I,1) = Cells(I-1,1) and Cells(I,2) = Cells(I-1,2) then
'delete the Ith row, shifting up the rows below it
' since you already know that the lower-numbered row is the higher scoring one, because of the sort
End if
Next I

When Col A and Col B are the same, your sort goes to Col C, which places the higher scoring row above (i.e. lower row number) the lower scoring row. That way, working your way up from the bottom in the loop and deleting the lower score (higher row number) of any duplicate will leave you only with rows where at least one of the names differs from the row above & below, which is, I think, what you're looking for.

That make sense?

G.T.