PDA

View Full Version : Solved: Remove Duplicate Rows



Belch
08-08-2006, 04:14 AM
Hi all,

What I am doing is this:
I import a CSV file to a worksheet "Imported CSV", and then add a header row so I can apply an advanced filter to column F.
I want to apply this filter because I want to only show rows with a unique value in column F.

The filter does this job fine, but I then want to take the rows left showing and put them on another sheet so I can go through each row using a For loop and the Offset command.
e.g.

For n1 = 1 To nTotalRows

sCurrentCode = Range("F1").Offset(n1, 0).Value
...
Next n1


As you can see, using a For loop like this would still select the rows hidden by the advanced filter, which is why I want to have just the rows left showing on their own.

Can anyone help me out with the best way to either:

Remove duplicate rows completely (i.e. not just hide them); or
Use a loop to go through rows that are showing and ignore hidden ones (this is not really efficient though as there are 10000 rows, but only 3000 when filtered so the For loop would be much bigger if it checked every rows 'hidden' property); or finally
Copy the visible rows to another sheet - I have tried this with the ActiveSheets.Paste and Range("A1").PasteSpecial commands - the ActiveSheets one seems to paste the values but then throws a run-time error everytime.I am sure there must be a simple way to do this but I can't seem to get to it!
Any advice is appreciated, thanks.

Belch
08-08-2006, 04:23 AM
Just after I posted this thread I noticed the "Copy to other location" option when applying the filter - the only thing is this just copies column F and none of the other columns.
The problem with this is I need the data in the other columns filtered as well...

mdmackillop
08-08-2006, 04:38 AM
Hi Belch,
A useful tool here is the GoTo/SpecialCells/Visible cells. You can then easily mark the visible cells in a column and delete the unmarked rows

Belch
08-08-2006, 07:03 AM
Thanks md,
I've tried that as you said, it selects the visible (unique) rows fine, but then how can I delete the hidden rows so all I am left with is unique ones?

I see how I could go through each row and if it's visible property is false then delete it, but I have 10,000+ rows and want to avoid looping through each one.
On that note, is there any way to do the above globally in one go, to all hidden rows?

I have also attached a smaller example of the data I am using (there are only 500 rows on this one).
I have commented out a routine which has already been run - it has already imported the CSV file and applied the advanced filter to hide duplicate rows.
If you step through the "test" routine and watch the two variables s1 and n1 you will see that by using a For loop and the Offset command it still picks up the hidden rows, which is why I am trying to delete them.

Thanks,

mdmackillop
08-08-2006, 07:39 AM
To avoid loops try something like the following.
Assumes your filter is in place

Sub DelRows()
Dim tmp As Range
Set tmp = Intersect(Columns(10), ActiveSheet.UsedRange)
tmp.SpecialCells(xlCellTypeVisible).Value = "x"
tmp.SpecialCells(xlCellTypeBlanks).EntireRow.Delete
tmp.ClearContents
Set tmp = Nothing
End Sub

Belch
08-09-2006, 03:55 AM
I tried adding that code you provided after I had applied the advanced filter, but when it reaches the EntireRow.Delete line it says no cells are selected.
I added the tmp variable to the watch window and the column and row values are 10 and 1 respectively. I looked up an explanation of the Intersect method but didn't really understand it or how it applies to what i'm trying to do.

Also, it appears that the code you gave me changes the value of the visible cells to x, but I need to keep the original values once the hidden rows have been deleted.

mdmackillop
08-09-2006, 04:38 AM
The bug is that the column may not be in your usedrange.
Try this variation
Sub DelRows()
Dim tmp As Range, Rw As Long, Col As Long
Col = Selection.SpecialCells(xlCellTypeLastCell).Column() + 1
Rw = Selection.SpecialCells(xlCellTypeLastCell).Row()
Set tmp = Range(Cells(1, Col), Cells(Rw, Col))
tmp.SpecialCells(xlCellTypeVisible).Value = "x"
tmp.SpecialCells(xlCellTypeBlanks).EntireRow.Delete
tmp.ClearContents
Set tmp = Nothing
End Sub

mdmackillop
08-09-2006, 04:53 AM
One further thought. There may be a limit on the number of non-contiguous rows Excel can delete with this method, in which case a sort on the indexed row may be required, followed by a return to the original order. Let us know how you get on.

Belch
08-09-2006, 05:56 AM
md,

I gave your updated code a try and it seems to do the job perfectly.
I'm still not 100% on how it does it, but the important thing is that it does!

Thanks for the help,

mdmackillop
08-09-2006, 06:01 AM
I'm still not 100% on how it does it, but the important thing is that it does!

When you puzzle it out, you'll know more than you did when you came here!
Glad it worked out.
Regards
MD