PDA

View Full Version : Remove duplicates from Multiple columns than tally results



Pharlap
08-04-2007, 05:23 PM
Hi Folks of the world VBA community

So from the begining .... here is the CHALLENGE should you choose to help me.
I have searched the many forum solutions for duplicates however the ones I found relate to a single column. However what I would like to be able to do is starting from row 8 is to remove duplicates of a mulitple of columns of a single worksheet exmple attached - namely columns P, R, S, T, and U. Note some cells in each column are blank as per the example attached to this thread, also there may be more than one duplicate in a column and not always right underneath the original -could be anywhere in the column. Also it is important to retain/keep values that have not been duplicated.

To clarify - it would firstly check for duplicates in column P, keeping just one of the rows of the duplicated value and removing entiely the row/rows of where the duplications appeared. And also retaining rows where values in the column have not be duplicated. Then go through each of the other four columns. of the worksheet doing the same for each of the other four columns.

Then after the duplicates are removed from all five columns provide a tally at the bottom of the worksheet of how many times a value now appears in the cells of each of repective columns.

Perhaps it would be good to produce the tally by worker of a total of how many times a values appears in a column that that still remains after rows are removed perhaps output the tally to new worksheet in the same format and retain the original worksheet. see below for more detail.

So far someone has helped me with the first part being the removing the duplicates part but I would like to add to it the other things that I have since thought I would like to also do which are also listed below.

This code works great for removing the duplicates




Sub test()
Dim a, i As Long, txt As String, z As String
Dim x As Integer
Dim ws As Worksheet


Application. ScreenUpdating = False


Set ws = Worksheets("SHEET1")
With CreateObject("Scripting.Dictionary")
.CompareMode = vbTextCompare
For x = 16 To 21
Again:
a = ws.Range("A65536").End(xlUp).Row
For i = 5 To a
z = ws.Cells(i, x).Value
If Not z = "" Then
If Not .exists(z) Then
.Add z, Nothing
Else
txt = txt & "," & Cells(i, 1).Address(0, 0)
If Len(txt) > 245 Then
Range(Mid$(txt, 2)).EntireRow.Delete
txt = "": .RemoveAll: Goto Again
End If
End If
End If
Next
If Len(txt) Then Range(Mid$(txt, 2)).EntireRow.Delete
txt = ""
Next x
End With
End Sub




The other things I would like to be able to acheive is


1. for the code to run on any tab name - that is to say the code auto picks up the tab name it is run on rather than me having to rekey different tab names into the code for different worksheet names. NOTE there is only ever one worksheet in the workbook but each time I extract the data from the system the tab name is different so in the code to remove dups I hve so far I ahve to manually change the worksheet name...I would prefer that the code auto detect that if that is possible.


2. Also I have thought it would be good if the code above could also delete some other rows but these arent duplicates - so perhaps a second piece of code to also delete those other rows where ;


-in column G if in any part of the name string contains word "CLASS" (note the sting can contain a upt to ten words and be up to 70 characters in totol lenght ) and there is "X" in same row in any columns of I, J, K and/or IL - then delete the row so long as NO value is present in columns P, R, S, T, and U on the same row - as I want to keep those rows. In attached the test file but I have added to it some test data for this part of it - there are 6 new rows at record numbers 5 to 11 - of these 6 new rows file are highlighted in blue and one in orange - if the code works well - as also doing what the first piece of code did to remove all duplicates from columns P, R, S, T and U the new code would now also be able delete the rows indicated by blue and leave the orange row - of course in the real files the rows would not be coloured.



3. that AFTER ALL require deletions of rows have been completed the final thing is to provide a tally for each worker in column E - of how many rows they have with values appearing in columns I, J, K, L and P, R, S, T, U - now I do have some code as below to this which some one else helped me previosly with for another workbook -which I think with some modification might be able be intigrated into the code required for this overall exercise however I havent been able to achieve it-what do you think ?




Sub test()
Dim a As Variant
Dim b() As Variant
Dim NewWs As Worksheet
Dim j As Integer, k As Integer, i As Integer, ii As Integer


a = Range(Range("IV" & 7).End(xlToLeft), Range("B" & Rows.Count).End(xlUp)).Value
Redim b(1 To UBound(a, 1), 1 To UBound(a, 2))
With CreateObject("Scripting.Dictionary")
.CompareMode = vbTextCompare
For i = 1 To UBound(a, 1): k = 1 'Reset Column Counter
If Not IsEmpty(a(i, 1)) And Not .exists(a(i, 1)) Then
j = j + 1: .Add a(i, 1), j
b(j, 1) = a(i, 1) ' Add Header
End If
For ii = 5 To UBound(a, 2)
Select Case ii
Case 5, 6, 7, 8, 12, 14, 15, 16, 17
k = k + 1
If i = 1 Then b(1, k) = a(i, ii)
If Not a(i, ii) = "" And Not i = 1 Then b(.Item(a(i, 1)), k) = b(.Item(a(i, 1)), k) + 1
End Select
Next ii
Next i
End With
Set NewWs = Worksheets.Add
NewWs.Name = "WORKER TALLY"
NewWs.Range("A1").Resize(j, k).Value = (b)
With NewWs.Range("A1", NewWs.Range("IV" & 1).End(xlToLeft))
.Font.ColorIndex = 6
.Interior.ColorIndex = 11
.Font.Name = "Arial"
.Font.Bold = True
.WrapText = True
.Rows.AutoFit
End With
NewWs.Columns(1).AutoFit
End Sub




I am only a beginner with VBA so your help would be greatly appreciated, plus the result code of course can be shared on this forum.


Regards


Pharlap :-)

Pharlap
08-05-2007, 12:38 AM
Bumping

brettdj
08-05-2007, 01:35 AM
You can use my addin at http://members.iinet.net.au/~brettdj/ to do this

Application Scope - Range (Column P)
Search Option - Row Search
Output Choices - Delete Duplicates
Delete Duplicate Options - Completely remove Row (not just contents)
then repeat for other columns

Not that if you ran the same settings over column P,R,S,T and U as a single range then only rows with all 5 matching cells in these columns would be deleted

Cheers

Dave

Pharlap
08-05-2007, 01:47 AM
Thanks Dave

I am going to check out your addin
as I am always needing to delete duplicate rows

but I really need a hand now with the second and third part of my problem which I have already the second part - basically it is to also remove a row where in column G of the test file if the appears a sting of test that includes the text "CLASS" and if "X" also appears on same row at any of the four columns I to L so long as no value appears in same rows P, R, S, T, U then delete this row.

SO any ideas Dave ...your thoughts would be greatly appreciated.

Regards :-)
Pharlap (Tony)