PDA

View Full Version : Create an array removing duplicates from list but keeping last occurence



kiltro
04-07-2020, 11:52 AM
Hey everyone,

looking at the attached file what I like to do is this:

Create an array deleting duplicate names and keeping only the last occurance of that name (this array will then be pasted to a new worksheet)

The array must contain the following columns

| NAME | FRUIT | RESULT | (No days)

So for example one row will contain

| William | Watermelon | Watermelon+Orange+Peach |

Artik
04-07-2020, 04:53 PM
1. Copy the table to a new sheet.
2. Sort ascending against column A.
3. Insert the formula into E2 (in italian):
=CONFRONTA([@Name];[Name];1) (or in english)
=MATCH([@Name],[Name],1)
If necessary, copy it down.

4. Insert the formula into F3 (in italian):
=SE(E2<>E3;E2;0) ( in english)
=IF(E2<>E3,E2,0)
If necessary, copy it down.

5. Filter the column F, filter criterion = 0.
6. Delete visible lines.
7. Remove column C and E: F.

Now just write it in VBA. :)

Artik

p45cal
04-08-2020, 03:40 AM
This:
Sub blah()
a = Range("A2:B24").Value
For i = UBound(a) To 1 Step -1
If Not IsEmpty(a(i, 1)) Then
For j = i - 1 To 1 Step -1
Debug.Print i, j
If a(i, 1) = a(j, 1) Then
a(i, 2) = a(i, 2) & "+" & a(j, 2)
a(j, 1) = Empty
End If
Next j
End If
Next i
Results = a
j = 0
For i = 1 To UBound(a)
If Not IsEmpty(a(i, 1)) Then
j = j + 1
Results(j, 1) = a(i, 1)
Results(j, 2) = a(i, 2)
End If
Next i
Range("F2").Resize(j, UBound(a, 2)).Value = Results
End Subresults in:
26299
at cell F2, is that correct?

p45cal
04-08-2020, 04:03 AM
Having just seen Artik's answer, I might have got it wrong.
An alternative to Artik's:

In cell E2 enter a 1
In cell E3 enter a 2
Select E2:E3 and autofill down
Dropdown arrow in column E's header, choose Sort Largest to Smallest
On the Data tab, Data Tools, choose Remove Duplicates
In the dialogue box, have only a single tick against the Name column, click OK
Dropdown arrow in column E's header, choose Sort Smallest to Largest
Delete columns E and Day.

kiltro
04-08-2020, 06:11 AM
Having just seen Artik's answer, I might have got it wrong.
An alternative to Artik's:

In cell E2 enter a 1
In cell E3 enter a 2
Select E2:E3 and autofill down
Dropdown arrow in column E's header, choose Sort Largest to Smallest
On the Data tab, Data Tools, choose Remove Duplicates
In the dialogue box, have only a single tick against the Name column, click OK
Dropdown arrow in column E's header, choose Sort Smallest to Largest
Delete columns E and Day.



Why you say you got it wrong?

Your code seems to work fine to me

p45cal
04-08-2020, 06:44 AM
Why you say you got it wrong?Because my first answer only uses the first 2 columns and produces an array, (creates something like column D from column B) whereas Artik's and my second answer use the results you've already got in column D

kiltro
04-08-2020, 06:59 AM
Because my first answer only uses the first 2 columns and produces an array, (creates something like column D from column B) whereas Artik's and my second answer use the results you've already got in in column D

Got it thanks!

kiltro
04-08-2020, 08:31 AM
I'm trying to translate p45cal last method in VBA but unfortunately it seems RemoveDuplicates doesnt work on mac even if you define a column (Columns:=1) the window where you select the column keeps popout.

Is there a solution around this?

Artik
04-08-2020, 02:19 PM
ActiveSheet.ListObjects(1).Range.RemoveDuplicates Columns:=1, Header:=xlYesIs it working?

Artik

kiltro
04-09-2020, 12:34 AM
ActiveSheet.ListObjects(1).Range.RemoveDuplicates Columns:=1, Header:=xlYesIs it working?

Artik

As I said unfortunately on mac osx it seems it does not, when the code reaches that line it popsout this window:

26310

p45cal
04-09-2020, 05:21 AM
A VBA version of my suggestion in msg#4 could be:
Sub blah()
With ActiveSheet.Range("A1").ListObject
Set newColm = .ListColumns.Add
With newColm.DataBodyRange
.Cells(1) = 1
.DataSeries
End With
.Range.Sort key1:=newColm.Range, order1:=xlDescending, Header:=xlYes
.Range.RemoveDuplicates Columns:=1, Header:=xlYes
.Range.Sort key1:=newColm.Range, order1:=xlAscending, Header:=xlYes
newColm.Delete
.ListColumns("Day").Delete
End With
End Suband this works fine on a PC.
Remove Duplicates in vba seems to be a known problem on the Mac versions of Excel
It's a shame it doesn't work on a Mac because it is fast and efficient.
So we have to grow our own 'remove duplicates' procedure, and while we're at it, remove the duplicates from the top so we don't have to bother sorting:
Sub blah2()
Dim rngToDelete As Range
With ActiveSheet.Range("A1").ListObject '<<<< adjust this so you're looking at the right table.
x = .ListColumns(1).DataBodyRange.Value
For i = 1 To UBound(x)
y = Application.Match(x(i, 1), x, 0)
If y < i Then
x(y, 1) = "¬!"
If rngToDelete Is Nothing Then Set rngToDelete = .ListRows(y).Range Else Set rngToDelete = Union(rngToDelete, .ListRows(y).Range)
End If
Next i
If Not rngToDelete Is Nothing Then rngToDelete.Delete
.ListColumns("Day").Delete
End With
End Sub

However, at this stage, I'd have thought that my snippet in msg#3 is easier and faster.