Consulting

Page 1 of 2 1 2 LastLast
Results 1 to 20 of 24

Thread: Removing duplicates from multiple columns

  1. #1

    Removing duplicates from multiple columns

    I have a whole bunch of columns, 100+, and each one has many rows of data in it. (Numbers or text or maybe a combination of the two.) I want to delete the duplicate data, but I DON'T want to delete the rows. If you delete an entire row, that would delete data from other columns.
    I need a macro that would look at each column independently. I want it to looks at the data in the first cell and then compare it to all of the data in that column in the rows below it. If it finds a match, it deletes that match. It then continues with the next item in that same column. It does this for every item I have in that column.
    Then, once it is done with the last row in that column, a bunch of empty cells are left . The macro would then continue and sort the data, putting the empty cells below. The end result is that all of the duplicates are removed (deleted)
    The macro then continues until each of the 100+ columns has been checked.
    then transposes the data on a separate sheet
    A generic code will be better so i can use it for few or many columns (totally new at writing code).
    Is this possible?

  2. #2
    Microsoft Word MVP 2003-2009 VBAX Guru gmaxey's Avatar
    Joined
    Sep 2005
    Posts
    3,340
    Location
    I'm going to have to assume that the data in adjacent columns are completely unrelated to each other.

    Sub NoDupColumnData()
    Dim oSheet As Worksheet
    Dim oRng As Range
    Dim varVals As Variant
      ThisWorkbook.Sheets(1).UsedRange.Copy 'Whatever sheet has your 400 columns
      Set oSheet = ThisWorkbook.Sheets.Add
      oSheet.Name = "Filtered"
      oSheet.Paste
      Application.ScreenUpdating = False
      For Each oRng In oSheet.UsedRange.Columns
        varVals = fcnUnique(oRng)
        oRng.Clear
        oRng.Resize(UBound(varVals), 1).Value = WorksheetFunction.Transpose(varVals)
      Next oRng
      Application.ScreenUpdating = True
    lbl_Exit:
      Exit Sub
    End Sub
    Public Function fcnUnique(oColRng As Range) As Variant
    Dim colUniques As New Collection
    Dim oCell As Range
    Dim lngIndex As Long
    Dim vUnique As Variant
      On Error Resume Next
      For Each oCell In oColRng.Cells
        If Len(CStr(oCell)) > 0 Then
          colUniques.Add oCell, CStr(oCell)
        End If
        If Err.Number <> 0 Then
          'colUniques.Add "", CStr(lngIndex)
          'lngIndex = lngIndex + 1
        End If
      Next oCell
      On Error GoTo 0
      ReDim vUnique(1 To colUniques.Count)
      For lngIndex = LBound(vUnique) To UBound(vUnique)
        vUnique(lngIndex) = colUniques(lngIndex)
      Next lngIndex
      fcnUnique = vUnique
    lbl_Exit:
      Exit Function
    End Function
    Last edited by gmaxey; 01-15-2017 at 07:55 AM.
    Greg

    Visit my website: http://gregmaxey.com

  3. #3
    SEE SHEET 2.
    Attached Files Attached Files
    Last edited by grose456; 01-14-2017 at 07:34 PM. Reason: NEW MESSAGE

  4. #4
    VBAX Regular
    Joined
    Dec 2016
    Posts
    29
    Location
    How about this

    Sub abc()
     Dim arr, iRow As Long, iCol As Long, i As Long
     
     arr = Range("a1").CurrentRegion
     
     Worksheets.Add: i = 1
     With CreateObject("scripting.dictionary")
        For iCol = 1 To UBound(arr, 2)
            For iRow = 1 To UBound(arr)
                If Not .exists(arr(iRow, iCol)) And Trim$(arr(iRow, iCol)) <> "" Then
                    .Item(arr(iRow, iCol)) = Empty
                End If
            Next
            Cells(i, "a").Resize(, .Count) = .keys
            .RemoveAll: i = i + 1
        Next
     End With
     Cells(1).CurrentRegion.Borders.LineStyle = xlContinuous
    End Sub

  5. #5
    SORRY, EDITED MY PREVIOUS REPLY INSTEAD OF POSTING A NEW REPLY. SEE ABOVE ATTACHMENT. Visited and read your website. You have a very impressive background.

  6. #6
    VBAX Regular
    Joined
    Dec 2016
    Posts
    29
    Location
    Does my solution in Post#4 not do what you want?

  7. #7
    Mac Moderator VBAX Guru mikerickson's Avatar
    Joined
    May 2007
    Location
    Davis CA
    Posts
    2,778
    Perhaps this will do what you want
    Sub test()
        Dim arrInput As Variant
        Dim colNum As Long, rowNum As Long
        Dim destinationRange As Range
        
        Set destinationRange = Sheet3.Range("A1")
        arrInput = Sheet1.Range("A1").CurrentRegion
        
        colNum = 1
        For colNum = 1 To UBound(arrInput, 2)
        For rowNum = 1 To UBound(arrInput, 1)
            If arrInput(rowNum, colNum) <> vbNullString Then
                If rowNum <> Application.Match(arrInput(rowNum, colNum), Application.Index(arrInput, 0, colNum), 0) Then
                    arrInput(rowNum, colNum) = vbNullString
                End If
            End If
        Next rowNum
        Next colNum
        
        Application.ScreenUpdating = False
        With destinationRange
            With .Resize(UBound(arrInput, 2), UBound(arrInput, 1))
                .Value = Application.Transpose(arrInput)
                .SpecialCells(xlCellTypeBlanks).Delete shift:=xlToLeft
            End With
        End With
        Application.ScreenUpdating = True
    End Sub
    Attached Files Attached Files

  8. #8
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,728
    Location
    Maybe something like this

    Some cells had a leading space so I had to TRIM them also to make certain that ZH9 and spaceZH9 would be de-dupped




    Option Explicit
    
    Sub ColumnDeDup()
        Dim rCol As Range
        Dim wsInput As Worksheet, wsOutput As Worksheet
    
        Application.ScreenUpdating = False
    
        Set wsInput = Worksheets("Input")
    
        With wsInput.Cells(1, 1).CurrentRegion
            
            'trim
            .Value = Application.Evaluate("IF(" & .Address & "<>"""",TRIM(" & .Address & "),"""")")
            
            'remove empty cells
            On Error Resume Next
            .Cells.SpecialCells(xlCellTypeBlanks).Delete Shift:=xlUp
            On Error GoTo 0
    
            'remove dups in each col
            For Each rCol In .Columns
                Call rCol.RemoveDuplicates(1, xlYes)
            Next
        End With
        
        
        'add new worksheet named 'Output'
        Application.DisplayAlerts = False
        On Error Resume Next
        Worksheets("Output").Delete
        On Error GoTo 0
        Application.DisplayAlerts = True
        Worksheets.Add.Name = "Output"
        
        Set wsOutput = Worksheets("Output")
        
        'copy and transpose to Output
        wsInput.Cells(1, 1).CurrentRegion.Copy
        wsOutput.Select
        wsOutput.Cells(1, 1).Select
        Selection.PasteSpecial Paste:=xlPasteAll, Operation:=xlNone, SkipBlanks:=False, Transpose:=True
        
        
        'make Output pretty
        wsOutput.Cells(1, 1).CurrentRegion.EntireColumn.AutoFit
        
        
        Application.ScreenUpdating = True
    End Sub
    Attached Files Attached Files
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  9. #9
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,645
    All you need:

    Sub M_snb()
        Application.ScreenUpdating = False
        Sheet3.UsedRange.ClearContents
        
        For j = 1 To Sheet1.Cells(1).CurrentRegion.Columns.Count
           Sheet1.Cells(1).CurrentRegion.Columns(j).AdvancedFilter 2, , Sheet3.Cells(1, j), True
        Next
        Sheet3.UsedRange.SpecialCells(4).Delete
        sn = Application.Transpose(Sheet3.UsedRange)
        
        Sheet3.UsedRange.ClearContents
        
        Sheet3.Cells(1).Resize(UBound(sn), UBound(sn, 2)) = sn
    End Sub

  10. #10
    gives me:run time error 424

  11. #11
    hi Paul.
    i need the duplicate values to be gone, only the unique value to remain.

  12. #12
    Sorry Mike7952,
    new at this so thought i was still replying to gmaxey. i want the duplicate values to go and only teh unique value to remain. see my attachment sheet one above.

  13. #13
    VBAX Contributor
    Joined
    Dec 2009
    Location
    Sevastopol
    Posts
    150
    Location
    In the attached workbook click the Run button on Sheet1 to create a list of required unique values in Sheet2
    The code:
    Sub ExtractUnique()
    'ZVI:2017-01-16
     
      '--> Settings, change to suit
      Const Src = "Sheet1"  ' Name of the source sheet
      Const Dst = "Sheet2"  ' Name of the destination sheet
      ' <-- End of settingw
     
      Dim a
      Dim Col As Range, Rng As Range
      Dim i As Long, j As Long
      Dim s As String
     
      ' Define source data range taking into the account possible empty rows among data rows
      With Worksheets(Src)
        Set Rng = Intersect(.UsedRange, .UsedRange.Cells(1).CurrentRegion.EntireColumn)
      End With
      
      ' Clear the destination sheet
      Sheets("Sheet2").UsedRange.ClearContents
     
      ' Main
      With CreateObject("Scripting.Dictionary")
        .CompareMode = 1
        ' We need to go column by column to prevent overflow of amount values in a()
        For Each Col In Rng.Columns
          a = Col.Cells.Value
          For i = 1 To UBound(a)
            s = Trim(a(i, 1))
            If Len(s) Then .Item(s) = vbNullString
          Next
          If .Count Then
            j = j + 1
            a = .Keys
            Sheets(Dst).Cells(j, 1).Resize(, UBound(a)).Value = a
          End If
          .RemoveAll
        Next
      End With
     
      ' Fit widths and activate the destination sheet
      Sheets(Dst).UsedRange.Columns.AutoFit
      Sheets(Dst).Select
     
    End Sub
    Attached Files Attached Files

  14. #14
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,645
    Quote Originally Posted by grose456 View Post
    gives me:run time error 424
    Do not use VBA-code you don't understand.

    Of course you have to add a sheet3 first (see attachment)

    or use
    Sub M_snb()
      sn = Sheet1.Cells(1).CurrentRegion
       
      With CreateObject("scripting.dictionary")
        For jj = 1 To UBound(sn, 2)
          For j = 1 To UBound(sn)
            If .exists(sn(j, jj)) Or sn(j, jj) = "" Then
              sn(j, jj) = ""
            Else
              x0 = .Item(sn(j, jj))
              sn(.Count, jj) = sn(j, jj)
            End If
          Next
          .RemoveAll
        Next
        Sheet2.Cells(1).Resize(UBound(sn, 2), UBound(sn)) = Application.Transpose(sn)
      End With
    End Sub
    Attached Files Attached Files
    Last edited by snb; 01-16-2017 at 01:41 AM.

  15. #15
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,728
    Location
    Quote Originally Posted by grose456 View Post
    hi Paul.
    i need the duplicate values to be gone, only the unique value to remain.
    I thought they were

    Can you provide an example?

    In post #8 with the macro and the attachment, 'Original' is your data, 'Input' is a copy I made of 'Original' for testing

    When I ran the macro on 'Input' it appeared to remove the duplicates on a column-by-column basis

    Capture.JPG

    'Output' is the transpose you asked for using 'input' as an input
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  16. #16
    VBAX Regular
    Joined
    Dec 2016
    Posts
    29
    Location
    Paul
    I believe I have a solution in post#4 also based on a column to column basis.

    Sub abc()
        Dim arr, iRow As Long, iCol As Long, i As Long
         
        arr = Range("a1").CurrentRegion
         
        Worksheets.Add: i = 1
        With CreateObject("scripting.dictionary")
            For iCol = 1 To UBound(arr, 2)
                For iRow = 1 To UBound(arr)
                    If Not .exists(arr(iRow, iCol)) And Trim$(arr(iRow, iCol)) <> "" Then
                        .Item(arr(iRow, iCol)) = Empty
                    End If
                Next
                Cells(i, UBound(arr, 2) + 3).Resize(, .Count) = .keys
                Cells(1, iCol).Resize(.Count) = Application.Transpose(.keys)
                .RemoveAll: i = i + 1
            Next
        End With
        Cells(1).CurrentRegion.Borders.LineStyle = xlContinuous
        Cells(1, UBound(arr, 2) + 5).CurrentRegion.Borders.LineStyle = xlContinuous
    End Sub

  17. #17
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,728
    Location
    Well, since Excel provides a built in .RemoveDuplicates method, I figure it's easier and faster to just use it instead of rolling my own
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  18. #18
    i went back and checked the file. when i say i want to delete duplicates i want only the unique value to remailn. for example in the attachment "original" the pink colored cell are duplicates which i need to disppear and only the green value to remain.

  19. #19
    i would have done that except i need it for one column at a time. done independently of each other. i tried doing it one column at a time but after 30 columns i figured there had to be an easier way.

  20. #20
    see post #3

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •