Consulting

Results 1 to 12 of 12

Thread: How to remove duplicates within duplication sets based on criteria?

  1. #1

    How to remove duplicates within duplication sets based on criteria?

    I want to know how to filter the duplicate sets in the "m/z" column [highlighted yellow so that I can easily differentiate between the different duplication sets] based on:

    Keep the row within the duplication set if (TO BE FOLLOWED IN CHRONOLOGICAL ORDER):
    1.) N/O value <= 2 (denoted in green)
    2.) (N+S)/O value <= 1 (also denoted in green)
    3.) keep the row with the highest H/C value that is <= 2.25

    What I need to do with these tests (or rules) is use them to compare each duplicate row within the duplication set to find one "Last man standing."

    For example: look at duplication set of row 35, 36, and 37. All three pass the N/O test, so all three of the rows advance to the 2nd test. Now, row 35 and 36 fail the 2nd test due to being bigger than 1, and row 37 passes because it is <= 1; meaning that row 35 & 36 will get deleted and row 37 will stay as the "last man standing."

    You see, I can't just filter every row that fails a test within the entire workbook because some rows that fail a test may still be the "winner" because it passed the most tests compared to the others within its duplication set.

    For Example: look at row 18, 19, and 20. Because row 18 failed the N/O test, it gets deleted. But, 19 and 20 pass so they both move on to the (N+S)/O test. Since row 19 failed the 2nd test and row 20 passed, that makes it the only row left, and therefore, it is declared the "winner" and stays as the "unique" duplicate. However, notice that row 20 failed the 3rd test with having a value greater than 2.25. This is ignored because row 20 beat row 18 and 19 before it could "compete" in the 3rd test.

    So if I were to just delete all of the rows that failed a test, row 20 would get deleted too, even though it was supposed to stay as the "unique" duplicate because it was the "last man standing."

    In addition, if the duplications WITHIN the duplication set fails a test simultaneously to whereas there is not a single "winner", then delete every row WITHIN that duplication set.

    Example 1: Take a look at row 4, 5, and 6. All three pass the first test, so all three advance on to the second test; now all three fail the second test so all three get deleted and there is no "winner."

    Example 2: Take a look at row 13, 14, and 15. Row 13 fails the first test, so its automatically eliminated. Row 14 and 15 pass the first test so they advance to the 2nd test; however, row 14 and 15 both fail the second test simultaneously and both get deleted, therefore, there is not a single "winner" for this duplication set.

    Im thinking this has to be achieved through some type of VBA code, but any suggestions are greatly appreciated, thanks!



    Moderator's Note: To all, Please read the entire thread before answering. This is a Merged thread and there is additional information of the OP's situation below.
    Attached Files Attached Files
    Last edited by SamT; 11-25-2016 at 12:23 PM.

  2. #2
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,730
    Location
    Who is the winner here and why?

    Capture.JPG
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  3. #3
    Moderator VBAX Sage SamT's Avatar
    Joined
    Oct 2006
    Location
    Near Columbia
    Posts
    7,814
    Location
    Two Rows pass all tests, and have the same h/c . Then what?

    @ Paul
    Row 39 = Last man standing. Yes-No?

    Proposed Algorithm:
    For each DuplicatesSet
    For Each Test
    If FailedRows.Count = Set.Count then
    Delete Set
    Exit For
    Else
    Delete FailedRows
    End If
    If Set.Count = 1 Then Exit For
    Next Test
    If Set.Count > 1 then Do What 'Two Rows with same h/c 
    Next Set

    @ maupinsmason,
    Please verify that the logic in the proposed algorithm matches your needs.
    I expect the student to do their homework and find all the errrors I leeve in.


    Please take the time to read the Forum FAQ

  4. #4
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,730
    Location
    Play with this and let me know

    The attachment has the code and the results


    Option Explicit
    
    Dim rData As Range
    
    Sub DeleteSomeRows()
        Dim rData1 As Range
        Dim iRow As Long, iStart As Long, iEnd As Long
        Dim aCounts() As Long, aChecks() As Long
        
        
        'setup
        Application.ScreenUpdating = False
        Set rData = ActiveSheet.Cells(1, 1).CurrentRegion
        Set rData1 = rData.Cells(2, 1).Resize(rData.Rows.Count - 1, rData.Columns.Count)
        With ActiveSheet.Sort
            .SortFields.Clear
            .SortFields.Add Key:=rData1.Columns(1), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
            .SortFields.Add Key:=rData1.Columns(4), SortOn:=xlSortOnValues, Order:=xlDescending, DataOption:=xlSortNormal
            .SetRange rData
            .Header = xlYes
            .MatchCase = False
            .Orientation = xlTopToBottom
            .SortMethod = xlPinYin
            .Apply
        End With
        
        'pass #1 - for a m/z, (all col 2 outside) OR (all col 3 outside) OR (all col 4 outside) or (any col 2 outside)
        iStart = 2
        iEnd = iStart
        With rData
            For iRow = 2 To .Rows.Count
                If .Cells(iRow + 1, 1).Value = .Cells(iRow, 1).Value Then
                    iEnd = iEnd + 1
                Else
                    Call CheckData1(iStart, iEnd)
                    iStart = iRow + 1
                    iEnd = iStart
                End If
            Next iRow
        End With
        On Error Resume Next
        rData.Columns(1).SpecialCells(xlCellTypeConstants, xlLogical).EntireRow.Delete
        On Error GoTo 0
        
        
        'pass #2 - for a m/z with 2+ rows, (ANY col 3 outside)
        iStart = 2
        iEnd = iStart
        With rData
            For iRow = 2 To .Rows.Count
                If .Cells(iRow + 1, 1).Value = .Cells(iRow, 1).Value Then
                    iEnd = iEnd + 1
                Else
                    Call CheckData2(iStart, iEnd)
                    iStart = iRow + 1
                    iEnd = iStart
                End If
            Next iRow
        End With
        On Error Resume Next
        rData.Columns(1).SpecialCells(xlCellTypeConstants, xlLogical).EntireRow.Delete
        On Error GoTo 0
        
        
        
        'pass #3 - for a m/z with 2+ rows, (ANY col 4 outside)
        iStart = 2
        iEnd = iStart
        With rData
            For iRow = 2 To .Rows.Count
                If .Cells(iRow + 1, 1).Value = .Cells(iRow, 1).Value Then
                    iEnd = iEnd + 1
                Else
                    Call CheckData3(iStart, iEnd)
                    iStart = iRow + 1
                    iEnd = iStart
                End If
            Next iRow
        End With
        On Error Resume Next
        rData.Columns(1).SpecialCells(xlCellTypeConstants, xlLogical).EntireRow.Delete
        On Error GoTo 0
        
        
        'pass #4 - for a m/z with 2+ rows, (largest col 4)
        iStart = 2
        iEnd = iStart
        With rData
            For iRow = 2 To .Rows.Count
                If .Cells(iRow + 1, 1).Value = .Cells(iRow, 1).Value Then
                    iEnd = iEnd + 1
                Else
                    Call CheckData4(iStart, iEnd)
                    iStart = iRow + 1
                    iEnd = iStart
                End If
            Next iRow
        End With
        On Error Resume Next
        rData.Columns(1).SpecialCells(xlCellTypeConstants, xlLogical).EntireRow.Delete
        On Error GoTo 0
        
        Application.ScreenUpdating = True
    End Sub
        
        
        
    Private Sub CheckData1(S As Long, E As Long)
        Dim i As Long
        Dim bAllFail As Boolean
        
        With rData
            'first test - all col 2 outside
            bAllFail = True
            For i = S To E
                If .Cells(i, 2).Value <= 2# Then bAllFail = False
            Next i
                    
            If bAllFail Then
                For i = S To E
                    .Cells(i, 1).Value = True
                Next i
                Exit Sub
            End If
            
            'second test - all col 3 outside
            bAllFail = True
            For i = S To E
                If .Cells(i, 3).Value <= 1# Then bAllFail = False
            Next i
                    
            If bAllFail Then
                For i = S To E
                    .Cells(i, 1).Value = True
                Next i
                Exit Sub
            End If
            
            
            'third test  - all col 4 outside
            bAllFail = True
            For i = S To E
                If .Cells(i, 4).Value <= 2.25 Then bAllFail = False
            Next i
                    
            If bAllFail Then
                For i = S To E
                    .Cells(i, 1).Value = True
                Next i
                Exit Sub
            End If
            
            'fourth test  - ANY col 2 outside
            For i = S To E
                If .Cells(i, 2).Value > 2# Then .Cells(i, 1).Value = True
            Next i
        
        End With
    End Sub
        
        
    Private Sub CheckData2(S As Long, E As Long)
        Dim i As Long
        
        If S = E Then Exit Sub
        
        With rData
            'any col 3 outside
            For i = S To E
                If .Cells(i, 3).Value > 1# Then .Cells(i, 1).Value = True
            Next i
        End With
    End Sub
        
    Private Sub CheckData3(S As Long, E As Long)
        Dim i As Long
        
        If S = E Then Exit Sub
        
        With rData
            'any col 4 outside
            For i = S To E
                If .Cells(i, 4).Value > 2.25 Then .Cells(i, 1).Value = True
            Next i
        End With
    End Sub
        
    Private Sub CheckData4(S As Long, E As Long)
        Dim i As Long
        Dim m As Double
        
        If S = E Then Exit Sub
        
        With rData
            m = 0#
            For i = S To E
                If .Cells(i, 4).Value > m Then m = .Cells(i, 4).Value
            Next i
            For i = S To E
                If .Cells(i, 4).Value < m Then .Cells(i, 1).Value = True
            Next i
        
        End With
    End Sub
    Attached Files Attached Files
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  5. #5

    How to remove duplicates within duplication sets based on criteria?

    I want to know how to filter the duplicate sets in the "m/z" column [highlighted yellow so that I can easily differentiate between the different duplication sets] based on:

    Keep the row within the duplication set if (TO BE FOLLOWED IN CHRONOLOGICAL ORDER):
    1.) N/O value <= 2 (denoted in green)
    2.) (N+S)/O value <= 1 (also denoted in green)
    3.) keep the row with the highest H/C value that is <= 2.25

    What I need to do with these tests (or rules) is use them to compare each duplicate row within the duplication set to find one "Last man standing."

    For example: look at duplication set of row 36, 37, and 38. All three pass the N/O test, so all three of the rows advance to the 2nd test. Now, row 36 and 37 fail the 2nd test due to being bigger than 1, and row 38 passes because it is <= 1; meaning that row 36 & 37 will get deleted and row 38 will stay as the "last man standing."

    You see, I can't just filter every row that fails a test within the entire workbook because some rows that fail a test may still be the "winner" because it passed the most tests compared to the others within its duplication set.

    For Example: look at row 22, 23, and 24. Because 22, 23 and 24 all pass the N/O test, they advance to the (N+S)/O test. Since row 22 & 23 failed; and row 24 passed, that makes row 24 the only row left, and therefore, it is declared the “last man standing” and stays as the "unique duplicate.” However, notice that the winner row 24 failed the 3rd test with having a value greater than 2.25. This is ignored because row 24 beat row 22 and 23 before it could "compete" in the 3rd test.

    So, if I were to just delete all of the rows that failed a test, row 24 would get deleted too, even though it was supposed to stay as the "unique duplicate” because it was the "last man standing."

    In addition, if the duplications WITHIN the duplication set fails a test simultaneously to whereas there is not a single "winner", then delete every row WITHIN that duplication set.

    Example 1: Take a look at row 18 and 19. Both of them pass the first test, so both of them advance on to the second test. Now both of them fail the second test so row 18 and 19 get deleted and there is no "winner."

    Example 2: Take a look at row 4, 5, and 6. Row 4 fails the first test, so row 4 automatically gets eliminated. Row 5 and 6 pass the first test so they advance to the 2nd test; however, row 5 and 6 both fail the second test simultaneously and both get deleted, therefore, there is not a single "winner" for this duplication set.

    Moreover, there will be some duplicates within the duplication sets that will pass all three tests, and wind up with the same H/C value.

    So, in the event of this happening, I need to put them through 2 Tiebreakers (Also needs to be followed in chronological order):

    Tiebreakers:

    1.) If the value within the "P" column of the duplicates is a non-zero digit, then the duplicate must contain a H/C value within the range: 1.5 <= # <= 2.25 (if the duplicate has a zero in the “P” column, then void the tiebreaker #1 and move straight to tiebreaker #2).


    For Example: Take a look at the duplication set of row 29 and 30. Both of them pass the 1st, 2nd and 3rd, while still having the same H/C value, but both of them have a zero in the “P” column, so they will not be tested with the Tiebreaker #1. Instead they will jump straight to Tiebreaker #2:

    2.) Keep the duplicate that has the value closest to zero within the "stddev2" column.

    For Example: Using the same duplication set of row 29 and 30, take a look at the “stddev2” column. Because row 29 has -0.8 and row 30 has a “stddev2” value of 1, row 29 will be declared the “winner.” This is because -0.8 is closer to zero than 1 is.

    However, if there are still duplicates that passed tiebreaker #1 or jumped straight to tiebreaker #2, and they have the same "stddev2" value as well, then completely delete all of the duplicates within that duplication set, and move on to the next set.

    I’m thinking this has to be achieved through some type of VBA code, but any suggestions are greatly appreciated, thanks!
    Attached Files Attached Files

  6. #6
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,730
    Location
    I added the new TieBreaker logic to my macro in the previous post you started

    I think the logic is what you asked for

    Let me know

    The attachment has the macro and the Results




    Option Explicit
    Const cMZ As Long = 2
    Const cP As Long = 10
    Const cStdDev2 As Long = 12
    Const cNO As Long = 14
    Const cNSO As Long = 15
    Const cHC As Long = 16
    Dim rData As Range
    Sub DeleteSomeRows()
        Dim rData1 As Range
        Dim iRow As Long, iStart As Long, iEnd As Long
        Dim aCounts() As Long, aChecks() As Long
        
        
        'setup
        Application.ScreenUpdating = False
        Set rData = ActiveSheet.Cells(1, 1).CurrentRegion
        Set rData1 = rData.Cells(2, 1).Resize(rData.Rows.Count - 1, rData.Columns.Count)
        Application.StatusBar = "Sorting"
        With ActiveSheet.Sort
            .SortFields.Clear
            .SortFields.Add Key:=rData1.Columns(cMZ), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
            .SortFields.Add Key:=rData1.Columns(cHC), SortOn:=xlSortOnValues, Order:=xlDescending, DataOption:=xlSortNormal
            .SetRange rData
            .Header = xlYes
            .MatchCase = False
            .Orientation = xlTopToBottom
            .SortMethod = xlPinYin
            .Apply
        End With
        
        'pass #1 - for a m/z, (all col 2 outside) OR (all col 3 outside) OR (all col 4 outside) or (any col 2 outside)
        iStart = 2
        iEnd = iStart
        With rData
            For iRow = 2 To .Rows.Count
                Application.StatusBar = "Pass #1 - Row " & iRow
                If .Cells(iRow + 1, cMZ).Value = .Cells(iRow, cMZ).Value Then
                    iEnd = iEnd + 1
                Else
                    Call CheckData1(iStart, iEnd)
                    iStart = iRow + 1
                    iEnd = iStart
                End If
            Next iRow
        End With
        On Error Resume Next
        rData.Columns(cMZ).SpecialCells(xlCellTypeConstants, xlLogical).EntireRow.Delete
        On Error GoTo 0
        
        
        'pass #2 - for a m/z with 2+ rows, (ANY col 3 outside)
        iStart = 2
        iEnd = iStart
        With rData
            For iRow = 2 To .Rows.Count
                Application.StatusBar = "Pass #2 - Row " & iRow
                If .Cells(iRow + 1, cMZ).Value = .Cells(iRow, cMZ).Value Then
                    iEnd = iEnd + 1
                Else
                    Call CheckData2(iStart, iEnd)
                    iStart = iRow + 1
                    iEnd = iStart
                End If
            Next iRow
        End With
        On Error Resume Next
        rData.Columns(cMZ).SpecialCells(xlCellTypeConstants, xlLogical).EntireRow.Delete
        On Error GoTo 0
        
        
        
        'pass #3 - for a m/z with 2+ rows, (ANY col 4 outside)
        iStart = 2
        iEnd = iStart
        With rData
            For iRow = 2 To .Rows.Count
                Application.StatusBar = "Pass #3 - Row " & iRow
                If .Cells(iRow + 1, cMZ).Value = .Cells(iRow, cMZ).Value Then
                    iEnd = iEnd + 1
                Else
                    Call CheckData3(iStart, iEnd)
                    iStart = iRow + 1
                    iEnd = iStart
                End If
            Next iRow
        End With
        On Error Resume Next
        rData.Columns(cMZ).SpecialCells(xlCellTypeConstants, xlLogical).EntireRow.Delete
        On Error GoTo 0
        
        
        'pass #4 - for a m/z with 2+ rows, (largest col 4)
        iStart = 2
        iEnd = iStart
        With rData
            For iRow = 2 To .Rows.Count
                Application.StatusBar = "Pass #4 - Row " & iRow
                If .Cells(iRow + 1, cMZ).Value = .Cells(iRow, cMZ).Value Then
                    iEnd = iEnd + 1
                Else
                    Call CheckData4(iStart, iEnd)
                    iStart = iRow + 1
                    iEnd = iStart
                End If
            Next iRow
        End With
        On Error Resume Next
        rData.Columns(cMZ).SpecialCells(xlCellTypeConstants, xlLogical).EntireRow.Delete
        On Error GoTo 0
        
        'tiebreaker #1 -
        iStart = 2
        iEnd = iStart
        With rData
            For iRow = 2 To .Rows.Count
                Application.StatusBar = "Tiebreaker - Row " & iRow
                If .Cells(iRow + 1, cMZ).Value = .Cells(iRow, cMZ).Value Then
                    iEnd = iEnd + 1
                Else
                    Call TieBreaker1(iStart, iEnd)
                    iStart = iRow + 1
                    iEnd = iStart
                End If
            Next iRow
        End With
        On Error Resume Next
        rData.Columns(cMZ).SpecialCells(xlCellTypeConstants, xlLogical).EntireRow.Delete
        On Error GoTo 0
        
        Application.StatusBar = False
        Application.ScreenUpdating = True
    End Sub
        
        
        
    Private Sub CheckData1(S As Long, E As Long)
        Dim i As Long
        Dim bAllFail As Boolean
        
        With rData
            'first test - all col 2 outside
            bAllFail = True
            For i = S To E
                If .Cells(i, cNO).Value <= 2# Then bAllFail = False
            Next i
                    
            If bAllFail Then
                For i = S To E
                    .Cells(i, cMZ).Value = True
                Next i
                Exit Sub
            End If
            
            'second test - all col 3 outside
            bAllFail = True
            For i = S To E
                If .Cells(i, cNSO).Value <= 1# Then bAllFail = False
            Next i
                    
            If bAllFail Then
                For i = S To E
                    .Cells(i, cMZ).Value = True
                Next i
                Exit Sub
            End If
            
            
            'third test  - all col 4 outside
            bAllFail = True
            For i = S To E
                If .Cells(i, cHC).Value <= 2.25 Then bAllFail = False
            Next i
                    
            If bAllFail Then
                For i = S To E
                    .Cells(i, cMZ).Value = True
                Next i
                Exit Sub
            End If
            
            'fourth test  - ANY col 2 outside
            For i = S To E
                If .Cells(i, cNO).Value > 2# Then .Cells(i, cMZ).Value = True
            Next i
        
        End With
    End Sub
        
        
    Private Sub CheckData2(S As Long, E As Long)
        Dim i As Long
        
        If S = E Then Exit Sub
        
        With rData
            'any col 3 outside
            For i = S To E
                If .Cells(i, cNSO).Value > 1# Then .Cells(i, cMZ).Value = True
            Next i
        End With
    End Sub
        
    Private Sub CheckData3(S As Long, E As Long)
        Dim i As Long
        
        If S = E Then Exit Sub
        
        With rData
            'any col 4 outside
            For i = S To E
                If .Cells(i, cHC).Value > 2.25 Then .Cells(i, cMZ).Value = True
            Next i
        End With
    End Sub
        
    Private Sub CheckData4(S As Long, E As Long)
        Dim i As Long
        Dim m As Double
        
        If S = E Then Exit Sub
        
        With rData
            m = 0#
            For i = S To E
                If .Cells(i, cHC).Value > m Then m = .Cells(i, cHC).Value
            Next i
            For i = S To E
                If .Cells(i, cHC).Value < m Then .Cells(i, cMZ).Value = True
            Next i
        
        End With
    End Sub
        
        
    Private Sub TieBreaker1(S As Long, E As Long)
        Dim bAllZeros As Boolean, bOneFailed As Boolean
        Dim i As Long
        Dim m As Double
        
        If S = E Then Exit Sub
        
        bAllZeros = True
        With rData
            For i = S To E
                If .Cells(i, cP).Value <> 0# Then bAllZeros = False
            Next i
                    
            If bAllZeros Then
                m = 10 ^ 6
                For i = S To E
                    If Abs(.Cells(i, cStdDev2).Value) < m Then m = Abs(.Cells(i, cStdDev2).Value)
                Next i
            
                If m <> 10 ^ 6 Then
                    For i = S To E
                        If Abs(.Cells(i, cStdDev2).Value) <> m Then .Cells(i, cMZ).Value = True
                    Next i
                
                Else
                    For i = S To E
                        .Cells(i, cMZ).Value = True
                    Next i
                End If
            
            Else
                bOneFailed = False
                For i = S To E
                    If .Cells(i, cP).Value <> 0# Then
                        If (.Cells(i, cHC).Value < 1.5) Or (.Cells(i, cHC).Value > 2.25) Then
                            .Cells(i, cMZ).Value = True
                            bOneFailed = True
                        End If
                    End If
                Next i
                
                If Not bOneFailed Then
                    For i = S To E
                        .Cells(i, cMZ).Value = True
                    Next i
                End If
                
            End If
        
        End With
    End Sub
    Attached Files Attached Files
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  7. #7
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,730
    Location
    Let me know how this works out for you

    I did it with 5 simple passes instead of one huge complicated one because 1) That's my style; 2) it was easier to develop and test; but 3) it's much easier to correct or add anything
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  8. #8
    Moderator VBAX Sage SamT's Avatar
    Joined
    Oct 2006
    Location
    Near Columbia
    Posts
    7,814
    Location
    The OP started two identical threads with replies in both.

    This tread is those two threads merged into one.

    Posts are out of Thread order because they are in chronological order regardless of the original thread they were in.



    To ALL: PLease do not dupicate your questions in two threads. Thank you.
    I expect the student to do their homework and find all the errrors I leeve in.


    Please take the time to read the Forum FAQ

  9. #9
    Holy Moly Paul... That is amazing! Thank you so much for helping me out so much!

    For the most part it actually looks correct, except I found a few that should have passed and did not end up in the "Results" sheet you created.

    Unfortunately, I am not fluent in coding language (I can get the general idea of what is going on in the code, but I can't edit the code because I simply don't know it that well), so what I did was I went through the sheet I originally attached and processed it manually in the way I want them to be processed. I attached a new and updated "Maupinsmason's Sample Workbook" where I created another sheet that contains what the results should look like if the rules were followed the way I want them too.

    Please let me know if there is anything else I can help with.

    Thank you so much,

    Mason
    Attached Files Attached Files

  10. #10
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,730
    Location
    A rule wasn't 100% correct 100% of the time

    I also simplified the code a bit so you could make any changes you needed, as well as make it a little more self documenting

    Option Explicit
    
    Const cMZ As Long = 2
    Const cP As Long = 10
    Const cStdDev2 As Long = 12
    Const cNO As Long = 14
    Const cNSO As Long = 15
    Const cHC As Long = 16
    
    Dim rData As Range
    
    Sub DeleteSomeRows()
        Dim rData1 As Range
        Dim iRow As Long, iStart As Long, iEnd As Long
        Dim aCounts() As Long, aChecks() As Long
        
        
        'testing to init data
    '    Worksheets("HA_MM_7_0915_NB_MF_proc_reduced").Columns("B:P").Copy
    '    Sheets("Results").Select
    '    Range("B1").Select
    '    ActiveSheet.Paste
        '------------------------------------------------
        
        'setup - sort in MZ ascending order
        Application.ScreenUpdating = False
        Set rData = ActiveSheet.Cells(1, 1).CurrentRegion
        Set rData1 = rData.Cells(2, 1).Resize(rData.Rows.Count - 1, rData.Columns.Count)
        Application.StatusBar = "Sorting"
        With ActiveSheet.Sort
            .SortFields.Clear
            .SortFields.Add Key:=rData1.Columns(cMZ), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
            .SetRange rData
            .Header = xlYes
            .MatchCase = False
            .Orientation = xlTopToBottom
            .SortMethod = xlPinYin
            .Apply
        End With
        
    '---------------------------------------------------------------------------
        'pass #1 - for a MZ with 2+ rows, delete if
        '   a   NO and NSO outside on row
        iStart = 2
        iEnd = iStart
        With rData
            For iRow = 2 To .Rows.Count
                Application.StatusBar = "Pass #1 - Row " & iRow
                If .Cells(iRow + 1, cMZ).Value = .Cells(iRow, cMZ).Value Then
                    iEnd = iEnd + 1
                Else
                    If iStart <> iEnd Then Call CheckData1(iStart, iEnd)
                    iStart = iRow + 1
                    iEnd = iStart
                End If
            Next iRow
        End With
        On Error Resume Next
        rData.Columns(cMZ).SpecialCells(xlCellTypeConstants, xlLogical).EntireRow.Delete
        On Error GoTo 0
    '---------------------------------------------------------------------------
        'pass #2 - for a MZ with 2+ rows, delete if
        '   a   NO not outside and NSO outside on row
        '   b   NO outside and NSO not outside on row
        '   c   All rows have either NO or NSO outside
        iStart = 2
        iEnd = iStart
        With rData
            For iRow = 2 To .Rows.Count
                Application.StatusBar = "Pass #2 - Row " & iRow
                If .Cells(iRow + 1, cMZ).Value = .Cells(iRow, cMZ).Value Then
                    iEnd = iEnd + 1
                Else
                    If iStart <> iEnd Then Call CheckData2(iStart, iEnd)
                    iStart = iRow + 1
                    iEnd = iStart
                End If
            Next iRow
        End With
        On Error Resume Next
        rData.Columns(cMZ).SpecialCells(xlCellTypeConstants, xlLogical).EntireRow.Delete
        On Error GoTo 0
    
    '---------------------------------------------------------------------------
        'pass #3 - for a MZ with 2+ rows, delete if
        '   a   HC outside
        iStart = 2
        iEnd = iStart
        With rData
            For iRow = 2 To .Rows.Count
                Application.StatusBar = "Pass #3 - Row " & iRow
                If .Cells(iRow + 1, cMZ).Value = .Cells(iRow, cMZ).Value Then
                    iEnd = iEnd + 1
                Else
                    If iStart <> iEnd Then Call CheckData3(iStart, iEnd)
                    iStart = iRow + 1
                    iEnd = iStart
                End If
            Next iRow
        End With
        On Error Resume Next
        rData.Columns(cMZ).SpecialCells(xlCellTypeConstants, xlLogical).EntireRow.Delete
        On Error GoTo 0
    '---------------------------------------------------------------------------
        'pass #4 - for a MZ with 2+ rows, delete
        '   a   not the largest HC
        iStart = 2
        iEnd = iStart
        With rData
            For iRow = 2 To .Rows.Count
                Application.StatusBar = "Pass #4 - Row " & iRow
                If .Cells(iRow + 1, cMZ).Value = .Cells(iRow, cMZ).Value Then
                    iEnd = iEnd + 1
                Else
                    If iStart <> iEnd Then Call CheckData4(iStart, iEnd)
                    iStart = iRow + 1
                    iEnd = iStart
                End If
            Next iRow
        End With
        On Error Resume Next
        rData.Columns(cMZ).SpecialCells(xlCellTypeConstants, xlLogical).EntireRow.Delete
        On Error GoTo 0
    '---------------------------------------------------------------------------
        'pass #5 - tiebreaker
        '1.) If the value within the "P" column of the duplicates is a non-zero digit, then the duplicate must contain a H/C value
        '    within the range: 1.5 <= # <= 2.25
        '    (if the duplicate has a zero in the “P” column, then void the tiebreaker #1 and move straight to tiebreaker #2).
        '
        ' 2.) Keep the duplicate that has the value closest to zero within the "stddev2" column.
        '
        ' if there are still duplicates that passed tiebreaker #1 or jumped straight to tiebreaker #2, and
        ' they have the same "stddev2" value as well, then completely delete all of the duplicates
        
        iStart = 2
        iEnd = iStart
        With rData
            For iRow = 2 To .Rows.Count
                Application.StatusBar = "Pass #5 - Row " & iRow
                If .Cells(iRow + 1, cMZ).Value = .Cells(iRow, cMZ).Value Then
                    iEnd = iEnd + 1
                Else
                    If iStart <> iEnd Then Call CheckData5(iStart, iEnd)
                    iStart = iRow + 1
                    iEnd = iStart
                End If
            Next iRow
        End With
        On Error Resume Next
        rData.Columns(cMZ).SpecialCells(xlCellTypeConstants, xlLogical).EntireRow.Delete
        On Error GoTo 0
        Application.StatusBar = False
        Application.ScreenUpdating = True
        
        MsgBox "Done"
    End Sub
        
    '--------------------------------------------------------------------
    Private Function RowIsMarked(i As Long) As Boolean
        RowIsMarked = Application.WorksheetFunction.IsLogical(rData.Cells(i, cMZ).Value)
    End Function
    Private Sub MarkRow(i As Long)
        rData.Cells(i, cMZ).Value = True
    End Sub
    Private Function NO_Outside(i As Long) As Boolean
        NO_Outside = rData.Cells(i, cNO).Value > 2#
    End Function
    Private Function NSO_Outside(i As Long) As Boolean
        NSO_Outside = rData.Cells(i, cNSO).Value > 1#
    End Function
    Private Function HC_Outside(i As Long) As Boolean
        HC_Outside = rData.Cells(i, cHC).Value > 2.25
    End Function
    Private Function All_NO_Outside(S As Long, E As Long) As Boolean
        Dim i As Long
        
        All_NO_Outside = True
        
        For i = S To E
            If Not NO_Outside(i) Then
                All_NO_Outside = False
                Exit Function
            End If
        Next i
    End Function
    Private Function All_NSO_Outside(S As Long, E As Long) As Boolean
        Dim i As Long
        
        All_NSO_Outside = True
        
        For i = S To E
            If Not NSO_Outside(i) Then
                All_NSO_Outside = False
                Exit Function
            End If
        Next i
    End Function
    
    '------------------------------------------------------------------------
    Private Sub CheckData1(S As Long, E As Long)
        Dim i As Long
        
        With rData
            For i = S To E
                If NO_Outside(i) And NSO_Outside(i) Then MarkRow (i)
            Next i
        End With
    End Sub
            
            
    'any NO outside
    Private Sub CheckData2(S As Long, E As Long)
        Dim i As Long
        Dim bAllRowsOutside As Boolean
        
        With rData
            For i = S To E
                If NSO_Outside(i) And Not NO_Outside(i) Then
                    MarkRow (i)
                ElseIf Not NSO_Outside(i) And NO_Outside(i) Then
                    MarkRow (i)
                End If
            Next i
        
        
            bAllRowsOutside = True
            For i = S To E
                If Not (NSO_Outside(i) And NO_Outside(i)) Then
                    bAllRowsOutside = False
                    Exit For
                End If
            Next i
            
            If bAllRowsOutside Then
                For i = S To E
                    MarkRow (i)
                Next i
            End If
        End With
    End Sub
        
    Private Sub CheckData3(S As Long, E As Long)
        Dim i As Long
        
        With rData
            For i = S To E
                If HC_Outside(i) Then MarkRow (i)
            Next i
        End With
    End Sub
        
    Private Sub CheckData4(S As Long, E As Long)
        Dim i As Long
        Dim m As Double
        
        With rData
            m = 0#
            For i = S To E
                If .Cells(i, cHC).Value > m Then m = .Cells(i, cHC).Value
            Next i
            For i = S To E
                If .Cells(i, cHC).Value < m Then MarkRow (i)
            Next i
        
        End With
    End Sub
        
        
    Private Sub CheckData5(S As Long, E As Long)
        Dim i As Long, j As Long, n As Long, n1 As LoadPictureConstants
        Dim m As Double
        
        If S = E Then Exit Sub
        
        n = E - S + 1
        
        
        With rData
            For i = S To E
                If .Cells(i, cP).Value <> 0# Then
                    For j = S To E
                        If .Cells(j, cP).Value <> 0# Then
                            If (.Cells(j, cHC).Value < 1.5) Or (.Cells(j, cHC).Value > 2.25) Then
                                .Cells(j, cMZ).Value = True
                                n = n - 1
                            End If
                        End If
                    Next j
                End If
            Next i
                    
            If n = 1 Then Exit Sub
                    
                    
            m = 10 ^ 6
            For i = S To E
                If Not Application.WorksheetFunction.IsLogical(.Cells(i, cMZ).Value) Then
                    If Abs(.Cells(i, cStdDev2).Value) < m Then m = Abs(.Cells(i, cStdDev2).Value)
                End If
            Next i
            
            n1 = 0
            For i = S To E
                If Not Application.WorksheetFunction.IsLogical(.Cells(i, cMZ).Value) Then
                    If Abs(.Cells(i, cStdDev2).Value) = m Then n1 = n1 + 1
                End If
            Next i
            
            If n1 = 1 Then
                For i = S To E
                    If Not Application.WorksheetFunction.IsLogical(.Cells(i, cMZ).Value) Then
                        If Abs(.Cells(i, cStdDev2).Value) <> m Then
                            .Cells(i, cMZ).Value = True
                        End If
                    End If
                Next i
            
            Else
                For i = S To E
                    .Cells(i, cMZ).Value = True
                Next i
            End If
        
        End With
    End Sub
    Attached Files Attached Files
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  11. #11
    WOW! This works flawlessly!! Thank you so much Paul! I owe you HUGE man!

    I have one question though, if I had to add more columns such as column K in the new spreadsheet I attached, how would I change the code to accommodate the change?

    I changed the position of the columns in the beginning of your code to:



    Option Explicit

    Const cMZ As Long = 2
    Const cP As Long = 10
    Const cStdDev2 As Long = 13
    Const cNO As Long = 15
    Const cNSO As Long = 16
    Const cHC As Long = 17

    Dim rData As Range

    Sub DeleteSomeRows()
    Dim rData1 As Range
    Dim iRow As Long, iStart As Long, iEnd As Long
    Dim aCounts() As Long, aChecks() As Long


    'testing to init data
    ' Worksheets("HA_MM_7_0915_NB_MF_proc_reduced").Columns("B:P").Copy
    ' Sheets("Results").Select
    ' Range("B1").Select
    ' ActiveSheet.Paste
    '------------------------------------------------



    However, when I ran the code with the new spread sheet, it gave me this error:


    Screen Shot 2016-11-27 at 11.50.55 PM.jpg


    When I press "OK" the VBA window comes up with this:



    Screen Shot 2016-11-27 at 11.51.16 PM.jpg

    Any suggestions on how to fix this? I'm sure it's a simple fix, I just wanted to make sure though.

    Thanks

    Mason
    Attached Files Attached Files

  12. #12
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,730
    Location
    Ooops - that usually happens when I'm relying on the Intellisense to complete something, and I'm off by one in the list

    It should be 'Long', not 'LoadPictureConstants'

    Funny thing is that it didn't affect my test runs so the change must have crept in somewhere later , or it was a compatible data type since I'm using Excel 2016 and 'LoadPictureConstants' is part of the stdolb2.tlb OLE Reference which is probably why it worked

    Don't worry about that, just change it to Long and it should work fine

    Le me know if it doesn't
    Attached Files Attached Files
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •