Consulting

Page 1 of 3 1 2 3 LastLast
Results 1 to 20 of 42

Thread: How to search substring against range and if found remove it?

  1. #1

    How to search substring against range and if found remove it?

    I have column E which contains what I call substrings. These substrings are separated by semicolons ";" which is what identifies them as unique entities within cells. In almost every instance in my workbook, every single one of these substrings will be unique from every other substring in every other cell in the workbook. However, sometimes there will be duplicates and they need to be removed. I have marked which cells contain duplicates that need to be removed by marking the associated O cell as true. For instance, in the attached sheet E2: (spaces " " inserted after the "//" and in the "http" to get around restrictions)

    ht tp:// ec2-54-172-158-94.compute-1.amazonaws.com/assets/images/bob/VG2318CK1_main.jpg;ht tp:// ec2-54-172-158-94.compute-1.amazonaws.com/assets/images/bob/VG2318C_image1.jpg;ht tp:// 72-158-94.compute-1.amazonaws.com/assets/images/bob/VG2318C_image2.jpg

    Contains the same substring as found in D13

    +ht tp:// ec2-54-172-158-94.compute-1.amazonaws.com/assets/images/bob/VG2318CK1_main.jpg


    In the above pair, O2 is marked true which means the duplicates in E2 need to be removed and D13 left unchanged.

    Also, E17 contains duplicates of E8. In that case, O17 is "true" which indicates e17 duplicates be removed and E8 left as it is.


    So the logic I desire is when you see "TRUE" in O, search the E value in that row against every cell in columns D and E and if duplicate substrings are found in E and D, remove only duplicate substrings from the E cell which has an associated O value of true, else do nothing. So we will only ever be removing duplicates from column E even though we are also searching against column D and E.


    Additional information:
    1 The amount of substrings in each cell in E:E are not guaranteed to be the same. Cells in E:E may sometimes be blank but will never exceed 20 substrings. but D will always have 1 per cell.
    2)Substrings will ALWAYS start with what you see up UNTIL the word "bob".
    3) The number of rows may be quite large, maybe up to 50,000 but I'm not sure exactly.


    Is there a macro we can write to accomplish this? I haven't had much results on other forums or googling it myself. Thanks!



    Workbook: (please note spaces)

    ht tps:// drive.google.com/file/d/0B1TMLUGTVwb9aXIxcVdibkRiRk0/view?usp=sharing
    Last edited by joshman1088; 09-06-2016 at 08:13 AM.

  2. #2
    If it helps your thinking process, I received this on mrexcel, but it didn't work, it seems to remove anything in the E cell where O is marked true. Problem is the guy never responded so I'm looking around for answers. Also, the below code only searches against range E:E whereas I now need D:E

    Option Explicit
    
    
    Sub SearchInRange()
    
    
        Dim Data    As Variant
        Dim Dict    As Object
        Dim Key     As Variant
        Dim Item    As Variant
        Dim n       As Long
        Dim NewData As String
        Dim Rng     As Range
        Dim RngBeg  As Range
        Dim RngEnd  As Range
        Dim Term    As Variant
        Dim Terms   As Variant
        Dim Wks     As Worksheet
        
            Set Wks = ThisWorkbook.ActiveSheet
            
            Set RngBeg = Wks.Range("E2")
            Set RngEnd = Wks.Cells(Rows.Count, "E").End(xlUp)
            
            If RngEnd.Row < RngBeg.Row Then Exit Sub
            
            Set Dict = CreateObject("Scripting.Dictionary")
                Dict.CompareMode = vbTextCompare
                
                ' Copy column "E" values into a 2D array.
                Data = RngBeg.Resize(RowSize:=RngEnd.Row - RngBeg.Row + 1).Value
                
                ' Save unique values in the Dictionary object.
                For Each Item In Data
                    If Not IsEmpty(Item) Then
                        Terms = Split(Item, ":")
                        For Each Term In Terms
                            If Not Dict.Exists(Term) Then Dict.Add Term, True
                        Next Term
                    End If
                Next Item
            
            Set RngBeg = Wks.Range("O2")
            Set RngEnd = Wks.Cells(Rows.Count, "O").End(xlUp)
            Set Rng = Wks.Range(RngBeg, RngEnd)
            
                ' Remove duplicate values in Data array when cell in column "O" is TRUE.
                For n = 1 To Rng.Rows.Count
                    If Rng.Cells(n, 1) = True Then
                        NewData = ""
                        Terms = Split(Data(n, 1), ":")
                        For Each Term In Terms
                            If Not Dict.Exists(Term) Then
                                If NewData = "" Then
                                    NewData = Term
                                Else
                                    NewData = NewData & ":" & Term
                                End If
                            End If
                        Next Term
                        Data(n, 1) = NewData
                    End If
                Next n
                
            ' Write the Data array values back to the column "E".
            Set Rng = Wks.Range("E2").Resize(UBound(Data, 1), 1)
            Rng.Value = Data
            
    End Sub
    Last edited by SamT; 09-06-2016 at 12:02 PM.

  3. #3
    Moderator VBAX Sage SamT's Avatar
    Joined
    Oct 2006
    Location
    Near Columbia
    Posts
    7,814
    Location
    Substrings will ALWAYS start with what you see
    WHAT??? Of course what we see always starts with what we see. In fact, it always ends with what we see, and everything in between is always what we see. Bob can't tell us what to not see..

    Unless we be doing LSD.

    UNTIL the word "bob".
    What does Bob have to do with things?

    Ignoring what Bob has to say for the moment, What I think you are saying is:
    For each O=True Row
    For each SubString In Cell "E"
    If SubString is Found anywhere else in Column D or E, Delete that substring from Cell "E".

    OK, what does Bob say about that?

    Final question: If E has only one Substring and it is a duplicate, What happens?
    I expect the student to do their homework and find all the errrors I leeve in.


    Please take the time to read the Forum FAQ

  4. #4
    Quote Originally Posted by SamT View Post
    WHAT??? Of course what we see always starts with what we see. In fact, it always ends with what we see, and everything in between is always what we see. Bob can't tell us what to not see..


    Final question: If E has only one Substring and it is a duplicate, What happens?
    I mean substrings in E will always start with the file path ht tp:// ec2-54-172-158-94.compute-1.amazonaws.com/assets/images/
    And strings in D will always be the same as above but with a + in front of them: +ht tp:// ec2-54-172-158-94.compute-1.amazonaws.com/assets/images/
    I mentioned that because people often ask if cell values will always look the same or not.

    Yes, if E only has one value and it is a duplicate then please delete it.

    Thanks! Sorry for the confusion. It's hard to work things correctly every time.

  5. #5
    Moderator VBAX Sage SamT's Avatar
    Joined
    Oct 2006
    Location
    Near Columbia
    Posts
    7,814
    Location
    Option Explicit
    
    Sub DeleteDuplicatesIn_E()
         
         
        Dim Found  As Range
        Dim CriteriaRng As Range
        Dim CriteriaCel As Range
        Dim FirstCriteriaAddress As String
        Dim StrCel  As Range
        Dim Term    As Variant
        Dim Terms   As Variant
        Dim Wks     As Worksheet
         
        Set Wks = ThisWorkbook.ActiveSheet
        With Wks
         
          Set CriteriaRng = Intersect(Range("O:O"), .UsedRange)
          Set CriteriaCel = CriteriaRng.Cells(1)
          If CriteriaCel <> "TRUE" Then _
             Set CriteriaCel = CriteriaRng.Find(What:="TRUE", After:=CriteriaCel, SearchDirection:=xlNext)
          FirstCriteriaAddress = CriteriaCel.Address
          
          Do
            Set StrCel = .Cells(CriteriaCel.Row, "E")
            If StrCel = "" Then GoTo NextCriteriaCel
    
            Terms = Split(StrCel, ":")
           
            If Not IsArray(Terms) Then
              Set Found = Range("D:E").Find(What:=Terms, After:=StrCel, SearchDirection:=xlNext)
              If Not Found.Address = StrCel.Address Then StrCel = ""
            Else
              For Each Term In Terms
                Set Found = Range("D:E").Find(What:=Term, After:=StrCel, SearchDirection:=xlNext)
                If Not Found.Address = StrCel.Address Then Replace StrCel, Term, ""
              Next Term
              'Code here to remove excess Colons from StrCel
            End If
    
    NextCriteriaCel:
             Set CriteriaCel = CriteriaRng.Find(What:="", After:=CriteriaCel, _
                                      SearchDirection:=xlNext)
          Loop While CriteriaCel.Address <> FirstCriteriaAddress
        End With
    End Sub
    I expect the student to do their homework and find all the errrors I leeve in.


    Please take the time to read the Forum FAQ

  6. #6
    How long did that take you to run? I hit run and it never stops, I left it hours ago and it's still going. I tried on two separate computers as well. Why is this the case? Regardless, thank you, I'm dying to see it if works.

  7. #7
    Moderator VBAX Sage SamT's Avatar
    Joined
    Oct 2006
    Location
    Near Columbia
    Posts
    7,814
    Location
    You didn't test it on a small set of data first?

    I don't even know if it works. All I did was compile it to check for obvious errors.

    Screenupdating is not disabled, so: 1) you should be able to see the progression; 2) it is much slower than with Screenupdating diabled.

    Press and hold Ctrl+F12, (Pause Break,) for at least ten seconds to stop execution.
    I expect the student to do their homework and find all the errrors I leeve in.


    Please take the time to read the Forum FAQ

  8. #8
    Quote Originally Posted by SamT View Post
    You didn't test it on a small set of data first?

    I don't even know if it works. All I did was compile it to check for obvious errors.

    Screenupdating is not disabled, so: 1) you should be able to see the progression; 2) it is much slower than with Screenupdating diabled.

    Press and hold Ctrl+F12, (Pause Break,) for at least ten seconds to stop execution.
    I did check it on a small dataset. I used the test file I uploaded, something like 19 rows? Do you have any insight into this? This behavior doesn't seem normal. I can run other formulas that search every cell against every other range and it happens pretty quickly. I'm also trying it on 7 rows right now and it just seems to lock up.

  9. #9
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,724
    Location
    Is this correct:

    Situation1 --

    E2 contains these 3 'substrings'

    h ttp://ec2-54-172-158-94.compute-1.amazonaws.com/assets/images/bob/VG2318CK1_main.jpg;
    h ttp://ec2-54-172-158-94.compute-1.amazonaws.com/assets/images/bob/VG2318C_image1.jpg;
    h ttp://72-158-94.compute-1.amazonaws.com/assets/images/bob/VG2318C_image2.jpg


    D13 contains 1

    +h ttp://ec2-54-172-158-94.compute-1.amazonaws.com/assets/images/bob/VG2318CK1_main.jpg


    which matches (after deleting the +) the first substring in E2

    So O2 = True, and you want to delete the first substring from E2 since it's in D13


    Situation2 --

    Likewise, one of the substrings in E17 is also in E2

    so O17 = True and you want to delete the duplicated substring in E17
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  10. #10
    Quote Originally Posted by Paul_Hossler View Post
    Is this correct:

    Situation1 --

    E2 contains these 3 'substrings'

    h ttp://ec2-54-172-158-94.compute-1.amazonaws.com/assets/images/bob/VG2318CK1_main.jpg;
    h ttp://ec2-54-172-158-94.compute-1.amazonaws.com/assets/images/bob/VG2318C_image1.jpg;
    h ttp://72-158-94.compute-1.amazonaws.com/assets/images/bob/VG2318C_image2.jpg


    D13 contains 1

    +h ttp://ec2-54-172-158-94.compute-1.amazonaws.com/assets/images/bob/VG2318CK1_main.jpg


    which matches (after deleting the +) the first substring in E2

    So O2 = True, and you want to delete the first substring from E2 since it's in D13


    Situation2 --

    Likewise, one of the substrings in E17 is also in E2

    so O17 = True and you want to delete the duplicated substring in E17
    Yes that is entirely correct.
    But in addition to this, I have just realized the problem is bigger than I realized at first. I need the search to include columns B, C, D and E, which will be searched against B:B, C:C, D: D and E:E and altered only if the O value is true, as before. I regret that this will increase computation time.

    So if B2,C2 and D2 contain
    +h ttp://ec2-54-172-158-94.compute-1.amazonaws.com/assets/images/bob/VG2318CK1_main.jpg

    and E283 contains
    h ttp://ec2-54-172-158-94.compute-1.amazonaws.com/assets/images/bob/VG2318CK1_main.jpg;
    h ttp://ec2-54-172-158-94.compute-1.amazonaws.com/assets/images/bob/VG2318C_image1.jpg;
    h ttp://72-158-94.compute-1.amazonaws.com/assets/images/bob/VG2318C_image2.jpg

    And O2 is true, then remove the entire cell contents of B2,C2 and D2. Or just remove the string, whichever is easier. There will only ever be a single string in b,c,d so if it's a duplicate and removed the cell becomes empty. Thanks for all your help, this is a pivotal issue.

  11. #11
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,724
    Location
    Not sure I understood all that

    What I have is

    1. A Sub that removes duplicates in a single column (B, C, D)
    2. A Sub that takes the de-duped single column (B, C, D) and removes substrings in a multiple substring column (E)
    3. Logic that takes each piece of a multi-substring column (E) and removes that substring from the rest of that column (E)

    Do you really need 'True' in column O? That will get into more complicated and longer running code

    Option Explicit
    
    Sub DeDup()
        Dim rData As Range, rCell As Range, rNext As Range, rLast As Range
        Dim s As String
        Dim v As Variant
        Dim i As Long
    
        Application.ScreenUpdating = False
    
        Set rData = Worksheets("Test_ph").Cells(1, 1).CurrentRegion
    
        Call CheckForDupsInSingleValueColumn(rData.Columns(2))
        Call CheckForDupsInSingleValueColumn(rData.Columns(3))
        Call CheckForDupsInSingleValueColumn(rData.Columns(4))
    
        Call CheckForDupsSingleAgaintMultiple(rData.Columns(2), rData.Columns(5))
        Call CheckForDupsSingleAgaintMultiple(rData.Columns(3), rData.Columns(5))
        Call CheckForDupsSingleAgaintMultiple(rData.Columns(4), rData.Columns(5))
    
        'check for dups in col E and remove
        With rData.Columns(5)
            For Each rCell In .Cells
                Set rNext = .Cells(rCell.Row + 1, 1)
                Set rLast = rNext.End(xlDown)
                
                v = Split(rCell.Value, ";")
                
                For i = LBound(v) To UBound(v)
                    Call Range(rNext, rLast).Replace(v(i) & ";", vbNullString, xlPart)
                    Call Range(rNext, rLast).Replace(v(i), vbNullString, xlPart)    '   no ; on last one
                Next I
            Next
        End With
    
        Application.ScreenUpdating = True
    
    End Sub
    
    Private Sub CheckForDupsInSingleValueColumn(rSingleCol As Range)
        Dim rCell As Range, rNext As Range, rLast As Range
        
        With rSingleCol
            For Each rCell In .Cells
                If Application.WorksheetFunction.CountIf(.Cells, rCell.Value) > 1 Then
                    Set rNext = .Cells(rCell.Row + 1, 1)
                    Set rLast = rNext.End(xlDown)
                    Call Range(rNext, rLast).Replace(rCell.Value, vbNullString, xlWhole)
                End If
            Next
        End With
    End Sub
     
    Private Sub CheckForDupsSingleAgaintMultiple(rSingleCol As Range, rMultiple As Range)
        Dim rCell As Range
        Dim s As String
        With rSingleCol
            For Each rCell In .Cells
                If Left(rCell.Value, 1) = "+" Then
                    s = Right(rCell.Value, Len(rCell.Value) - 1)
                Else
                    s = rCell.Value
                End If
                        
                If Len(s) > 1 Then
                    Call rMultiple.Replace(s & ";", vbNullString, xlPart)
                    Call rMultiple.Replace(s, vbNullString, xlPart)  ' no ; on last one
                End If
            Next
        End With
    End Sub
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  12. #12
    Quote Originally Posted by Paul_Hossler View Post
    Not sure I understood all that

    What I have is

    1. A Sub that removes duplicates in a single column (B, C, D)
    2. A Sub that takes the de-duped single column (B, C, D) and removes substrings in a multiple substring column (E)
    3. Logic that takes each piece of a multi-substring column (E) and removes that substring from the rest of that column (E)

    Do you really need 'True' in column O? That will get into more complicated and longer running code
    Paul, I too have questions for you. First, let me say that the reason we need "true" is we need to tell excel WHICH cells to target for duplicate deletion. Right now it appears to target the lowest of the pair? So if B2,C2,D2 and B3,C3,D3 have duplicates it removes the duplicates in row 3. What I need is for it to remove row 2 duplicates if O2 is "true", or row 3 duplicates if O3 is "true. That would be most helpful.

    To address what you have:

    1. This seems to work other than: A) we need to remove only duplicates where O is true (or some other method if you think there is a better way). B) I need this to search against B,C,D and E. Right now it seems to only search against B,C,D. Or atleast it didn't remove duplicates from B,C or D when duplicates with in an E cell only.

    You may view the below worksheet for an example of where I want B2,C2 and D2 removed because they contain duplicate of string in E3

    https://drive.google.com/file/d/0B1T...ew?usp=sharing

    2. When I try running it opens dialog asking for macro name, with the first option being "dedup". I don't see the other names "CheckForDupsInSingleValueColumn" etc, so I just hit "run" and nothing happens. Why is this?
    3. Same as 2.

    I appreciate your help immensely. It seems to be close? to working so that's very good. If we can get this working it will be of tremendous value.

  13. #13
    Moderator VBAX Sage SamT's Avatar
    Joined
    Oct 2006
    Location
    Near Columbia
    Posts
    7,814
    Location
    used the test file I uploaded, something like 19 rows?
    I didn't count that as a test. 1900 rows, better.

    How many rows did you use it on?
    Any idea how far the code got before you terminated it?

    I did see a logic error in the last Set CriteriaCel = CriteriaRng.Find(What:="",. . . That would have added a tremendous overhead.

    And I added SpeedCode, and SearchOrder:= xlByColumns in two locations.



    Sub DeleteDuplicatesIn_E() 
         
         
        Dim Found  As Range 
        Dim CriteriaRng As Range 
        Dim CriteriaCel As Range 
        Dim FirstCriteriaAddress As String 
        Dim StrCel  As Range 
        Dim Term    As Variant 
        Dim Terms   As Variant 
        Dim Wks     As Worksheet 
         
    SpeedCodeTrue    
    
    Set Wks = ThisWorkbook.ActiveSheet 
        With Wks 
             
            Set CriteriaRng = Intersect(Range("O:O"), .UsedRange) 
            Set CriteriaCel = CriteriaRng.Cells(1) 
            If CriteriaCel <> "TRUE" Then _ 
            Set CriteriaCel = CriteriaRng.Find(What:="TRUE", After:=CriteriaCel, SearchDirection:=xlNext) 
            FirstCriteriaAddress = CriteriaCel.Address 
             
            Do 
    '****************For test puroposes only:
    CriteriaCel.Interior.ColorIndex = 4
    '****************
                Set StrCel = .Cells(CriteriaCel.Row, "E") 
                If StrCel = "" Then GoTo NextCriteriaCel 
                 
                Terms = Split(StrCel, ":") 
                 
                If Not IsArray(Terms) Then 
                    Set Found = Range("D:E").Find(What:=Terms, After:=StrCel, SearchDirection:=xlNext, SearchOrder:= xlByColumns) 
                    If Not Found.Address = StrCel.Address Then 
    StrCel = "" 
    
     '****************For test puroposes only:
    CriteriaCel.Interior.ColorIndex = 3
    '****************
    
    End If
               Else 
                    For Each Term In Terms 
                        Set Found = Range("D:E").Find(What:=Term, After:=StrCel, SearchDirection:=xlNext, SearchOrder:= xlByColumns) 
                        If Not Found.Address = StrCel.Address Then 
    Replace StrCel, Term, "" 
    
     '****************For test puroposes only:
    CriteriaCel.Interior.ColorIndex = 3
    '****************
    
    End If
                    Next Term 
                     'Code here to remove excess Colons from StrCel
                End If 
                 
    NextCriteriaCel: 
                Set CriteriaCel = CriteriaRng.Find(What:="TRUE", After:=CriteriaCel, _ 
                SearchDirection:=xlNext) 
            Loop While CriteriaCel.Address <> FirstCriteriaAddress 
        End With 
    End Sub
    Public Sub SpeedCode(Optional Faster As Boolean)
    Static CalcSetting As XlCalculation
    If CalcSetting = 0 Then CalcSetting = Application.Calculation
      
      If Faster Then
        With Application
          CalcSetting = .Calculation
          .ScreenUpdating = False
          .EnableEvents = False
          .Calculation = xlCalculationAutomatic
        End With
      Else
        With Application
          .ScreenUpdating = True
          .EnableEvents = True
          .Calculation = CalcSetting
        End With
        CalcSetting = 0
      End If
    
    SpeedCode False
    End Sub
    The For testing purposes only code: Each checked TRUE cell in Column O is colored, IF duplicates are found, it is colored Red.

    Make these adjustments as desired

    For testing only the first thousand rows:
    Change; Set CriteriaRng = Intersect(Range("O:O"), .UsedRange)
    To
    Set CriteriaRng = Range("O1:O1000")

    For testing only the first ten thousand rows:
    To
    Set CriteriaRng = Range("O1:O10000")
    Note that those adjustments still search the entire sheet but only for the TRUEs found in the limited set of Column O rows.

    Personally, I would comment out the first SpeedCode line and set the Criteria Range to only the first two TRUEs, and I would Step thru the code using F8 to manually verify the deleting process working.

    Then I would test the first 100 rows. If the 100 row test takes longer than 1 second, let us know before you continue.




    To search B,C,D, and E:
    2 Instances Set Found = Range("D:E") to Set Found = Range("B:E")
    I expect the student to do their homework and find all the errrors I leeve in.


    Please take the time to read the Forum FAQ

  14. #14
    Sam, why would a 19 row test not be good? Yes, I originally tested on 19 rows and I have no idea how far it got, it just says "running".

    I ran your new code (first code box) by commenting out the speedcode true and it did highlight the "true"s as red. However, I checked and it does so regardless of whether there are any duplicates or not. It did not, however, delete anything. Was it supposed to? I stepped through with f8 and same results.

    I checked 100 rows and it only took a couple milliseconds.

  15. #15
    Moderator VBAX Sage SamT's Avatar
    Joined
    Oct 2006
    Location
    Near Columbia
    Posts
    7,814
    Location
    Change
    Replace StrCel, Term, ""
    To
    StrCel = Replace(StrCel, Term, "")

    And in SpeedCode, remove the last line "SpeedCode = False" how that got in there
    I expect the student to do their homework and find all the errrors I leeve in.


    Please take the time to read the Forum FAQ

  16. #16
    Sam, I did that and it just turns the TRUE green.

  17. #17
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,724
    Location
    Paul, I too have questions for you. First, let me say that the reason we need "true" is we need to tell excel WHICH cells to target for duplicate deletion. Right now it appears to target the lowest of the pair? So if B2,C2,D2 and B3,C3,D3 have duplicates it removes the duplicates in row 3. What I need is for it to remove row 2 duplicates if O2 is "true", or row 3 duplicates if O3 is "true. That would be most helpful.
    It does delete (replace with "" actually) the 2nd, 3rd, ... instances

    1. How does the TRUE get into O2 or O3? Good luck if you're doing it manually

    2. Does it matter?

    3. Do you want to delete the entire row, or just the substring that is duplicated?
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  18. #18
    Quote Originally Posted by Paul_Hossler View Post
    It does delete (replace with "" actually) the 2nd, 3rd, ... instances

    1. How does the TRUE get into O2 or O3? Good luck if you're doing it manually

    2. Does it matter?

    3. Do you want to delete the entire row, or just the substring that is duplicated?

    1. I am using =SUMPRODUCT(--ISNUMBER(SEARCH(A2,A:A)))>1 to determine which should be marked true. What is happening in this whole process is that I used

    =IFERROR(INDEX($L$1:$L$228,SMALL(IF(ISNUMBER(SEARCH("|"&$B2,"|"&$L$1:$L$228 )),
    ROW($L$1:$L$228)-ROW($L$1)+1),COLUMNS($E$2:E2))),"")

    to look for model numbers in a column against a range of values and return each instance of those values. So model number abc would return image1abc.jpg, image2abc.jpg image3abc.jpg etc. HOWEVER, in the rare instance when one model number includes another, like abcd includes abc, then all abcd results are return to abc as well. The SUMPRODUCT formula tells me when abc is contained within another cell, in this case abcd, and marks as true. In this way we can know that abc will contain the correct results as well as bad ones. And in that way we know to remove the duplicates for abc as opposed to abcd.

    2. I don't see macro 2 or 3 doing anything, that's why I asked. I didn't know if the dialog box indicated something wasn't running correctly. I think that happened to me in the past but I can't remember.


    3. I just want to delete duplicate strings. I just said "row" as shorthand for "duplicate strings in that row". But I guess shorthand is bad in this instance.

  19. #19
    Paul, it only just now occurs to me that dedup is calling the others, so all I have to do is run dedup and that's it, right? Regardless, I still have the issues I posted in 1. as well as I don't see anything being searched against and removed from E:E? So if E2 is DUP of E3 then nothing happens

  20. #20
    Moderator VBAX Sage SamT's Avatar
    Joined
    Oct 2006
    Location
    Near Columbia
    Posts
    7,814
    Location
    My very bad.
    I based that code off the code you posted in #2: Terms = Split(Item, ":")
    I replaced the Colons in this code with semicolons and add a check to remove leading semicolons.

    Why didn't you catch that use of colons?

    Sub DeleteDuplicatesIn_E()
         
         
        Dim Found  As Range
        Dim CriteriaRng As Range
        Dim CriteriaCel As Range
        Dim FirstCriteriaAddress As String
        Dim StrCel  As Range
        Dim Term    As Variant
        Dim Terms   As Variant
        Dim Wks     As Worksheet
         
        'SpeedCode True '<<<<<<<<<<<<<<<<<<<<<Uncomment after testing
         
        Set Wks = ThisWorkbook.ActiveSheet
        With Wks
             
            Set CriteriaRng = Intersect(Range("O:O"), .UsedRange)
            Set CriteriaCel = CriteriaRng.Cells(1)
            If CriteriaCel <> "TRUE" Then _
            Set CriteriaCel = CriteriaRng.Find(What:="TRUE", After:=CriteriaCel, SearchDirection:=xlNext)
            FirstCriteriaAddress = CriteriaCel.Address
             
            Do
                 '****************For test puroposes only:
                CriteriaCel.Interior.ColorIndex = 4
                 '****************
                Set StrCel = .Cells(CriteriaCel.Row, "E")
                If StrCel = "" Then GoTo NextCriteriaCel
                 
                Terms = Split(StrCel, ";")
                 
                If Not IsArray(Terms) Then
                    Set Found = Range("D:E").Find(What:=Terms, After:=StrCel, SearchDirection:=xlNext, SearchOrder:=xlByColumns)
                    If Not Found.Address = StrCel.Address Then
                        StrCel = ""
                         
                         '****************For test puroposes only:
                        CriteriaCel.Interior.ColorIndex = 3
                         '****************
                         
                    End If
                Else
                    For Each Term In Terms
                        Set Found = Range("D:E").Find(What:=Term, After:=StrCel, SearchDirection:=xlNext, SearchOrder:=xlByColumns)
                        If Not Found.Address = StrCel.Address Then
                            StrCel = Replace(StrCel, Term, "")
                            StrCel = Replace(StrCel, ";;", ";")
                            If Right(StrCel, 1) = ";" Then StrCel = Left(StrCel, Len(StrCel) - 1)
                            If Left(StrCel, 1) = ";" Then StrCel = Mid(StrCel, 2)
                             
                             '****************For test puroposes only:
                            CriteriaCel.Interior.ColorIndex = 3
                             '****************
                             
                        End If
                    Next Term
                     
                End If
                 
    NextCriteriaCel:
                Set CriteriaCel = CriteriaRng.Find(What:="TRUE", After:=CriteriaCel, _
                SearchDirection:=xlNext)
            Loop While CriteriaCel.Address <> FirstCriteriaAddress
        End With
        SpeedCode
    End Sub
    I expect the student to do their homework and find all the errrors I leeve in.


    Please take the time to read the Forum FAQ

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •