Consulting

Results 1 to 14 of 14

Thread: Complicated Find

  1. #1
    VBAX Regular
    Joined
    Dec 2015
    Posts
    88
    Location

    Complicated Find

    Hi All

    I've 2 Word files, for example, one for the source language (FileA) that was translated into the other Word file (FileB).

    I need a macro to do the below jobs:

    - Run the Macro to extract all the content of (FileA) as list of words and remove any repeated words to just keep the unique ones.

    - Use that Words list extracted from the previous step to search on (FileB) and if found any of that words, highlight it inside (FileB).

    The idea is simple, and I need it to make sure that the second translated file (FileB) is fully translated and that the Translator didn't miss translating any words from the source file (FileA).

  2. #2
    The following will do what you requested. The code assumes that FileA has each word to check on a separate line. The source is read into a collection of unique words and that collection is used to check the document FileB.

    Sub Macro1()
    'Graham Mayor - https://www.gmayor.com - Last updated - 17 Sep 2021
    Const sPath As String = "C:\Path\" 'the folder where the two documents are stored.
    Const sSource As String = "FileA.docx"
    Const sTarget As String = "FileB.docx"
    Dim oCol As Collection
    Dim i As Long
    Dim oSource As Document, oTarget As Document
    Dim oPara As Paragraph
    Dim oRng As Range
        Set oSource = Documents.Open(sPath & sSource)
        Set oTarget = Documents.Open(sPath & sTarget)
        Set oCol = New Collection
        On Error Resume Next
        For i = 1 To oSource.Paragraphs.Count
            Set oRng = oSource.Paragraphs(i).Range
            oRng.End = oRng.End - 1
            oCol.Add Trim(oRng.Text), Trim(oRng.Text)
        Next i
        On Error GoTo 0
        Options.DefaultHighlightColorIndex = wdYellow
        For i = 1 To oCol.Count
            Set oRng = oTarget.Range
            With oRng.Find
                .ClearFormatting
                .Text = CStr(oCol(i))
                .Replacement.ClearFormatting
                .Replacement.Text = "^&"
                .Replacement.Highlight = True
                .Forward = True
                .Wrap = wdFindContinue
                .Format = True
                .MatchCase = True
                .MatchWholeWord = True
                .MatchWildcards = False
                .MatchSoundsLike = False
                .MatchAllWordForms = False
                .Execute Replace:=wdReplaceAll
            End With
        Next i
        MsgBox "Check complete!"
    lbl_Exit:
        oSource.Close 0
        Set oSource = Nothing
        Set oTarget = Nothing
        Set oRng = Nothing
        Set oCol = Nothing
        Exit Sub
    End Sub
    Graham Mayor - MS MVP (Word) 2002-2019
    Visit my web site for more programming tips and ready made processes
    http://www.gmayor.com

  3. #3
    VBAX Regular
    Joined
    Dec 2015
    Posts
    88
    Location
    Thanks a lot Graham Mayor for this quick reply.

    Actually the source word file (FileA) is a normal file that has paragraphs, tables, ...etc. So is it possible to add some codes on the beginning of the macro to extract all its content as separate words to find them later on (FileB)?

    Thanks in advance for your help on that.

  4. #4
    VBAX Regular
    Joined
    Dec 2015
    Posts
    88
    Location
    If it's better, I'm sharing with you the attached sample files.FileA.docx --- FileB.docx

  5. #5
    It's a lot more complicated with a normal document as a word source, especially without seeing that document (which you posted after I came up with the code), however the following should be close. It will however take some time to run:

    P.S. I have checked it against your samples and it appears to work


    Sub GetWords()
    'Graham Mayor - https://www.gmayor.com - Last updated - 17 Sep 2021
    Const sPath As String = "C:\Path\"    'the folder where the two documents are stored.
    Const sSource As String = "FileA.docx"
    Const sTarget As String = "FileB.docx"
    Dim oSource As Document, oTarget As Document
    Dim oStory As Range, oRng As Range
    Dim oCol As Collection
    Dim oWord As Object
    Dim sText As String
    Dim i As Long
        Set oCol = New Collection
        Set oSource = Documents.Open(sPath & sSource)
        Set oTarget = Documents.Open(sPath & sTarget)
    
        On Error Resume Next
        For Each oStory In oSource.StoryRanges
            For Each oWord In oStory.Words
                sText = Trim(oWord)
                If OnlyLetters(sText) = True Then
                    oCol.Add sText, sText
                End If
                DoEvents
            Next oWord
            If oStory.StoryType <> wdMainTextStory Then
                While Not (oStory.NextStoryRange Is Nothing)
                    Set oStory = oStory.NextStoryRange
                    For Each oWord In oStory.Words
                        sText = Trim(oWord)
                        If OnlyLetters(sText) = True Then
                            oCol.Add sText, sText
                        End If
                        DoEvents
                    Next oWord
                Wend
            End If
            DoEvents
        Next oStory
        On Error GoTo 0
        Options.DefaultHighlightColorIndex = wdYellow
        For i = 1 To oCol.Count
            Set oRng = oTarget.Range
            With oRng.Find
                .ClearFormatting
                .Text = CStr(oCol(i))
                .Replacement.ClearFormatting
                .Replacement.Text = "^&"
                .Replacement.Highlight = True
                .Forward = True
                .Wrap = wdFindContinue
                .Format = True
                .MatchCase = True
                .MatchWholeWord = True
                .MatchWildcards = False
                .MatchSoundsLike = False
                .MatchAllWordForms = False
                .Execute Replace:=wdReplaceAll
            End With
            DoEvents
        Next i
        MsgBox "Check complete!"
    lbl_Exit:
        oSource.Close 0
        Set oWord = Nothing
        Set oSource = Nothing
        Set oTarget = Nothing
        Set oRng = Nothing
        Set oCol = Nothing
        Exit Sub
    End Sub
    
    Private Function OnlyLetters(sWord As String) As Boolean
    Dim intPos As Integer
        For intPos = 1 To Len(sWord)
            Select Case Asc(Mid(sWord, intPos, 1))
                Case 65 To 90, 97 To 122
                    OnlyLetters = True
                Case Else
                    OnlyLetters = False
                    Exit For
            End Select
        Next
    lbl_Exit:
        Exit Function
    End Function
    Graham Mayor - MS MVP (Word) 2002-2019
    Visit my web site for more programming tips and ready made processes
    http://www.gmayor.com

  6. #6
    VBAX Regular
    Joined
    Dec 2015
    Posts
    88
    Location
    Dear Graham

    That's
    fantastic, I tested it and it works with the below notes:

    - It didn't detect this word
    Plaintiff’s from the source file (FileA), I mean that it's found also on (FileB) but it's not highlighted.

    - This great macro didn't run or work on the text inside the text box which is located under the table.

    Is there a solution for that 2 issues?


  7. #7
    Words with apostrophes can be accommodated easily enough by adding the apostrophe characters to the OnlyGetLetters function.
    I have added unique words to the collection, but PLAINTIFF and plaintiff are considered to be the same word, so the practical solution is to change the search to ignore case.
    Text boxes are frankly a pain as they are essentially graphics and not in the text story ranges. The following will also handle text boxes.
    There are lots of instances of the letter 'a' highlighted. I have eliminated them with the two entries
    If OnlyLetters(sText) = True And Len(sText) > 1 Then
    If you want the 'a' instances highlighted remove the following bit from those two lines.
    And Len(sText) > 1

    You can find my web site at https://www.gmayor.com/Word_pages.htm if you wish to demonstrate your appreciation for the work that went into this.

    Sub GetWords()
    'Graham Mayor - https://www.gmayor.com - Last updated - 17 Sep 2021
    Const sPath As String = "C:\Path\"    'the folder where the two documents are stored.
    Const sSource As String = "FileA.docx"
    Const sTarget As String = "FileB.docx"
    Dim oSource As Document, oTarget As Document
    Dim oStory As Range, oRng As Range
    Dim oShp As Shape
    Dim oCol As Collection
    Dim oWord As Object
    Dim sText As String
    Dim i As Long
        Set oCol = New Collection
        Set oSource = Documents.Open(sPath & sSource)
        Set oTarget = Documents.Open(sPath & sTarget)
    
        On Error Resume Next
    
        For Each oStory In oSource.StoryRanges
            Select Case oStory.StoryType
                Case 1 To 11
                    Do
                        For Each oWord In oStory.Words
                            sText = Trim(oWord)
                            If OnlyLetters(sText) = True And Len(sText) > 1 Then
                                oCol.Add sText, sText
                            End If
                            DoEvents
                        Next oWord
                        DoEvents
    
                        Select Case oStory.StoryType
                            Case 6, 7, 8, 9, 10, 11
                                If oStory.ShapeRange.Count > 0 Then
                                    For Each oShp In oStory.ShapeRange
                                        If oShp.TextFrame.HasText Then
                                            For Each oWord In oShp.TextFrame.TextRange.Words
                                                sText = Trim(oWord)
                                                If OnlyLetters(sText) = True And Len(sText) > 1 Then
                                                    oCol.Add sText, sText
                                                End If
                                                DoEvents
                                            Next oWord
                                        End If
                                        DoEvents
                                    Next oShp
                                End If
                            Case Else
                                'Do Nothing
                        End Select
                        'Get next linked story (if any)
                        Set oStory = oStory.NextStoryRange
                    Loop Until oStory Is Nothing
                Case Else
            End Select
            DoEvents
        Next oStory
        Options.DefaultHighlightColorIndex = wdYellow
        For i = 1 To oCol.Count
    Debug.Print oCol(i)
            For Each oStory In oTarget.StoryRanges
                Select Case oStory.StoryType
                    Case 1 To 11
                        Do
                            With oStory.Find
                                .ClearFormatting
                                .Text = CStr(oCol(i))
                                .Replacement.ClearFormatting
                                .Replacement.Text = "^&"
                                .Replacement.Highlight = True
                                .Forward = True
                                .Wrap = wdFindContinue
                                .Format = True
                                .MatchCase = False
                                .MatchWholeWord = True
                                .MatchWildcards = False
                                .MatchSoundsLike = False
                                .MatchAllWordForms = False
                                .Execute Replace:=wdReplaceAll
                            End With
                            DoEvents
    
                            Select Case oStory.StoryType
                                Case 6, 7, 8, 9, 10, 11
                                    If oStory.ShapeRange.Count > 0 Then
                                        For Each oShp In oStory.ShapeRange
                                            If oShp.TextFrame.HasText Then
                                                Set oRng = oShp.TextFrame.TextRange
                                                With oRng.Find
                                                    .ClearFormatting
                                                    .Text = CStr(oCol(i))
                                                    .Replacement.ClearFormatting
                                                    .Replacement.Text = "^&"
                                                    .Replacement.Highlight = True
                                                    .Forward = True
                                                    .Wrap = wdFindContinue
                                                    .Format = True
                                                    .MatchCase = True
                                                    .MatchWholeWord = True
                                                    .MatchWildcards = False
                                                    .MatchSoundsLike = False
                                                    .MatchAllWordForms = False
                                                    .Execute Replace:=wdReplaceAll
                                                End With
                                            End If
                                            DoEvents
                                        Next oShp
                                    End If
                                Case Else
                                    'Do Nothing
                            End Select
                            'Get next linked story (if any)
                            Set oStory = oStory.NextStoryRange
                        Loop Until oStory Is Nothing
                    Case Else
                End Select
                DoEvents
            Next oStory
        Next i
        MsgBox "Check complete!"
    lbl_Exit:
        oSource.Close 0
        Set oWord = Nothing
        Set oSource = Nothing
        Set oTarget = Nothing
        Set oRng = Nothing
        Set oCol = Nothing
        Exit Sub
    End Sub
    
    Private Function OnlyLetters(sWord As String) As Boolean
    Dim intPos As Integer
        For intPos = 1 To Len(sWord)
            Select Case Asc(Mid(sWord, intPos, 1))
                Case 65 To 90, 97 To 122, 146, 180
                    OnlyLetters = True
                Case Else
                    OnlyLetters = False
                    Exit For
            End Select
        Next
    lbl_Exit:
        Exit Function
    End Function
    Graham Mayor - MS MVP (Word) 2002-2019
    Visit my web site for more programming tips and ready made processes
    http://www.gmayor.com

  8. #8
    VBAX Regular
    Joined
    Dec 2015
    Posts
    88
    Location
    Dear Graham

    Thanks a lot for this effort you did for me, you really helped me a lot.

    I visited your website, but I don't know how
    to demonstrate my appreciation for the work that went into this, Could you please let me know how could I do that?



  9. #9
    There are donation links on every page
    Graham Mayor - MS MVP (Word) 2002-2019
    Visit my web site for more programming tips and ready made processes
    http://www.gmayor.com

  10. #10
    VBAX Regular
    Joined
    Dec 2015
    Posts
    88
    Location
    Okay, I did the donation as shown below:

    https://ibb.co/0Z58BL0
    https://ibb.co/6NzLRD8


    Thanks for your help.

  11. #11
    VBAX Regular
    Joined
    Dec 2015
    Posts
    88
    Location
    Hi Graham,

    Can I ask you for something on that Macro, if the target file has RTL (right to left) language, I noticed that the highlighted words are flipped, Is this issue can be solved?

    I attached here both the sample source and target files, along with a screenshot for the issue.
    Attached Images Attached Images
    Attached Files Attached Files

  12. #12
    VBAX Regular
    Joined
    Dec 2015
    Posts
    88
    Location
    Hi Graham,

    Do you have any clue for the above reported issue?

  13. #13
    I regret I have no experience with right to left languages.
    Graham Mayor - MS MVP (Word) 2002-2019
    Visit my web site for more programming tips and ready made processes
    http://www.gmayor.com

  14. #14
    VBAX Regular
    Joined
    Dec 2015
    Posts
    88
    Location
    No Worries - Thanks for your reply .

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •