Consulting

Results 1 to 10 of 10

Thread: Extracting text from Word to Excel between "start" and "end" words

  1. #1
    VBAX Newbie
    Joined
    Sep 2010
    Posts
    3
    Location

    Extracting text from Word to Excel between "start" and "end" words

    Hi,

    I'm a novice VBA user in need of some help. I'm trying to extract sections of text (paragraphs) bounded by a "start" word and "end" word from a 150+ page word (Office 2003) document to an excel document. The document has a cover sheet, table of contents the with the paragraphs in question in the middle of this document. The paragraphs are always bounded by the same "start" word string that's enumerated and the same "end" word.

    START_1001
    Blah
    Blah
    Blah
    END_OF_PARAGRAPH

    START_1002
    Blah
    Blah
    Blah
    END_OF_PARAGRAPH


    I've attempted a section of code that somewhat works. The extract to excel part is troublesome as I can't seem to be able to index the row count properly not to have overlapping pasting of the paragraphs.

    Also I'm having trouble keeping the loop going from paragraph to paragraph to the end of the word document before closing the excel doc. How do you know you're at the end of the document?

    When the paragraph contains pictures in the word doc, when copied and pasted in excel the pictures are pasted on top of one another and on top of the text. Can this be corrected?

    Thanks in advance for the help.


    Option Explicit
    
    Sub CopyRequirementsBetweenWords()
    Dim appExcel As Object
        Dim objSheet As Object
        Dim aRange As Range
        Dim Endofline As Object
        Dim intRowCount As Integer
        Dim LineCount As Integer
    intRowCount = 1
    Do Until ActiveDocument.Bookmarks("\Sel").Range.End = _
        ActiveDocument.Bookmarks("\EndOfDoc").Range.End             'This is the "End of document" tag
    With Selection.Find
                 .Text = "START_"     'This is the "Start" word for the search
                 .Forward = True
                 .Wrap = wdFindContinue
                 .Format = False
                 .MatchCase = False
                 .MatchWholeWord = False
                 .MatchAllWordForms = False
                 .MatchSoundsLike = False
                 .MatchWildcards = True
            End With
    Selection.Find.Execute
            Selection.MoveLeft Unit:=wdCharacter, Count:=1
    Do
                  Selection.EndKey Unit:=wdLine, Extend:=wdExtend
    'This is the "End" word for the search
                  If InStr(1, Selection.Text, "END_") Then Exit Do
    Selection.MoveDown Unit:=wdLine, Extend:=wdExtend
                  LineCount = LineCount + 1
            Loop
    Selection.Copy
    If objSheet Is Nothing Then
                    Set appExcel = CreateObject("Excel.Application")
    'Change the file path to match the location of your test.xls
                    Set objSheet = appExcel.workbooks.Open("C:\temp\test1.xls").Sheets("Sheet1")
    intRowCount = 1
                End If
    objSheet.Cells(intRowCount, 1).Select
            objSheet.Paste
            intRowCount = intRowCount + LineCount
    Selection.MoveDown Unit:=wdLine, Count:=LineCount
    Loop    'Loop to next requirement for copy and pasting into Excel document
    If Not objSheet Is Nothing Then
            appExcel.workbooks(1).Close True
            appExcel.Quit
            Set objSheet = Nothing
            Set appExcel = Nothing
        End If
        Set aRange = Nothing
    End Sub
    Last edited by Aussiebear; 04-19-2023 at 08:22 PM. Reason: Adjusted the code tags

  2. #2
    VBAX Expert Tinbendr's Avatar
    Joined
    Jun 2005
    Location
    North Central Mississippi (The Pines)
    Posts
    993
    Location
    Welcome to VBA Express!

    Try this.

    Add
    Dim LastRow As Long
    Then

    LastRow = objSheet.Range("A65536").End(xlUp).Row
    objSheet.Cells(LastRow + 1, 1).Select
    objSheet.Paste 
    'intRowCount = intRowCount + LineCount
    'Selection.MoveDown Unit:=wdLine, Count:=LineCount
    David
    Last edited by Aussiebear; 04-19-2023 at 08:23 PM. Reason: Adjusted the code tags

  3. #3
    VBAX Newbie
    Joined
    Sep 2010
    Posts
    3
    Location
    Hi David,

    I've actually worked out some fixes to my code issues. The loop works to continue to go through to the next paragraph, and I was able to delete the superimposed pictures.

    The issue now is that if there is a table in the paragraph, the code enters the paragraph but gets stuck in an infinite loop right before entering the table to start copying - not sure how to fix this... Any thoughts?

    Thanks!
    Pat


    Option Explicit
    Sub CopyRequirementsBetweenWords()
    ' This macro copies requirements from a source document using a "Start" and "End" key word that defines a range
    ' and exports this selection to an Excel file.
    Dim appExcel As Object
        Dim objSheet As Object
        Dim aRange As Range
        Dim Endofline As Object
        Dim intRowCount As Integer
        Dim LineCount As Integer
        Dim StartText As String
        Dim EndText As String
        Dim OutputFileName As String
    intRowCount = 1
    'Define search parameters
        StartText = InputBox("Please enter your Start word")  'This is the "Start" word for the search
        EndText = InputBox("Please enter your End Word") 'This is the "End" word for the search
    'Define output parameters
        OutputFileName = InputBox("Please enter your Output File path, file name and extension (Ex: C:\Temp\Test.xls)")
    Do Until ActiveDocument.Bookmarks("\Sel").Range.End = _
        ActiveDocument.Bookmarks("\EndOfDoc").Range.End             'This is the "End of document" tag
    With Selection.Find
                 .Text = StartText     'This is the "Start" word for the search from the Inputbox
                 .Forward = True
                 .Wrap = wdFindContinue
                 .Format = False
                 .MatchCase = False
                 .MatchWholeWord = False
                 .MatchAllWordForms = False
                 .MatchSoundsLike = False
                 .MatchWildcards = True
                 LineCount = 1
            End With
    Selection.Find.Execute
            Selection.MoveLeft Unit:=wdCharacter, Count:=1
    Do
                  Selection.EndKey Unit:=wdLine, Extend:=wdExtend
    'This is the "End" word for the search from the Inputbox
                  If InStr(1, Selection.Text, EndText) Then Exit Do
                  'This section does not work with Word tables...cursor does not enter the table to start copying
                  Selection.MoveDown Unit:=wdLine, Extend:=wdExtend
                  LineCount = LineCount + 1
            Loop
    Selection.Copy
    LineCount = LineCount + 3
    If objSheet Is Nothing Then
                    Set appExcel = CreateObject("Excel.Application")
    'Change the file path to match the location of your test.xls
                    Set objSheet = appExcel.workbooks.Open(OutputFileName).Sheets("Sheet1")
    intRowCount = 1
                End If
    objSheet.Cells(intRowCount, 1).Select
            objSheet.Paste
            intRowCount = intRowCount + LineCount
    Selection.MoveDown Unit:=wdLine, Count:=1
    'Delete the newly pasted pictures in Excel
    On Error Resume Next
        objSheet.DrawingObjects.Visible = True     '"objSheet" enables stuff to happen in Excel
        objSheet.DrawingObjects.Delete
        On Error GoTo 0
    objSheet.Cells.ClearComments
    Loop    'Loop to next requirement for copy and pasting into Excel document
    If Not objSheet Is Nothing Then
            appExcel.workbooks(1).Close True
            appExcel.Quit
            Set objSheet = Nothing
            Set appExcel = Nothing
        End If
        Set aRange = Nothing
    End Sub
    Last edited by Aussiebear; 04-19-2023 at 08:24 PM. Reason: Adjusted the code tags

  4. #4
    VBAX Wizard
    Joined
    May 2004
    Posts
    6,713
    Location
    Let me see if I have this correctly.

    You search for the starting word (StartText), and then, if found, move one character to the left.

    Selection.MoveLeft Unit:=wdCharacter, Count:=1
    And then

    Do 
                Selection.EndKey Unit:=wdLine, Extend:=wdExtend 
    'This is the "End" word for the search from the Inputbox
                If InStr(1, Selection.Text, EndText) Then Exit Do 
                 'This section does not work with Word tables...
    'cursor does not enter the table to start copying
                Selection.MoveDown Unit:=wdLine, Extend:=wdExtend 
                LineCount = LineCount + 1 
            Loop

    1. extend the selection to the line
    2. check to see if current Selection contains the ending word (EndText) and if not move down a line and repeat #2.

    YIKES!

    Is this correct? Is this what you are doing?
    Last edited by Aussiebear; 04-19-2023 at 08:25 PM. Reason: Adjusted the code tags

  5. #5
    VBAX Wizard
    Joined
    May 2004
    Posts
    6,713
    Location
    Question. Say Endtext is a word in a table. What do you want to happen? The whole table is used? Part of it up to EndText?

    Using Selection like this is very very inefficient.

  6. #6
    VBAX Newbie
    Joined
    Sep 2010
    Posts
    3
    Location
    Hi Fumei,

    I'm sure that the bulk of my code is far from efficient as I've only been using VBA for a few days. The logic behind the "start" word find was to look for it, identify it then go the beginning of the word then highlight the word and subsequent text up to and including the "end" word.

    Once I find the word and it's highlighted I move left one character, it actually moves the cursor to the beginning of the word.

    Selection.MoveLeft Unit:=wdCharacter, Count:=1
    I tried extending the selection to the line, but my implementation must be wrong as it highlights the entire document...

    Normally the "end text" would not be found in the table as it's a single line, but if it was I'd like to copy the entire table for pasting into Excel.

    -Pat
    Last edited by Aussiebear; 04-19-2023 at 08:26 PM. Reason: Adjusted the code tags

  7. #7
    VBAX Expert Tinbendr's Avatar
    Joined
    Jun 2005
    Location
    North Central Mississippi (The Pines)
    Posts
    993
    Location
    Can you upload two sample files to help resolve all the issues? One with the Word doc and one for Excel. In the Excel file, show the expected output.

    Zip and upload using Manage Attachments in Additional Options.

    Just a comment on your code so far.

    Learn to use ranges instead of Selection. The macro recorder uses Selection exclusively, but it's very inefficient and messy.

    David


  8. #8
    VBAX Newbie
    Joined
    Oct 2011
    Posts
    1
    Location

    Hi All

    I am new to this forum, and have found the answer to the above mentioned issue. Please let me know if i can post it

  9. #9
    Although this thread is quite old, your approach would be very much appreciated!

  10. #10
    VBAX Expert Dave's Avatar
    Joined
    Mar 2005
    Posts
    832
    Location
    Hi John Doe and Welcome to this forum! If U check the Knowledge Base of this form for "Extract and Transfer Paragraphs" might be helpful. I was the contributor. Good luck. Dave

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •