Consulting

Results 1 to 12 of 12

Thread: Extract data from Word to Excel

  1. #1
    VBAX Regular
    Joined
    Oct 2018
    Posts
    6
    Location

    Extract data from Word to Excel

    Hello,

    I am a noob with VBA and have written a few basic macros to perform calculations in Excel. However, the task I am currently dealing with is more daunting.

    Basically, I have multiple files and folders containing Word documents that are all form letters. I need to extract the fie number, date Month Day, Year(convert to dd/mm/yyyy), and business name into their own columns. The data is in the same location in very file, the first line is the file number, the third line is the date, the fifth line is the business name.

    If possible link file number to Word document.

    Any help or links to tutorials or reading materials would be greatly appreciated.

    Sincerely,
    Vekmaa

  2. #2
    Knowledge Base Approver VBAX Guru macropod's Avatar
    Joined
    Jul 2008
    Posts
    4,435
    Location
    This kind of thing has been addressed in numerous threads in the Integration/Automation of Office Applications Help forum, which is where your question has now been moved to. I suggest you do a search there.
    Cheers
    Paul Edstein
    [Fmr MS MVP - Word]

  3. #3
    Graham Mayor - MS MVP (Word) 2002-2019
    Visit my web site for more programming tips and ready made processes
    http://www.gmayor.com

  4. #4
    VBAX Regular
    Joined
    Oct 2018
    Posts
    6
    Location
    Thank you for the links/suggestions.

  5. #5
    VBAX Regular
    Joined
    Oct 2018
    Posts
    6
    Location
    Hi,

    I made some modifications to this code; however, I still needs some help to get it perform the tasks I need it to do.
    It finds the data and populates the spreadsheet, but how do I get the program to iterate through hundreds of word documents, both *.doc and *.docx?

    I referenced a few books and online resources, but the code I tried open word on screen and had to be manually closed. My local library's books are from the late 90's, the syntax has likely changed. Is there a method for it to run in the background and iterate through the files in a folder? Please help!

    Thank you,
    Vekmaa

    Option Explicit
    
    Sub GrabUsage()
    Dim FName As String, FD As FileDialog
    Dim WApp As Object, WDoc As Object, WDR As Object
    Dim ExR As Range
        Set ExR = Selection ' current location in Excel Sheet
        'let's select the WORD doc
        Set FD = Application.FileDialog(msoFileDialogOpen)
        FD.Show
        If FD.SelectedItems.Count <> 0 Then
            FName = FD.SelectedItems(1)
        Else
            Exit Sub
        End If
    
        ' open Word application and load doc
        Set WApp = CreateObject("Word.Application")
        ' WApp.Visible = True
        Set WDoc = WApp.Documents.Open(FName)
    '=====================================================================
            ' go home and search
                WApp.Selection.HomeKey Unit:=6
                WApp.Selection.Find.ClearFormatting
                WApp.Selection.Find.Execute "File: "
                'move cursor from find to final data item
                    ' WApp.Selection.MoveDown Unit:=5, Count:=1
                    WApp.Selection.MoveRight Unit:=1, Count:=1
                'the miracle happens here
                    WApp.Selection.MoveRight Unit:=1, Count:=8, Extend:=1
        'grab and put into excel
            Set WDR = WApp.Selection
            ExR(1, 1) = WDR ' place at Excel cursor
    '=====================================================================
            'repeat
                WApp.Selection.HomeKey Unit:=6
                WApp.Selection.Find.ClearFormatting
                'WApp.Selection.Find.Execute "File: "
        
                'move cursor from find to final data item
                    WApp.Selection.MoveDown Unit:=wdLine, Count:=2
                    'WApp.Selection.MoveRight Unit:=1, Count:=1
        
                'the miracle happens here
                    WApp.Selection.MoveRight Unit:=1, Count:=16, Extend:=1
        'grab and put into excel
            Set WDR = WApp.Selection
            ExR(1, 2) = WDR ' place in cell right of Excel cursor
    '======================================================================
            'repeat
                WApp.Selection.HomeKey Unit:=6
                WApp.Selection.Find.ClearFormatting
                'WApp.Selection.Find.Execute "File: "
        
                'move cursor from find to final data item
                    WApp.Selection.MoveDown Unit:=wdLine, Count:=3
                    WApp.Selection.MoveRight Unit:=1, Count:=1
        
                'the miracle happens here
                    WApp.Selection.MoveRight Unit:=wdWord, Count:=4, Extend:=wdExtend
        'grab and put into excel
            Set WDR = WApp.Selection
            ExR(1, 3) = WDR ' place in cell right of Excel cursor
    '======================================================================
            'repeat
                WApp.Selection.HomeKey Unit:=6
                WApp.Selection.Find.ClearFormatting
                WApp.Selection.Find.Execute "Licence No. "
        
                'move cursor from find to final data item
                    ' WApp.Selection.MoveDown Unit:=5, Count:=1
                    WApp.Selection.MoveRight Unit:=1, Count:=1
        
                'the miracle happens here
                    WApp.Selection.MoveRight Unit:=1, Count:=6, Extend:=1
        'grab and put into excel
            Set WDR = WApp.Selection
            ExR(1, 4) = WDR ' place in cell right of Excel cursor
    '======================================================================
        WDoc.Close
        WApp.Quit
    End Sub

  6. #6
    Knowledge Base Approver VBAX Guru macropod's Avatar
    Joined
    Jul 2008
    Posts
    4,435
    Location
    Try the following Excel macro. It assumes your references to 'line' 1, 3, & 5, are referring to paragraphs 1, 3, & 5; if not you need to provide clear advice on what these 'lines' are and how they're differentiated (e.g. via manual line breaks).
    Sub GetDocData()
    'Note: this code requires a reference to the Word object model.
    'See under the VBE's Tools|References.
    Application.ScreenUpdating = False
    Dim wdApp As New Word.Application, wdDoc As Word.Document
    Dim strFolder As String, strFile As String, r As Long
    strFolder = GetFolder: If strFolder = "" Then GoTo ErrExit
    strFile = Dir(strFolder & "\*.doc", vbNormal)
    While strFile <> ""
      Set wdDoc = wdApp.Documents.Open(Filename:=strFolder & "\" & strFile, AddToRecentFiles:=False, Visible:=False)
      r = r + 1
      With wdDoc
        ActiveSheet.Range("A" & r).Value = Split(.Paragraphs(1).Range.Text, vbCr)(0)
        ActiveSheet.Range("B" & r).Value = Format(CDate(Trim(Split(.Paragraphs(3).Range.Text, vbCr)(0))), "DD/MM/YY")
        ActiveSheet.Range("C" & r).Value = Split(.Paragraphs(5).Range.Text, vbCr)(0)
        .Close SaveChanges:=False
      End With
      strFile = Dir()
    Wend
    ErrExit:
    wdApp.Quit
    Set wdDoc = Nothing: Set wdApp = Nothing
    Application.ScreenUpdating = True
    End Sub
     
    Function GetFolder() As String
        Dim oFolder As Object
        GetFolder = ""
        Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0, "Choose a folder", 0)
        If (Not oFolder Is Nothing) Then GetFolder = oFolder.Items.Item.Path
        Set oFolder = Nothing
    End Function
    Cheers
    Paul Edstein
    [Fmr MS MVP - Word]

  7. #7
    VBAX Regular
    Joined
    Oct 2018
    Posts
    6
    Location
    Hello,

    Thank you for the reply. I have go through hundreds of word docs and docx to extract the File 1234567,the DATE, the company name, and the Licene no. 123456(some docs Licence Number 12346).
    Example document below.


    HTML Code:
    File:  1234567
    
    January 01, 2018
    
    THE CORPORATION OF CORPORATIONS
    PO BOX 123
    SOMETOWN, BC  A1B 2C3
    
    Dear Sir or Madam:
    Re:  Some words about Licence No. 123456
    Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. 
    Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure 
    dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non 
    proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
    
    Yours truly,
     
    Some Bureaucratic Administrator
    Some Region

  8. #8
    Knowledge Base Approver VBAX Guru macropod's Avatar
    Joined
    Jul 2008
    Posts
    4,435
    Location
    It is impossible to tell from your post what differentiates the lines and, as you haven't said the result you're getting are wrong and, if so, how, I'm not in a position to give further advice.
    Cheers
    Paul Edstein
    [Fmr MS MVP - Word]

  9. #9
    VBAX Regular
    Joined
    Oct 2018
    Posts
    6
    Location
    Apologies for the lack of clarity. The lines 1, 3, 5 and 10 are the positions of the data I am trying to extract. But for lines 1 and 10 I only need the 7 digit and 6 digit numbers respectively. Not sure if that is more clear.

  10. #10
    Knowledge Base Approver VBAX Guru macropod's Avatar
    Joined
    Jul 2008
    Posts
    4,435
    Location
    This is the first time you've mentioned getting anything from 'line' 10 and you still haven't said what problems, if any, you're having getting data from the other 'lines'.
    Cheers
    Paul Edstein
    [Fmr MS MVP - Word]

  11. #11
    VBAX Regular
    Joined
    Oct 2018
    Posts
    6
    Location
    Macropod, thank you for your patience. The code I posted earlier works in extracting the data I am after. The issue is it opens a single document and then closes, so I was seeking
    information on getting the macro to open a folder and run through all the docs in the folder, rather than having to run the macro for each individual file.


    Sincerely,
    Vekmaa

  12. #12
    Knowledge Base Approver VBAX Guru macropod's Avatar
    Joined
    Jul 2008
    Posts
    4,435
    Location
    The code I posted shows how to get data from a whole folder of documents; all you need do is correctly specify the 'lines' to process - something you seem determined not to clarify.
    Cheers
    Paul Edstein
    [Fmr MS MVP - Word]

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •