PDA

View Full Version : [SOLVED:] Extract data from Word to Excel



Vekmaa
10-26-2018, 12:20 PM
Hello,

I am a noob with VBA and have written a few basic macros to perform calculations in Excel. However, the task I am currently dealing with is more daunting.

Basically, I have multiple files and folders containing Word documents that are all form letters. I need to extract the fie number, date Month Day, Year(convert to dd/mm/yyyy), and business name into their own columns. The data is in the same location in very file, the first line is the file number, the third line is the date, the fifth line is the business name.

If possible link file number to Word document.

Any help or links to tutorials or reading materials would be greatly appreciated.

Sincerely,
Vekmaa

macropod
10-26-2018, 01:00 PM
This kind of thing has been addressed in numerous threads in the Integration/Automation of Office Applications Help forum, which is where your question has now been moved to. I suggest you do a search there.

gmayor
10-27-2018, 02:14 AM
See http://www.gmayor.com/ExtractDataFromForms.htm (http://www.gmayor.com/ExtractDataFromForms.htm) and https://www.gmayor.com/extract_data_from_similar_docs.htm

Vekmaa
10-29-2018, 11:59 AM
Thank you for the links/suggestions.

Vekmaa
10-30-2018, 08:45 AM
Hi,

I made some modifications to this code; however, I still needs some help to get it perform the tasks I need it to do.
It finds the data and populates the spreadsheet, but how do I get the program to iterate through hundreds of word documents, both *.doc and *.docx?

I referenced a few books and online resources, but the code I tried open word on screen and had to be manually closed. My local library's books are from the late 90's, the syntax has likely changed. Is there a method for it to run in the background and iterate through the files in a folder? Please help!

Thank you,
Vekmaa


Option Explicit

Sub GrabUsage()
Dim FName As String, FD As FileDialog
Dim WApp As Object, WDoc As Object, WDR As Object
Dim ExR As Range
Set ExR = Selection ' current location in Excel Sheet
'let's select the WORD doc
Set FD = Application.FileDialog(msoFileDialogOpen)
FD.Show
If FD.SelectedItems.Count <> 0 Then
FName = FD.SelectedItems(1)
Else
Exit Sub
End If

' open Word application and load doc
Set WApp = CreateObject("Word.Application")
' WApp.Visible = True
Set WDoc = WApp.Documents.Open(FName)
'=====================================================================
' go home and search
WApp.Selection.HomeKey Unit:=6
WApp.Selection.Find.ClearFormatting
WApp.Selection.Find.Execute "File: "
'move cursor from find to final data item
' WApp.Selection.MoveDown Unit:=5, Count:=1
WApp.Selection.MoveRight Unit:=1, Count:=1
'the miracle happens here
WApp.Selection.MoveRight Unit:=1, Count:=8, Extend:=1
'grab and put into excel
Set WDR = WApp.Selection
ExR(1, 1) = WDR ' place at Excel cursor
'=====================================================================
'repeat
WApp.Selection.HomeKey Unit:=6
WApp.Selection.Find.ClearFormatting
'WApp.Selection.Find.Execute "File: "

'move cursor from find to final data item
WApp.Selection.MoveDown Unit:=wdLine, Count:=2
'WApp.Selection.MoveRight Unit:=1, Count:=1

'the miracle happens here
WApp.Selection.MoveRight Unit:=1, Count:=16, Extend:=1
'grab and put into excel
Set WDR = WApp.Selection
ExR(1, 2) = WDR ' place in cell right of Excel cursor
'======================================================================
'repeat
WApp.Selection.HomeKey Unit:=6
WApp.Selection.Find.ClearFormatting
'WApp.Selection.Find.Execute "File: "

'move cursor from find to final data item
WApp.Selection.MoveDown Unit:=wdLine, Count:=3
WApp.Selection.MoveRight Unit:=1, Count:=1

'the miracle happens here
WApp.Selection.MoveRight Unit:=wdWord, Count:=4, Extend:=wdExtend
'grab and put into excel
Set WDR = WApp.Selection
ExR(1, 3) = WDR ' place in cell right of Excel cursor
'======================================================================
'repeat
WApp.Selection.HomeKey Unit:=6
WApp.Selection.Find.ClearFormatting
WApp.Selection.Find.Execute "Licence No. "

'move cursor from find to final data item
' WApp.Selection.MoveDown Unit:=5, Count:=1
WApp.Selection.MoveRight Unit:=1, Count:=1

'the miracle happens here
WApp.Selection.MoveRight Unit:=1, Count:=6, Extend:=1
'grab and put into excel
Set WDR = WApp.Selection
ExR(1, 4) = WDR ' place in cell right of Excel cursor
'======================================================================
WDoc.Close
WApp.Quit
End Sub

macropod
10-30-2018, 02:30 PM
Try the following Excel macro. It assumes your references to 'line' 1, 3, & 5, are referring to paragraphs 1, 3, & 5; if not you need to provide clear advice on what these 'lines' are and how they're differentiated (e.g. via manual line breaks).

Sub GetDocData()
'Note: this code requires a reference to the Word object model.
'See under the VBE's Tools|References.
Application.ScreenUpdating = False
Dim wdApp As New Word.Application, wdDoc As Word.Document
Dim strFolder As String, strFile As String, r As Long
strFolder = GetFolder: If strFolder = "" Then GoTo ErrExit
strFile = Dir(strFolder & "\*.doc", vbNormal)
While strFile <> ""
Set wdDoc = wdApp.Documents.Open(Filename:=strFolder & "\" & strFile, AddToRecentFiles:=False, Visible:=False)
r = r + 1
With wdDoc
ActiveSheet.Range("A" & r).Value = Split(.Paragraphs(1).Range.Text, vbCr)(0)
ActiveSheet.Range("B" & r).Value = Format(CDate(Trim(Split(.Paragraphs(3).Range.Text, vbCr)(0))), "DD/MM/YY")
ActiveSheet.Range("C" & r).Value = Split(.Paragraphs(5).Range.Text, vbCr)(0)
.Close SaveChanges:=False
End With
strFile = Dir()
Wend
ErrExit:
wdApp.Quit
Set wdDoc = Nothing: Set wdApp = Nothing
Application.ScreenUpdating = True
End Sub

Function GetFolder() As String
Dim oFolder As Object
GetFolder = ""
Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0, "Choose a folder", 0)
If (Not oFolder Is Nothing) Then GetFolder = oFolder.Items.Item.Path
Set oFolder = Nothing
End Function

Vekmaa
11-05-2018, 11:38 AM
Hello,

Thank you for the reply. I have go through hundreds of word docs and docx to extract the File 1234567,the DATE, the company name, and the Licene no. 123456(some docs Licence Number 12346).
Example document below.



File: 1234567

January 01, 2018

THE CORPORATION OF CORPORATIONS
PO BOX 123
SOMETOWN, BC A1B 2C3

Dear Sir or Madam:
Re: Some words about Licence No. 123456
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure
dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Yours truly,

Some Bureaucratic Administrator
Some Region

macropod
11-05-2018, 02:13 PM
It is impossible to tell from your post what differentiates the lines and, as you haven't said the result you're getting are wrong and, if so, how, I'm not in a position to give further advice.

Vekmaa
11-05-2018, 03:26 PM
Apologies for the lack of clarity. The lines 1, 3, 5 and 10 are the positions of the data I am trying to extract. But for lines 1 and 10 I only need the 7 digit and 6 digit numbers respectively. Not sure if that is more clear. :(

macropod
11-05-2018, 03:33 PM
This is the first time you've mentioned getting anything from 'line' 10 and you still haven't said what problems, if any, you're having getting data from the other 'lines'.

Vekmaa
11-07-2018, 09:41 AM
Macropod, thank you for your patience. The code I posted earlier works in extracting the data I am after. The issue is it opens a single document and then closes, so I was seeking
information on getting the macro to open a folder and run through all the docs in the folder, rather than having to run the macro for each individual file.


Sincerely,
Vekmaa

macropod
11-07-2018, 03:47 PM
The code I posted shows how to get data from a whole folder of documents; all you need do is correctly specify the 'lines' to process - something you seem determined not to clarify.