PDA

View Full Version : Extracting Word Table cell Information from Multiple .docx Files on an Intranet page



matt stiles
11-23-2016, 09:43 AM
Hey All,

This code works for extracting some Word table cell information from all of the *docx files in the folder C:\Users\John.Smith\Desktop\Form Folder. Can this code be adapted to extract the same Word table cell information from all of the *.docx files on an company intranet page? The page uses .aspx

Please let me know if you have any questions.


Sub wordScrape()


Dim wrdDoc As Object, objFiles As Object, fso As Object, wordApp As Object
Dim sh1 As Worksheet
Dim x As Integer


If ActiveSheet.Name = "IRC Log Short" Then
For Each mytable In Sheet2.ListObjects
mytable.DataBodyRange.ClearContents
Next mytable
End If


FolderName = "C:\Users\John.Smith\Desktop\Form Folder" ' Change this to the folder containing your word documents


Set sh1 = ThisWorkbook.Sheets(1)
Set fso = CreateObject("Scripting.FileSystemObject")
Set wordApp = CreateObject("Word.application")
Set objFiles = fso.GetFolder(FolderName).Files


x = 6
For Each wd In objFiles
If InStr(wd, ".docx") And InStr(wd, "~") = 0 Then
Set wrdDoc = wordApp.Documents.Open(wd.Path, ReadOnly = True)
sh1.Cells(x, 1) = Application.WorksheetFunction.Clean(wrdDoc.Tables(2).Cell(Row:=2, Column:=2).Range)
sh1.Cells(x, 2) = Application.WorksheetFunction.Clean(wrdDoc.Tables(2).Cell(Row:=14, Column:=1).Range)
sh1.Cells(x, 3) = Application.WorksheetFunction.Clean(wrdDoc.Tables(2).Cell(Row:=14, Column:=2).Range)
sh1.Cells(x, 4) = Application.WorksheetFunction.Clean(wrdDoc.Tables(2).Cell(Row:=14, Column:=6).Range)
sh1.Cells(x, 5) = Application.WorksheetFunction.Clean(wrdDoc.Tables(2).Cell(Row:=14, Column:=3).Range)
sh1.Cells(x, 6) = Application.WorksheetFunction.Clean(wrdDoc.Tables(2).Cell(Row:=14, Column:=4).Range)
sh1.Cells(x, 7) = Application.WorksheetFunction.Clean(wrdDoc.Tables(2).Cell(Row:=14, Column:=5).Range)
sh1.Cells(x, 8) = Application.WorksheetFunction.Clean(wrdDoc.Tables(1).Cell(Row:=1, Column:=4).Range)
x = x + 1
wrdDoc.Close
End If


Next wd
wordApp.Quit


Thank you for whatever help you may provide.

Sincerely,

Matt

matt stiles
11-28-2016, 07:04 AM
If it helps, the intranet page is laid out so that all of the file names have the same file naming convention, and they are listed as links that open the document in Word once selected. Each shortcut includes the file name and the file type, so "IRC" and ".docx" can be found in all of the relevant links that need to be opened.

Please let me know if you have any questions.

Sincerely,

Matt