PDA

View Full Version : Extracting data from word to excel



JLEnglish
07-06-2015, 02:26 AM
Hi,
So, basically I have a lot of word files that I'd like to convert into excel files. I have sets of data in the word files that are labeled. I'd like to put anything under them until the next label in a cell then move on to the next cell. I have working code that completes the task for the first set of data. However, I can't seem to get it to loop and extract from multiple files. Ideally, I'd like for it to finish one set of data, go to the next row, repeat, repeat until the end of the word file. Then, I'd like it to make a new tab and do it for the next word file. I've attached two word files and what I'd like it to look like in excel as an example. Thanks for the help.
13871
13872
13873

Here's the code I have:


Sub Macro()
Dim word_ap As Object
Set word_ap = CreateObject("Word.Application")
word_ap.Visible = True

Dim oDoc As Object
Set oDoc = word_ap.Documents.Open(ThisWorkbook.Path & "\vba_01.docx")

Dim p, v, a, s(1 To 7) As String
a = Split("MAN,WOMAN,KID", ",")
For Each p In oDoc.Paragraphs
v = v & vbLf & p.Range.Text
Next

If v Like "*M*KK*dd*" Then
v = Split(v, a(0))
s(1) = v(0)
s(2) = a(0)

v = Split(v(1), a(1))
s(3) = v(0)
s(4) = a(1)

v = Split(v(1), a(2))
s(5) = v(0)
s(6) = a(2)
s(7) = v(1)
Range("a1").Resize(, 7) = s
End If

oDoc.Close
Set oDoc = Nothing

word_ap.Quit False
Set word_ap = Nothing
End Sub

snb
07-06-2015, 05:23 AM
You'd better standardize the data in the Word document first.


Sub M_snb()
sn = Split(ActiveDocument.Content, Chr(12) & vbCr)
ReDim sp(UBound(sn), 6)

For j = 0 To UBound(sn)
sp(j, 0) = Split(sn(j), vbCr)(0)
sq = Split(Replace(sn(j), sp(j, 0) & vbCr, ""), vbCr & vbCr)
For jj = 0 To UBound(sq) - 1
st = Split(sq(jj), vbCr)
sp(j, 1 + 2 * jj) = st(0)
sp(j, 2 + 2 * jj) = Replace(sq(jj), st(0) & vbCr, "")
Next
Next

GetObject(, "excel.application").activeworkbook.sheets(1).Cells(1).Resize(UBound(sp) + 1, UBound(sp, 2)) = sp
End Sub

JLEnglish
07-06-2015, 07:52 AM
Thanks for the help. I corrected the word files and reuploaded them here:
13875
13877
13876

I also noticed I made this mistake so I changed it too.

If v Like "*M*KK*dd*" Then

to

If v Like “*MAN*WOMAN*KID*” Then


The important things are the repeated category such as MAN, WOMAN and KID and unrepeated title (red letters)
The contents I included could become several lines, which should be inserted in one cell among the categories.
I'm quite new to this vba coding so I could not understand what the code you gave me was.
The only thing I know is how to add the number of categories.
Please, let me know how I can use your code with brief explanation if possible. (Where should I copy and paste the code?)
Thank you very much.