PDA

View Full Version : Solved: data from Word form to excel sheet



Nooro
06-25-2013, 12:43 AM
Hey.
Pretty new to the VBA-coding business, and new on this forum, so excuse me if i've missed giving you some information in this post.

Anyhow, my problem is this. I have a hundred-ish word documents (will provide a sample) with tables, text and checkboxes. I'd like to extract certain data from these sheets (always the same from all word-documents, though these documents may vary a little bit in shape) and build a database with information from each word-document to one row each in the excel sheet. Oh, and I'l like to do the coding in excel.

For instance, in this sample, I'd like to export the taste, temperature, application, and then the result of the sausage and the paper (so that it shows good and unsatisfactory).

I've modified the content of the report, of course, so the real reports will include more parts, etc.

Hope you guys can help me.

SamT
06-25-2013, 08:12 AM
Nooro,

Welcome to VBA Express, you will get an answer here, just not from me. I am not really an MS Word guy.

I id look at your Sample and noticed that it is not set up for easy data extraction. There are many ways to setup a Document for data extraction. Two that I know of are Bookmarks, and in later versions of Word, Smart Tags.

IF none of the Existing Documents are set up, you might want to change the template to use data extraction techniques for future coding ease.

IF you can cleanse an existing Document of all personal and proprietary data, then upload it here, it would be a real help when a real Word guru sees this thread. Your sample just isn't complete enough.

Be sure to tell us exactly what kind of changes in document "Shape" we can expect.

Nooro
06-25-2013, 12:12 PM
hey, thanks for replying. :)

None of the hundreds of documents have been bookmarked (not that im sure exactly how that'd work. About the template, i'll be working on making a new one that'll be easier, but i still need to manage the data from previous reports, so i was hoping someone could help me here.

What I've done with this sample, basically, is removing all of the data, and replacing it other data. The real document will include a larger distribution list, a larger summary of a paragraph or two, that table with data that I only want _some_ of the data from, and then that list of parts where you check if they were good, failed, etc.

All of the documents basically look like that, they just have a bit more data in each table. The number of tables, etc, will still be the same.

If it's still unclear i'll see what I can do about uploading another sample that might be a bit more clear to you, but I cant do that until tomorrow.

SamT
06-25-2013, 02:50 PM
Word data can be extracted by
Folder(Path),Documents(n)Sections(n).Pages(n).Tables(n).Range(n).Rows(n).Co lumns(n).Characters(n)

So you can see why it is neccessary to see an original document layout.

As I understand Bookmarks and Smart Tags, One can extract the data with a direct reference to the bookmark or smart tag, but exactly how :dunno

Why don't you mark this thread Solved with the Thread Tools menu and repost your issues in our Word VBA forum

http://www.vbaexpress.com/forum/forumdisplay.php?f=20

Nooro
06-25-2013, 10:45 PM
Okay, didnt think I had to be that specific with the data, I thought if I got help getting started with a code for the sample I provided I'd be able to adapt it to the real deal.
Well, if I have to be that specific with the sample and with what I need from the document.. I'm having a meeting tomorrow, deciding exactly what data I will need to extract from the word documents, so I'll drop by tomorrow and hopefully be able to give you a better sample as well as letting you know exactly what I need.

Hmm. Is this really a word VBA issue though? I mean, since I have hundreds of word documents that I need to extract data from and insert into an excel sheet, shouldnt the VBA coding be done in excel?
Sorry if im being obnoxious :P

GTO
06-26-2013, 12:31 AM
...I'd like to extract certain data from these sheets (always the same from all word-documents, though these documents may vary a little bit in shape) and build a database with information from each word-document to one row each in the excel sheet. Oh, and I'l like to do the coding in excel...

Hi there,

Could I suggest an alternative? I think Sam brings up an excellent point that the WORD part will likely be the most difficult, if there is any real variation in the current existing documents. That said, certainly it may benefit you to get help with the Excel portion to plunk the bits of data on the worksheet(s).

I would kindly suggest you take a look-see at several of the documents that you already have. Look to see the greatest variations, and make examples of these. Zip the examples, and start a thread in the WORD forum (with a link to this thread and add a link here to the question in the WORD forum) asking for suggestions at capturing the parts of the document wanted. Then maybe after a WORD guru says how to go about that the best way, we could help on the exporting.

Hope that is helpful,

Mark

Nooro
06-26-2013, 01:53 AM
Hey!
Good idea with the variation check. I'll take a look at it tomorrow after the meeting and will hopefully be able to give you examples (with the data removed, as it is classified). I'll start a thread in the Word-forum as well.
Thanks for helping :)

mancubus
06-26-2013, 03:03 AM
hi there.

showing how the excel file will look like the macro is run will help as well.

ie upload a manually prepared excel file :)

SamT
06-26-2013, 03:32 AM
Okay, didnt think I had to be that specific with the data, I thought if I got help getting started with a code for the sample I provided I'd be able to adapt it to the real deal.
Well, if I have to be that specific with the sample and with what I need from the document.. I'm having a meeting tomorrow, deciding exactly what data I will need to extract from the word documents, so I'll drop by tomorrow and hopefully be able to give you a better sample as well as letting you know exactly what I need.

Hmm. Is this really a word VBA issue though? I mean, since I have hundreds of word documents that I need to extract data from and insert into an excel sheet, shouldnt the VBA coding be done in excel?
Sorry if im being obnoxious :P

Obnoxious? Far From it, you're being very nice and considerate. Considerate towards your data and nice to us. :friends:

The issue is not whether it's Word VBA or Excel VBA. A :bug: person could use Power Point VBA to extract the data from Word into Excel. :rotlaugh:

VBA is VBA. The problem is that the Object Model for Word is not conducive to data extraction and one must be very familiar with the OM in order to find the data. It is only possible in your case because all the Documents have the same OM structure.

In Office 2007 and later, MS added some common objects to all the models that make Sharing data between different programs easier.

Many times in Word, instead of using Objects, on must use phrases to find data. Sometimes it is possible to use special words.

An example of a phrase might be "This is the 3rd Qtr report of Department Name of our Division Name..." You can recognize data identifiers because they will be the same in every Document.

Note how I replaced confidential information with bold generic labels. In your example replace all paragraphs with a short Ipsum Lorem and insert bold data identifying phrases and words in the appropriate Ipsum Lorem paragraph. It is important that every paragraph be represented, as Paras are part of the OM and can be iterated.

In the tables in the example, you can leave the cells blank but do put "Label Data" in any extractable column or row labels that might contain or imply confidential info.

If you need to extract an entire sentence, Sentencenize(sic) the Ipsum Lorem and insert the sentence, "This 3rd sentence is data." in the appropriate location. You can do the same to indicate you need to extract an entire paragraph.

The best example you can provide is an actual Document that you have sanitized in the above manner. That way you are preserving the complete structure of the Document.

snb
06-26-2013, 06:41 AM
Sub M_snb()
ReDim sn(0, 4)

With Getobject("G:\OF\sample.docx")
sn(0, 0) = Replace(Replace(.Tables(2).Cell(2, 2).Range, vbCr, ""), Chr(7), "")
sn(0, 1) = Replace(Replace(.Tables(2).Cell(3, 2).Range, vbCr, ""), Chr(7), "")
sn(0, 2) = Replace(Replace(.Tables(2).Cell(4, 2).Range, vbCr, ""), Chr(7), "")
For j = 2 To 5
If .Tables(3).Cell(2, j).Range.FormFields(1).Result = True Then sn(0, 3) = Choose(j, "", "good", "satisfactory", "Unsatisfactory", "failed")
If .Tables(3).Cell(5, j).Range.FormFields(1).Result = True Then sn(0, 4) = Choose(j, "", "good", "satisfactory", "Unsatisfactory", "failed")
Next
.close 0
End with

Thisworkbook.sheets("sheet1").cells(rows.count,1).end(xlup).offset(1).resize(,5)=sn
End Sub

Nooro
06-27-2013, 10:12 PM
Hi all.
I would like to thank all of you for taking the time to help me with my issue. I said earlier that I would have a meeting with my boss about exactly what we wanted out of this project, and I would let you guys know about exactly what we were after.
However, my boss wanted to take this into a different direction, and it doesnt look like I'll have to do the VBA coding _yet_, or if it even will be me who will do it. (dont worry, i didnt get fired or anything like that. will just focus on my other assignmets :P)
So, I'll mark this thread as solved for now. If I get to the point where I'll have to do the coding I'll be back and pester you guys again. Until then, take care :)


tl;dr: Don't need to code anymore. So thanks, and maybe I'll seeya later!