Need help splitting Word document



turtledriver
06-22-2011, 01:59 PM
I'm using Word 2007. I need help splitting a large Word document into smaller files. While I've found variations on this board, I haven't found a solution that meets my exact needs. In my large document, I want a macro to search for 4 digits and perform a split ONE LINE before those 4 digits. For example, the first line would be "John Doe" and the second line would be "0034". I want the macro to search for the 0034 and split there, but include John Doe in the split. Then I'd like to name the separate files by the 4 digits. And of course the macro would continue to loop through the entire document until finished. Is this possible?

Frosty
06-24-2011, 11:15 AM
Yes, very possible. Why don't you show us how far you've gotten on your own? If you've found variations on this board, you should have gotten somewhere. Also, if you've cross-posted this elsewhere, please indicate you've done that.

In general, when approaching this kind of thing... forget the looping part-- that's pretty easy. Just record a macro which does everything you want it to do *except* for doing it a bunch of times. Then attempt to clean up the macro as best you can (which will also serve to help you understand what's happening), and then post your question and include the macro with the VBA tags.

What you're describing is finding some text, and then expanding the range a bit when you find the text, and then creating new files based on that. Not terribly difficult, but you're going to need to do some of the legwork to help us help you.

turtledriver
06-24-2011, 08:14 PM
Thanks for the reply, Frosty. I'm glad to hear it's a possibility. Basically, I found this code on this thread:
vbaexpress.com/kb/getarticle.php?kb_id=922

Here is the code (it goes in a standard module):
Option Explicit

' Splits the active document on a literal delimiter and saves each piece as its own file.
Sub SplitNotes(delim As String, strFilename As String)
    Dim doc As Document
    Dim arrNotes As Variant
    Dim I As Long
    Dim X As Long
    Dim Response As Integer

    arrNotes = Split(ActiveDocument.Range.Text, delim)

    Response = MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo)
    If Response = vbNo Then Exit Sub
    For I = LBound(arrNotes) To UBound(arrNotes)
        ' Skip pieces that contain only spaces
        If Trim(arrNotes(I)) <> "" Then
            X = X + 1
            Set doc = Documents.Add
            doc.Range.Text = arrNotes(I)
            ' Saved in the folder of the file that holds this macro, e.g. "Notes 001"
            doc.SaveAs ThisDocument.Path & "\" & strFilename & Format(X, "000")
            doc.Close True
        End If
    Next I
End Sub


Sub test()
    ' delimiter & filename
    SplitNotes "///", "Notes "
End Sub

It almost does everything I want. However, it's splitting based on a hard coded delimiter and naming the saved/split individual files based on the hard code (Notes 001, Notes 002, etc). So if I could figure out a way to code it to look for 4 numbers in a row and split there BUT also include the line directly above those 4 numbers in the split (which is a first and last name), AND use that 4 digit number as the saved file name, that would be perfect. I'm just not sure how to modify the code to make this happen.

Any help would be greatly appreciated!

Thanks again!

Frosty
06-24-2011, 09:29 PM
That code won't work for you, as it revolves around the Split function, rather than the Find function.
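To illustrate the point, here is a minimal sketch (the procedure name and the "Jane Roe" / "0035" sample text are made up for illustration). Split only breaks text on a fixed delimiter string; it has no way to express "any four digits", so text without the literal "///" in it comes back as a single piece:

```vba
Sub ShowWhySplitFallsShort()
    Dim sampleText As String
    Dim parts() As String

    ' Hypothetical sample: names and 4-digit numbers on separate lines
    sampleText = "John Doe" & vbCr & "0034" & vbCr & "Jane Roe" & vbCr & "0035"

    ' Split looks for the literal delimiter only; it cannot match a pattern
    parts = Split(sampleText, "///")

    ' Prints 1: the delimiter never occurs, so nothing was split
    Debug.Print UBound(parts) + 1
End Sub
```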

You'll need a couple more posts to be able to attach a document. In the meantime, why don't you record a macro which finds your number pattern? Look up wildcard searching (which will help you perfect your recorded macro) and working with ranges, and describe what you mean by a line (is it a paragraph, automatically wrapped text, etc.) when you say "line above."

There are other threads on this board which use .Find to locate areas of the document in order to split the document up, but the methodology above is barking up the wrong tree for your purposes.

When I'm back at a computer in a day or two, I can help move you along further, but see how far you can get.

Eventually, you'll have two functions: correctly identifying a range (and adjusting it to include the name), and then saving that range as a new document with a particular name.
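As a rough sketch of that two-part shape only (not a finished solution): the names ExpandToIncludeName and SaveRangeAsDocument are hypothetical, and the code assumes the name sits in the paragraph directly above the digits. It does not decide where each section ends, and the caller still has to supply the found digits and a full file path.

```vba
Option Explicit

' Hypothetical helper: given a range covering the four digits, return a range
' covering the digits' paragraph plus the paragraph directly above it
' (assumed to hold the name).
Function ExpandToIncludeName(ByVal rngDigits As Range) As Range
    Dim rng As Range

    ' Start from the whole paragraph that contains the digits
    Set rng = rngDigits.Paragraphs(1).Range

    ' Pull the start back by one paragraph; at the top of the
    ' document there is nothing above, so the start stays put
    rng.MoveStart Unit:=wdParagraph, Count:=-1

    Set ExpandToIncludeName = rng
End Function

' Hypothetical helper: copy a range (with formatting) into a new
' document and save it under the given full path and file name.
Sub SaveRangeAsDocument(ByVal rngSource As Range, ByVal strFullName As String)
    Dim docNew As Document

    Set docNew = Documents.Add
    docNew.Range.FormattedText = rngSource.FormattedText
    docNew.SaveAs FileName:=strFullName
    docNew.Close SaveChanges:=wdDoNotSaveChanges
End Sub
```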

turtledriver
06-27-2011, 04:57 AM
Thanks Frosty. I've been trying to figure out a solution based upon your suggestion but so far it's been to no avail. I'm not sure how to incorporate my macro to perform the operation. I'll keep trying.

Frosty
06-27-2011, 09:33 AM
I don't know what you mean by "incorporate my macro" -- unless you're talking about the code you've posted in this thread. That code won't work for you *at all* because it uses a methodology completely wrong for your needs.

The basic premise of learning how to do this (for free) is to deepen your understanding of Word's regular functions before you find the need to "code it up"-- in this case, you need to learn about the Find function. Ultimately, I think you're going to need to post a document with all sensitive stuff removed from it. However, in the meantime, the following may help you:

This thread was about taking an existing document, and splitting it into multiple documents based on criteria. http://www.vbaexpress.com/forum/showthread.php?t=36264

At its core, however, it all revolves around the following steps:

1. In word, just perform a normal search (CTRL+F).
2. Click the MORE button
3. Click the Special button and choose "Any Digit" (four times)
4. Click Find Next

Does that work? Do you find stuff in addition to your 4 digits? If you do (which seems highly possible, as you will also find the "2011" in June 27, 2011 with that limited search), then you are going to need to find a bit of a more advanced search.
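For comparison, the same "Any Digit" search can be driven from VBA. This is a minimal sketch, assuming it runs against the active document (the procedure name is made up); it lists every hit in the Immediate window, so stray matches such as years show up right away:

```vba
Sub ListFourDigitHits()
    Dim rng As Range

    Set rng = ActiveDocument.Content
    With rng.Find
        .ClearFormatting
        .Text = "^#^#^#^#"          ' "Any Digit" four times
        .MatchWildcards = False     ' ^# is a non-wildcard code
        .Forward = True
        .Wrap = wdFindStop          ' stop at the end of the document

        Do While .Execute
            ' rng now covers the text that was found
            Debug.Print rng.Text
            ' continue searching after this hit
            rng.Collapse wdCollapseEnd
        Loop
    End With
End Sub
```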

Keep exploring the Special button and see if you can't find a search which finds *only* the stuff you want it to find. Maybe ".^p^#^#^#^#^p" will work (period, paragraph mark, 4 digits followed by a paragraph mark).

If you can't find that, read up on wildcard searches at http://word.mvps.org/faqs/General/UsingWildcards.htm

"[0-9]{4}" in the Find What box, with "Use wildcards" checked, is the equivalent of the "^#^#^#^#" you would use when you aren't doing a wildcard search. Depending on what the document really looks like, one of the above methodologies should get you pretty darn close (in that you will *only* find the stuff you want to find in any of the documents you would want to run this macro on).
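A sketch of the wildcard version in VBA, with one extra filter that is purely an assumption about the document: a hit only counts when the four digits are the entire paragraph (so "2011" inside a date is skipped). The procedure name is hypothetical.

```vba
Sub ListStandaloneFourDigitParagraphs()
    Dim rng As Range

    Set rng = ActiveDocument.Content
    With rng.Find
        .ClearFormatting
        .Text = "[0-9]{4}"          ' four digits, wildcard syntax
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop

        Do While .Execute
            ' Keep the hit only if the paragraph is exactly the
            ' four digits plus its paragraph mark (5 characters)
            If Len(rng.Paragraphs(1).Range.Text) = 5 Then
                Debug.Print rng.Text
            End If
            rng.Collapse wdCollapseEnd
        Loop
    End With
End Sub
```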

Once you've figured that out, you can start to record the macro (using as few steps as possible) to find one occurrence of your digits. Then, you can continue recording as you hold down the SHIFT key and press the up arrow. From there, you could continue recording the macro as you hit CTRL+SHIFT+HOME to select everything from just above that item to the very top of the document, copy it, create a new document, paste it into the document, save the document, etc.
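Cleaned up, a recording of those keystrokes might look roughly like the sketch below. This is an approximation rather than actual recorder output, the procedure name is made up, and the save step is left as a comment because the file name and folder still have to come from somewhere.

```vba
Sub RecordedStyleSketch()
    With Selection.Find
        .ClearFormatting
        .Text = "^#^#^#^#"
        .MatchWildcards = False
        .Forward = True
        .Wrap = wdFindStop
    End With

    If Selection.Find.Execute Then
        ' SHIFT + up arrow
        Selection.MoveUp Unit:=wdLine, Count:=1, Extend:=wdExtend
        ' CTRL+SHIFT+HOME
        Selection.HomeKey Unit:=wdStory, Extend:=wdExtend
        Selection.Copy

        ' New document becomes the active one; paste into it
        Documents.Add
        Selection.Paste

        ' Saving would go here, e.g. ActiveDocument.SaveAs with a
        ' path and name of your choosing
    End If
End Sub
```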

This is how you'll learn to be self-sufficient. Then, when you get stuck (and this is, obviously, a very limited way to get working code), you will have specific questions that people can easily answer. As you can see, even using that methodology, the thread I pointed you at took 3 pages and 58 replies. That poster was very interested in learning the process-- but it still takes time.

The problem with just quickly rattling off a "solution" is that you will end up not learning anything, and it will still be wrong (because you haven't given enough criteria for me to provide you a complete solution in a single post).

Hope this helps.