PDA

View Full Version : Word VBA - splitting a word document into sections.



Michelle_s
01-29-2011, 07:48 AM
Hi Word-Gods!

I've got a word document that I need to split into new documents. Each page has a specific value on it that I could use to identify it as belonging to a section, i.e. the first 5 pages of the workbook might have the value 2468 on it. the next 3 pages might have the value 3579 on them, the subsequent 8 pages might have the value 6789 on them.

Can anyone show me how to do this in VBA? I was thinking that I could maybe:

1) find the last page that the values appear on
2) add a page break
3) copy the pages of the book into new documents based on locations of page breaks I've added.

But I don't know how to do that, especially the bits with question marks...

Can anyone help me out? I can't work this out and I need it for work :(

Any help would be greatly appreciated. x




Public Sub CreatePageBreaks_and SplitDoc()

Dim Mydoc As Document, NewDoc As Document
Set MyDoc = Word.Documents("C:\myfolder\mydoc.doc")
Selection.HomeKey Unit:=wdStory

MyConst = 2468

'get page that value last appears on
Dim y1 As Integer
Dim r1 As Word.Range
Set r1 = MyDoc.Content
If r1.Find.Execute(FindText:=MyConst, Forward:=False, Wrap:=False) = True Then
y1 = r1.Information(wdActiveEndPageNumber)
End If

'create page breaks based on page that last value is on?????

NewDocName = "c:\myfolder\mydoc_ " & MyConst

'copy pages up until/between page break and copy into a new doc?????

NewDoc.SaveAs filename:=NewDocName
NewDoc.Close
MyDoc.Close wdDoNotSaveChanges

End Sub

gmaxey
01-29-2011, 10:54 AM
You can split documents based on several user selected criteria using my:
http://gregmaxey.mvps.org/Document_Splitter.htm

Perhaps you could use your code to apply seciton breaks were needed and use it.

Tinbendr
01-29-2011, 02:07 PM
Not to diminish Greg's excellent utility, here's my effort.

Sub SplitByArray()
Dim aDoc As Document
Dim dDoc As Document
Dim MyArray As Variant
Dim Rng1 As Range
Dim A As Long
Dim RngStart As Long
'Delimiters seperated by space.
MyArray = Split("123 456 789")
Set aDoc = ActiveDocument
Set Rng1 = aDoc.Range
'Store Rng1 Start
RngStart = Rng1.Start
For A = 0 To UBound(MyArray)
'Find each delimiter
With Rng1.Find
.ClearFormatting
.Text = MyArray(A)
.Forward = True
.MatchWildcards = False
.Wrap = wdFindStop
.Execute
End With

If Rng1.Find.Found Then
'Delimiter found. Create new file.
Set dDoc = Documents.Add
dDoc.Range = aDoc.Range(RngStart, Rng1.End)

dDoc.SaveAs FileName:=MyArray(A), fileformat:=wdFormatDocumentDefault
dDoc.Close
RngStart = Rng1.End + 1
Rng1.End = aDoc.Range.End
End If

Next

'If no delimiter at the end of the doc,
'this section will capture the remaining text.
If RngStart < Rng1.End Then
Set dDoc = Documents.Add
dDoc.Range = aDoc.Range(RngStart, Rng1.End)
dDoc.SaveAs FileName:="Closing " & A + 1, fileformat:=wdFormatDocumentDefault
dDoc.Close
End If
End Sub

David

Michelle_s
01-30-2011, 04:05 AM
Hi Both,

Thanks so much for responding - really appreciated. Unfortunately, for my purposes at least, both approaches had a couple of issues.

gmaxey, your add-in looks to work well, except when I specified my user-defined value to split on, it returned the entire document. :( Perhaps this is because the values in the document are all in frames?

Tinbendr, your macro runs well, except there are some issues with the export of the data. 90% of the way there tho. :)

1) It definitely splits the document but I noticed that one of the newly split documents had 2 sections worth of data in. Also, the section for 4480 didn't include any data, but the other sections did.

2) Would it be possible to keep the formatting from the original document? Perhaps this is why data for section 4480 didn't copy over? Again, most the data is in frames.

I hope this doesn't sound too critical. I'm really grateful for you both taking the time to have a look. If you've got any thoughts on how to get around the 2 issues above that would be perfect.

Thanks so much. :)

Tinbendr
01-30-2011, 05:00 AM
Can you mock up a sample and upload it.

David

Tinbendr
01-30-2011, 05:36 AM
To correct the formatted text try this.

Replace this linedDoc.Range = aDoc.Range(RngStart, Rng1.End)
With this one.
dDoc.Range.FormattedText = aDoc.Range(RngStart, Rng1.End)

Michelle_s
01-30-2011, 05:42 AM
Hi David,

Thanks so much for your help. I've attached a sample.

It's a stripped down version of the proper document - you'll see from the footer that the main doc is about 130 pages long but it follows the same format. I've x'd and y'd out alot of the data but hopefully it will make sense. :) I don't need the first page called conflicts, if that makes sense.

Thanks again, you're really helpful.

Tinbendr
01-31-2011, 06:24 AM
Ok try this.

Sub SplitByArray()
Dim aDoc As Document
Dim dDoc As Document
Dim MyPath As String
Dim MyArray As Variant
Dim Rng1 As Range
Dim Rng2 As Range
Dim A As Long
Dim RngStart As Long
MyPath = "C:\MyFolder\"

'Delimiters seperated by space. Last Element must remain. "!999!
MyArray = Split("4480 2050 2210 2990 !999!")
Set aDoc = ActiveDocument
Set Rng1 = aDoc.Range
'Insert last helper Parameter
aDoc.Range.InsertAfter "!999!"
'Store Rng1 Start
RngStart = Rng1.Start
For A = 0 To UBound(MyArray) - 1
'Find each delimiter
With Rng1.Find
.ClearFormatting
.Text = MyArray(A)
.Forward = True
.MatchWildcards = False
.Wrap = wdFindStop
.Execute
End With

If Rng1.Find.Found Then
'Delimiter found. Create new file.

Set Rng2 = aDoc.Range(Rng1.End + 1, ActiveDocument.Range.End)
If UBound(MyArray) > 1 Then
With Rng2.Find
.ClearFormatting
.Text = MyArray(A + 1)
.Forward = True
.MatchWildcards = False
.Wrap = wdFindStop
.Execute
End With
If Rng2.Find.Found Then
Set dDoc = Documents.Add
dDoc.Range.FormattedText = aDoc.Range(Rng1.Start - 1, Rng2.Start - 2)
dDoc.SaveAs FileName:=MyPath & MyArray(A), fileformat:=wdFormatDocumentDefault
dDoc.Close
Rng1.Start = Rng1.End + 1
Rng1.End = aDoc.Range.End
End If
End If
End If
Next

'If no delimiter at the end of the doc,
'this section will capture the remaining text.
If Rng1.Start < Rng1.End Then
Set dDoc = Documents.Add
dDoc.Range = aDoc.Range(Rng1.Start, aDoc.Range.End)
dDoc.SaveAs FileName:="999- " & A + 1, fileformat:=wdFormatDocumentDefault
dDoc.Close
End If
End Sub

Michelle_s
01-31-2011, 07:49 AM
Hi David,

That's almost perfect, thanks! There's a wee problem though...

The code works absolutely perfectly with the supplied sample, but when I put it in the full document it fails to create new documents. I think this is where it has an issue:

Set Rng2 = aDoc.Range(Rng1.End + 1, ActiveDocument.Range.End)
If UBound(MyArray) > 1 Then
With Rng2.Find
.ClearFormatting
.Text = MyArray(A + 1)
.Forward = True
.MatchWildcards = False
.Wrap = wdFindStop
.Execute
End With
If Rng2.Find.Found Then
Set dDoc = Documents.Add
dDoc.Range.FormattedText = aDoc.Range(Rng1.Start - 1, Rng2.Start - 2)
dDoc.SaveAs FileName:=MyPath & MyArray(A), fileformat:=wdFormatDocumentDefault
dDoc.Close
Rng1.Start = Rng1.End + 1
Rng1.End = aDoc.Range.End
End If


It has no problem finding rng1 - when I hover over rng1 it says rng1.find.found = True

but when I hover over rng2.find.found, it evaluates as false so it doesn't jump into the part where it adds new docs.

Any thoughts? Sorry about this, it works BRILLIANTLY with the sample doc. The sample doc includes additional text (i.e. names etc) but the principle format of the documents should be the same....

Is it worth noting that I've added 97 items to the split array?

Michelle

Tinbendr
01-31-2011, 08:55 AM
Hover over .Text = MyArray(A + 1) and make sure the value is the NEXT element in the array.

If that is correct, manually perform a search to make sure it can find the next element.

Other than that I'm at a loss unless I can see a original (but perhaps smaller) version of the file.

Michelle_s
01-31-2011, 09:21 AM
Hi David,

I hovered over .Text = MyArray(A + 1) and yes, it finds the next item in the array.

For some reason it doesn't seem to pick it up because it returns all of the pages up until the !999! delimiter at the very end of the doc. It's like it's finding the 'section' start point but not finding the 'section' end point (the next item in the array) and going straight to the very end of the doc. I wonder if it's to do with special characters?

Btw, thanks for your message but unfortunately I can't send you a version of the original document - I'm sure you've got nothing but the best of intentions in mind but the data is too sensitive. I would have replied directly but I haven't sent enough messages to send PMs to other users.

Thanks for all your help though, really excellent stuff otherwise.

Michelle

Michelle_s
01-31-2011, 09:44 AM
I think I might have come up with a reason.

The list of items in the split array aren't necessarily in the order that they are in the document.

An obvious solution to this (if I'm correct) would be to match the order of items in the split array to that of the pages in the document. However, I'm not sure if I the order in the document will be static each time the report is ran. Interesting...

Michelle_s
01-31-2011, 09:53 AM
Yes, I think that's fixed it. All seems to be working well now.

David, you're a wonderful, wonderful man in so many ways.

Michelle.