View Full Version : Splitting large document
leecable
07-06-2017, 06:20 AM
Good afternoon,
First time poster, and pretty much a newbie using VBA. I have been set a challenge to split a large document into lots of smaller documents by specific parameters.
This is what ive found so far..
Sub SplitNotes(delim As String, strFilename As String)
Dim doc As Document
Dim arrNotes
Dim I As Long
Dim X As Long
Dim Response As Integer
arrNotes = Split(ActiveDocument.Range, delim)
Response = MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections.Do you wish to proceed?", 4)
If Response = 7 Then Exit Sub
For I = LBound(arrNotes) To UBound(arrNotes)
If Trim(arrNotes(I)) <> "" Then
X = X + 1
Set doc = Documents.Add
doc.Range = arrNotes(I)
doc.SaveAs ThisDocument.Path & "\" & strFilename & Format(X, "000")
doc.Close True
End If
Next I
End Sub
Sub test()
'delimiter & filename
SplitNotes "///", "Notes "
End Sub
Now, I can see where I need to put the 'delimiter' information. For the document ill be using ',,,,,' after each block of text that has to be split. The issue ive got is, the document is about 2500 pages long, and its going to take me a fair while to sit and split each thing with my delimiter. Is there a quicker way? The documents vary in length so this is making it a little more tricky!
The other question is, within each specific document there is a title and a description. I want each document to be named with the title and description values, from each specific document.
Whilst i'd be grateful for the answer, an explanation of how you did it, would also be great! Wanting to learn!
And yes, ive tried google.
Thanks folks!
gmaxey
07-06-2017, 09:05 AM
Maybe the document is already delimited by some other means. See: http://gregmaxey.com/word_tip_pages/document_splitter.html
If not then you will have to delimit it. Perhaps you could use find and replace to insert the delimiter
leecable
07-06-2017, 09:55 AM
Your splitter looks glorious Greg! I shall have a peruse and see if it is delimit'd in anyway! Not that i can clearly see! but find and replace could be my friend!
Any idea on the naming of the documents and whether it can be done?
gmayor
07-06-2017, 08:38 PM
Whether you can name the document from the contents depends on whether there is something common to each document that can be readily identified.
Was the large document created from mail merge, as that would make splitting and naming much simpler as mail merge defines each 'document' with a section break, and there is some commonality of layout to aid naming.
Kilroy
07-07-2017, 07:42 AM
This code I pieced together seems to work. It separates based on page breaks. It asks for each file a "save as". I'm sure the professionals here could streamline it.
Sub Split1()
Dim arr() As Variant
Dim i As Byte
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
arr = Array("^b", "^m", "^n")
For i = LBound(arr) To UBound(arr)
With Selection.Find
.Text = arr(i)
.Replacement.Text = "//kilroy//"
.Forward = True
.Wrap = wdFindContinue
End With
Selection.Find.Execute Replace:=wdReplaceAll
Next
Call Split2
End Sub
Sub Split2()
SplitNotes "//kilroy//"
End Sub
Sub SplitNotes(strDelimiter As String)
Dim oDoc As Document
Dim arrNotes() As String, arrTitles() As String
Dim lngIndex As Long, lngNum As Long
Dim strTitle As String
Dim oRng As Word.Range
arrNotes = Split(ActiveDocument.Range, strDelimiter)
For lngIndex = 0 To UBound(arrNotes)
ReDim Preserve arrTitles(lngIndex)
arrTitles(lngIndex) = Right(arrNotes(lngIndex), Len(arrNotes(lngIndex)) - InStrRev(arrNotes(lngIndex), " "))
Next lngIndex
If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
For lngIndex = LBound(arrNotes) To UBound(arrNotes)
If Trim(arrNotes(lngIndex)) <> "" Then
Set oDoc = Documents.Add
oDoc.Range = arrNotes(lngIndex)
On Error Resume Next
strTitle = arrTitles(lngIndex)
If Err.Number = 0 Then
Set oRng = ActiveDocument.Range
oRng.Collapse wdCollapseEnd
oRng.MoveStart wdWord, -1
oRng.Delete
oDoc.SaveAs ThisDocument.Path & "\" & strTitle
Else
lngNum = lngNum + 1
oDoc.SaveAs ThisDocument.Path & "\" & "Misc " & Format(lngNum, "000")
End If
oDoc.Close True
End If
Next lngIndex
End Sub
Kilroy
07-07-2017, 11:29 AM
Guys I have 2 Questions:
1. How could I modify the above code to just automatically save the documents created instead of going through the "save as" dialogue? Is keeping them open an option?
2. Close the original document without saving?
gmayor
07-08-2017, 09:01 PM
The code does not display the saveas diallog. It uses the saveas VBA function to save the document
To close the original document name the document e.g as oSource then close oSource e.g. as follows
I have not tested your code!
Sub SplitNotes(strDelimiter As String)
Dim oDoc As Document
Dim oSource As Document
Dim arrNotes() As String, arrTitles() As String
Dim lngIndex As Long, lngNum As Long
Dim strTitle As String
Dim oRng As Word.Range
Set oSource = ActiveDocument
arrNotes = Split(oSource.Range, strDelimiter)
For lngIndex = 0 To UBound(arrNotes)
ReDim Preserve arrTitles(lngIndex)
arrTitles(lngIndex) = Right(arrNotes(lngIndex), Len(arrNotes(lngIndex)) - InStrRev(arrNotes(lngIndex), " "))
Next lngIndex
If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
For lngIndex = LBound(arrNotes) To UBound(arrNotes)
If Trim(arrNotes(lngIndex)) <> "" Then
Set oDoc = Documents.Add
oDoc.Range = arrNotes(lngIndex)
On Error Resume Next
strTitle = arrTitles(lngIndex)
If Err.Number = 0 Then
Set oRng = oSource.Range
oRng.Collapse wdCollapseEnd
oRng.MoveStart wdWord, -1
oRng.Delete
oDoc.SaveAs ThisDocument.Path & "\" & strTitle
Else
lngNum = lngNum + 1
oDoc.SaveAs ThisDocument.Path & "\" & "Misc " & Format(lngNum, "000")
End If
oDoc.Close 0
End If
Next lngIndex
oSource.Close 0
End Sub
leecable
07-10-2017, 04:21 AM
Whether you can name the document from the contents depends on whether there is something common to each document that can be readily identified.
Was the large document created from mail merge, as that would make splitting and naming much simpler as mail merge defines each 'document' with a section break, and there is some commonality of layout to aid naming.
Managed to find and replace to insert a delimiter at the start of each new document, which is sorted. There is something common about all of the documents, they all contain a section of text called 'Title:' Which is identical in every document. Obviously the text after the heading is different in each. So, I would ideally like that to be the title of each saved document.
So I want it to split at the part where I have inserted the delimiter, and now pull the name of the title as the saved file name. Its got to be possible.
I have had a look into the code of your splitter -
With oDoc
.UpdateStylesOnOpen = False
NewDocPageSetUp oDoc
oDoc.SaveAs FileName:=pPath & pFileName & Format(i, "000") & pExt, FileFormat:=lngType
.Close wdDoNotSaveChanges
End With
So I am guessing its to do with this part.
leecable
07-10-2017, 04:54 AM
I have also managed to find this delightful piece of code
Sub GetRenameFiles()
Dim fd As FileDialog
Dim strFolder As String
Dim strFile As String
Dim aDoc As Document
Dim rngNewName As Range
Set fd = Application.FileDialog(msoFileDialogFolderPicker)
With fd
.Title = "Select the folder that contains the files."
If .Show = -1 Then
strFolder = .SelectedItems(1) & "\"
Else
MsgBox "You did not select a folder."
Exit Sub
End If
End With
strFile = Dir$(strFolder & "*.doc*")
While strFile <> ""
Set aDoc = Documents.Open(strFolder & strFile)
Set rngNewName = aDoc.Paragraphs(1).Range
rngNewName.MoveEnd wdCharacter, -1
aDoc.SaveAs2 rngNewName.Text
aDoc.Close
strFile = Dir$()
Wend
End Sub
Which will basically name them the first line. Can I navigate to a specific line? As the Title is always on the same line on each page?
leecable
07-10-2017, 05:55 AM
Cancel that, after scrolling down the document, it turns out that 'Title:' is not always in the same place within each individual document.
I'm also having issues with gregs splitter, in that its giving me 86 parts...of a document that's nearly 10000 pages long, this isn't correct :(
Kilroy
07-10-2017, 07:19 AM
What is separating your documents? Give us a sample of a few pages. delete the content if need be. Just need the beginning and end of a few sections.
leecable
07-10-2017, 07:31 AM
Ive got the documents splits, its just now the Save issue. I now have 800 files starting with notes 001, and so on.
Kilroy
07-10-2017, 08:24 AM
Ok that's great. What is the code you used? I would be interested in seeing it.
leecable
07-10-2017, 08:31 AM
So I used the original code I found,
Sub SplitNotes(delim As String, strFilename As String)
Dim doc As Document
Dim arrNotes
Dim I As Long
Dim X As Long
Dim Response As Integer
arrNotes = Split(ActiveDocument.Range, delim)
Response = MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections.Do you wish to proceed?", 4)
If Response = 7 Then Exit Sub
For I = LBound(arrNotes) To UBound(arrNotes)
If Trim(arrNotes(I)) <> "" Then
X = X + 1
Set doc = Documents.Add
doc.Range = arrNotes(I)
doc.SaveAs ThisDocument.Path & "\" & strFilename & Format(X, "000")
doc.Close True
End If
Next I
End Sub
Sub test()
'delimiter & filename
SplitNotes "///", "Notes "
End Sub
Used the find and replace tool within word to insert my delimiter (delimit) into the document at the appropriate places. Which was swift. This then worked fine in splitting it. The issue is, I now have 800+ documents names Notes001. :(
Kilroy
07-10-2017, 08:47 AM
what was the common separator for each doc?How did you put in the delimit?
Powered by vBulletin® Version 4.2.5 Copyright © 2025 vBulletin Solutions Inc. All rights reserved.