PDA

View Full Version : Split Document with delimiter and Name new file



Thr33_d
11-17-2014, 04:15 PM
I have been researching this issue and ran into a wall. I want to split a Word document, based on a delimeter, and then name the new file a specific name as set in the actual delimeter. I found this post which provides a very good start. I will post the code below because I can't post links yet.

What that code does is finds the delimeter "///" and splits the file into separate files with the name, "Notes 001", "Notes 002", . . .

That's good, but ideally I would like to have the files names something more descriptive. So what I would like to do is actually name the file something in the delimeter itself. For example, let's say I have one (1) document and want to break it into three (3) separate files. Here is an example:

This would be my introduction. "///","Introduction"
This would be my body."///","body"
This would be my conclusion."///","conclusion"

When I run the macro as it is currently I would get three (3) files named: Notes 001, Notes 002, and Notes 003. But, ideally, what I would want is 3 separate files named: Introduction, body, and conclusion.

Here is the current code I am using:



Sub SplitNotes(delim As String, strFilename As String)
Dim doc As Document
Dim arrNotes
Dim I As Long
Dim X As Long
Dim Response As Integer
arrNotes = Split(ActiveDocument.Range, delim)
Response = MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections.Do you wish to proceed?", 4)
If Response = 7 Then Exit Sub
For I = LBound(arrNotes) To UBound(arrNotes)
If Trim(arrNotes(I)) <> "" Then
X = X + 1
Set doc = Documents.Add
doc.Range = arrNotes(I)
doc.SaveAs ThisDocument.Path & "\" & strFilename & Format(X, "000")
doc.Close True
End If
Next I
End Sub
Sub test()
'delimiter & filename
SplitNotes "///", "Notes "
End Sub

gmaxey
11-17-2014, 04:31 PM
Try:


Sub SplitNotes(strDelimiter As String)
Dim oDoc As Document
Dim arrNotes() As String, arrTitles() As String
Dim lngIndex As Long, lngNum As Long
Dim strTitle As String
arrNotes = Split(ActiveDocument.Range, strDelimiter)
arrTitles = Split("Intro,Body,Conclusion", ",")
If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
For lngIndex = LBound(arrNotes) To UBound(arrNotes)
If Trim(arrNotes(lngIndex)) <> "" Then
Set oDoc = Documents.Add
oDoc.Range = arrNotes(lngIndex)
On Error Resume Next
strTitle = arrTitles(lngIndex)
If Err.Number <> 0 Then
oDoc.SaveAs ThisDocument.Path & "\" & strTitle & Format(lngNum, "000")
Else
lngNum = lngNum + 1
oDoc.SaveAs ThisDocument.Path & "\" & "Misc " & Format(lngNum, "000")
End If
oDoc.Close True
End If
Next lngIndex
End Sub
Sub test()
SplitNotes "///"
End Sub

Thr33_d
11-17-2014, 07:28 PM
Try:


Sub SplitNotes(strDelimiter As String)
Dim oDoc As Document
Dim arrNotes() As String, arrTitles() As String
Dim lngIndex As Long, lngNum As Long
Dim strTitle As String
arrNotes = Split(ActiveDocument.Range, strDelimiter)
arrTitles = Split("Intro,Body,Conclusion", ",")
If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
For lngIndex = LBound(arrNotes) To UBound(arrNotes)
If Trim(arrNotes(lngIndex)) <> "" Then
Set oDoc = Documents.Add
oDoc.Range = arrNotes(lngIndex)
On Error Resume Next
strTitle = arrTitles(lngIndex)
If Err.Number <> 0 Then
oDoc.SaveAs ThisDocument.Path & "\" & strTitle & Format(lngNum, "000")
Else
lngNum = lngNum + 1
oDoc.SaveAs ThisDocument.Path & "\" & "Misc " & Format(lngNum, "000")
End If
oDoc.Close True
End If
Next lngIndex
End Sub
Sub test()
SplitNotes "///"
End Sub



Thank you for the quick answer. I see what you did and perhaps I was not particularly clear in my original post. You hardcoded the names into the VBA...my names would change based on the delimeter in the document.

So, in my original example, your code would work...

This would be my introduction. "///","Introduction"
This would be my body."///","body"
This would be my conclusion."///","conclusion"

but

This would be blue. "///","blue"
This would be green. "///","green"
This would be red. "///","red"

would still be the wrong names. What I would be looking for would be "blue.docx", "green.docx", and "red.docx"

So if I ran the macro on the 1st document it would give me 3 separate files: Intoduction.docx, body.docx, and conclusion.docx
and if I ran the macro on the 2nd document it would give me 3 separate files: blue.docx, green.docx, and red.docx

Does this explain a little better?

The idea would be to get the actual name of the document to be split from the delimeter itself, or in some other fashion right from the document.

gmaxey
11-17-2014, 07:53 PM
Thr33_d

In both examples, the name of the file you wanted to use was the last word of the file text? Is that always the case or are you wanting to define the file name of each part in the calling procedure i.e, Sub Test?

The delimiter in the Split function has to be an entity. You can't have more than one delimiter.

Thr33_d
11-18-2014, 07:33 AM
Thr33_d

In both examples, the name of the file you wanted to use was the last word of the file text? Is that always the case or are you wanting to define the file name of each part in the calling procedure i.e, Sub Test?

The delimiter in the Split function has to be an entity. You can't have more than one delimiter.

Let me provide a few examples:

1) I have a .docm file named "MasterFile.docm" it looks like this:

This would be my introduction. "///","Introduction"
This would be my body."///","body"
This would be my conclusion."///","conclusion

What I am looking for is a macro that will split the MasterFile.docm into three (3) separate files as follows:

File "Introduction.docx" would consist of a single document containing:
This would be my introduction.

File "body.docx" would consist of a single document containing:
This would be my body.

File "conclusion.docx" would consist of a single document containing:
This would be my conclusion.

The goal is to have a global type macro that would split the document at the "///" delimiter and name the document whatever is designated in the actual document, NOT the macro.

So if I were to use: "///","Hawaii" in the actual .docm file the new file would be created at the "///" and would be named "Hawaii.docx" I guess the way to look at it is: "///","{CreatedFileName}" would be the line in the original document where {CreatedFileName} is what the new split file will be named.

gmaxey
11-18-2014, 07:52 AM
Like I said, the delimiter uses in the split function has to be a single entity e.g., "///"

If you use a format like this:



This is the introduction. Introduction///
This is the body. Body///
This is the conclusion. Conclusion///

Then I think this will work:



Sub test()
SplitNotes "///"
End Sub
Sub SplitNotes(strDelimiter As String)
Dim oDoc As Document
Dim arrNotes() As String, arrTitles() As String
Dim lngIndex As Long, lngNum As Long
Dim strTitle As String
arrNotes = Split(ActiveDocument.Range, strDelimiter)
For lngIndex = 0 To UBound(arrNotes)
ReDim Preserve arrTitles(lngIndex)
arrTitles(lngIndex) = Right(arrNotes(lngIndex), Len(arrNotes(lngIndex)) - InStrRev(arrNotes(lngIndex), " "))
Next lngIndex

If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
For lngIndex = LBound(arrNotes) To UBound(arrNotes)
If Trim(arrNotes(lngIndex)) <> "" Then
Set oDoc = Documents.Add
oDoc.Range = arrNotes(lngIndex)
On Error Resume Next
strTitle = arrTitles(lngIndex)
If Err.Number <> 0 Then
oDoc.SaveAs ThisDocument.Path & "\" & strTitle
Else
lngNum = lngNum + 1
oDoc.SaveAs ThisDocument.Path & "\" & "Misc " & Format(lngNum, "000")
End If
oDoc.Close True
End If
Next lngIndex
End Sub

Thr33_d
11-18-2014, 12:46 PM
Like I said, the delimiter uses in the split function has to be a single entity e.g., "///"

If you use a format like this:



This is the introduction. Introduction///
This is the body. Body///
This is the conclusion. Conclusion///

Then I think this will work:



Sub test()
SplitNotes "///"
End Sub
Sub SplitNotes(strDelimiter As String)
Dim oDoc As Document
Dim arrNotes() As String, arrTitles() As String
Dim lngIndex As Long, lngNum As Long
Dim strTitle As String
arrNotes = Split(ActiveDocument.Range, strDelimiter)
For lngIndex = 0 To UBound(arrNotes)
ReDim Preserve arrTitles(lngIndex)
arrTitles(lngIndex) = Right(arrNotes(lngIndex), Len(arrNotes(lngIndex)) - InStrRev(arrNotes(lngIndex), " "))
Next lngIndex

If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
For lngIndex = LBound(arrNotes) To UBound(arrNotes)
If Trim(arrNotes(lngIndex)) <> "" Then
Set oDoc = Documents.Add
oDoc.Range = arrNotes(lngIndex)
On Error Resume Next
strTitle = arrTitles(lngIndex)
If Err.Number <> 0 Then
oDoc.SaveAs ThisDocument.Path & "\" & strTitle
Else
lngNum = lngNum + 1
oDoc.SaveAs ThisDocument.Path & "\" & "Misc " & Format(lngNum, "000")
End If
oDoc.Close True
End If
Next lngIndex
End Sub



When I run this code, every document is named, "Misc 001" "Misc 002" "Misc 003" etc. The code you provided works to split the documents into separate documents, but the naming is the problem. Naming the file Misc 001 doesn't work because I have no way of knowing which file is which.

I really appreciate you help with this...thank you.

gmaxey
11-18-2014, 02:25 PM
My error comparison should have been = not <> Try:


Sub test()
SplitNotes "///"
End Sub
Sub SplitNotes(strDelimiter As String)
Dim oDoc As Document
Dim arrNotes() As String, arrTitles() As String
Dim lngIndex As Long, lngNum As Long
Dim strTitle As String
Dim oRng As Word.Range
arrNotes = Split(ActiveDocument.Range, strDelimiter)
For lngIndex = 0 To UBound(arrNotes)
ReDim Preserve arrTitles(lngIndex)
arrTitles(lngIndex) = Right(arrNotes(lngIndex), Len(arrNotes(lngIndex)) - InStrRev(arrNotes(lngIndex), " "))
Next lngIndex

If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
For lngIndex = LBound(arrNotes) To UBound(arrNotes)
If Trim(arrNotes(lngIndex)) <> "" Then
Set oDoc = Documents.Add
oDoc.Range = arrNotes(lngIndex)
On Error Resume Next
strTitle = arrTitles(lngIndex)
If Err.Number = 0 Then
Set oRng = ActiveDocument.Range
oRng.Collapse wdCollapseEnd
oRng.MoveStart wdWord, -1
oRng.Delete
oDoc.SaveAs ThisDocument.Path & "\" & strTitle
Else
lngNum = lngNum + 1
oDoc.SaveAs ThisDocument.Path & "\" & "Misc " & Format(lngNum, "000")
End If
oDoc.Close True
End If
Next lngIndex
End Sub

Thr33_d
11-18-2014, 04:32 PM
EXCELLENT! Thank you...just did a little tweaking to get it to work. Only issue now is that it want's to save a "blank" file as the last document...and since it doesn't have a delimeter it prompts for the file to be named and saved. No big deal, I just have to debug a bit.

But, THANK YOU!


My error comparison should have been = not <> Try:


Sub test()
SplitNotes "///"
End Sub
Sub SplitNotes(strDelimiter As String)
Dim oDoc As Document
Dim arrNotes() As String, arrTitles() As String
Dim lngIndex As Long, lngNum As Long
Dim strTitle As String
Dim oRng As Word.Range
arrNotes = Split(ActiveDocument.Range, strDelimiter)
For lngIndex = 0 To UBound(arrNotes)
ReDim Preserve arrTitles(lngIndex)
arrTitles(lngIndex) = Right(arrNotes(lngIndex), Len(arrNotes(lngIndex)) - InStrRev(arrNotes(lngIndex), " "))
Next lngIndex

If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
For lngIndex = LBound(arrNotes) To UBound(arrNotes)
If Trim(arrNotes(lngIndex)) <> "" Then
Set oDoc = Documents.Add
oDoc.Range = arrNotes(lngIndex)
On Error Resume Next
strTitle = arrTitles(lngIndex)
If Err.Number = 0 Then
Set oRng = ActiveDocument.Range
oRng.Collapse wdCollapseEnd
oRng.MoveStart wdWord, -1
oRng.Delete
oDoc.SaveAs ThisDocument.Path & "\" & strTitle
Else
lngNum = lngNum + 1
oDoc.SaveAs ThisDocument.Path & "\" & "Misc " & Format(lngNum, "000")
End If
oDoc.Close True
End If
Next lngIndex
End Sub

gmaxey
11-18-2014, 06:37 PM
Good. Glad I could help and sorry for being so slow about grasping what you were after.

Thr33_d
11-20-2014, 07:41 AM
Good. Glad I could help and sorry for being so slow about grasping what you were after.

RRRGGGHHH!!! Not sure if I should start a new thread or continue on this one...BUT...

I have the documents splitting correctly and refined the VBA a bit, but now when the document is split, the new/created document doesn't have any of the formatting in the original document. Sigh...

For example, if the original document has:

This is centered

This is bold
This is underlined


the new/split document will have the text, but none of the formatting. It won't be centered, bold or underlined.

I've spent hours researching and testing but just cannot seem to get the formatting to transfer to the new/created document.

gmaxey
11-20-2014, 07:54 AM
Thr33_d

Do you have a name? Referring to someone as Th33_d doesn't seem natural ;-).

The reason you are losing the format is because you are putting the text of your document sections into a string array. String arrays don't hold formatting.
What you need to do is copy the text and paste it into the new document. While I can't completely write your code for you, this may help. It is taken from the document splitter add-in on my site:


Case Me.OptionUD
Selection.HomeKey Unit:=wdStory
Set oRng1 = ActiveDocument.Range
Set oRng2 = ActiveDocument.Range
i = 1
While i <= lngParts
With oRng1.Find
.Text = pStr
If .Execute Then
If i = 1 Then
Set oRng3 = oRng1
oRng3.End = oRng1.Start
oRng3.Start = ActiveDocument.Range.Start
Else
oRng2.Start = oRng1.End
With oRng2.Find
.Text = pStr
If .Execute Then
Set oRng3 = oRng1
oRng3.Start = oRng1.End '+ 1
oRng3.End = oRng2.Start
Else
oRng3.Start = oRng1.End '+ 1
oRng3.End = ActiveDocument.Range.End
End If
End With
End If
oRng3.Copy
oRng1.Collapse wdCollapseEnd
oRng2.Collapse wdCollapseEnd
End If
End With
Set oDoc = Documents.Add(ActiveDocument.AttachedTemplate.FullName)
oDoc.Content.Delete 'Strip template boiler plate text
With Selection
.PasteAndFormat (wdFormatOriginalFormatting)
.EndKey Unit:=wdStory
.MoveLeft Unit:=wdCharacter, Count:=1
.Delete Unit:=wdCharacter, Count:=1
End With
oDoc.UpdateStylesOnOpen = False
oDoc.SaveAs FileName:=pPath & pFileName & Format(i, "000") & pExt, FileFormat:=lngType
oDoc.Close wdDoNotSaveChanges
i = i + 1
Wend

Thr33_d
11-20-2014, 08:33 AM
Hah...Thr33_d is D.D. Diaz or DeeDee as my friends call me.

I had no idea about the string arrays...even after you told me about the formatting issue, I searched and still didn't find much. I will get to it and see if I can get the formatting to transfer.

Again, thank you so much for sharing your knowledge and experience.

gmaxey
11-20-2014, 09:44 AM
Diaz,

I'll let you choose your friends :-),

I was bored and thought I would try to work this out.

Using this format:

This is the beginning. Beginning///
This is the body. Body///
This is the end. End

The following should work without extra empty files and copy formatting:


Sub test()
SplitNotes "///"
End Sub
Sub SplitNotes(strDelimiter As String)
Dim oThisDoc As Document
Dim oDoc As Document
Dim arrNotes() As String, arrTitles() As String
Dim lngIndex As Long, lngNum As Long, lngStart As Long
Dim strTitle As String
Dim oRng As Word.Range
Set oThisDoc = ActiveDocument
arrNotes = Split(ActiveDocument.Range, strDelimiter)
For lngIndex = 0 To UBound(arrNotes)
ReDim Preserve arrTitles(lngIndex)
arrTitles(lngIndex) = Right(arrNotes(lngIndex), Len(arrNotes(lngIndex)) - InStrRev(arrNotes(lngIndex), " ")) & "///"
Next lngIndex
If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
For lngIndex = LBound(arrTitles) To UBound(arrTitles)
Select Case True
Case lngIndex = 0
Set oRng = oThisDoc.Range
With oRng.Find
.Text = arrTitles(lngIndex)
If .Execute Then
oRng.Start = oThisDoc.Range.Start
oRng.Select
lngStart = oRng.End
For lngNum = 1 To Len(arrTitles(lngIndex)) + 1
oRng.End = oRng.End - 1
Next
End If
End With
Case lngIndex = UBound(arrTitles)
Set oRng = oThisDoc.Range
oRng.Start = lngStart
lngStart = oRng.End
For lngNum = 1 To Len(arrTitles(lngIndex)) - 2
oRng.End = oRng.End - 1
Next
oRng.Select
If oRng.Characters.First = vbCr Then oRng.Start = oRng.Start + 1
oRng.Select
Case Else
Set oRng = oThisDoc.Range
oRng.Start = lngStart
With oRng.Find
.Text = arrTitles(lngIndex)
If .Execute Then
oRng.Start = lngStart
lngStart = oRng.End
For lngNum = 1 To Len(arrTitles(lngIndex)) + 1
oRng.End = oRng.End - 1
Next
If oRng.Characters.First = vbCr Then oRng.Start = oRng.Start + 1
End If
End With
End Select
oRng.Copy
Set oDoc = Documents.Add(ActiveDocument.AttachedTemplate.FullName)
oDoc.Content.Delete 'Strip template boiler plate text
oDoc.Range.PasteAndFormat (wdFormatOriginalFormatting)
strTitle = Replace(arrTitles(lngIndex), "///", "")
strTitle = Replace(strTitle, vbCr, "")
On Error GoTo Err_Save
oDoc.SaveAs ThisDocument.Path & "\" & strTitle
oDoc.Close True
Next lngIndex
lbl_Exit:
Exit Sub
Err_Save:
MsgBox Err.Number & " " & Err.Description
Resume lbl_Exit
End Sub

JPh
08-19-2018, 09:17 AM
Greetings. I'm using the above code :



Sub test() SplitNotes "///"
End Sub
Sub SplitNotes(strDelimiter As String)
Dim oDoc As Document
Dim arrNotes() As String, arrTitles() As String
Dim lngIndex As Long, lngNum As Long
Dim strTitle As String
arrNotes = Split(ActiveDocument.Range, strDelimiter)
For lngIndex = 0 To UBound(arrNotes)
ReDim Preserve arrTitles(lngIndex)
arrTitles(lngIndex) = Right(arrNotes(lngIndex), Len(arrNotes(lngIndex)) - InStrRev(arrNotes(lngIndex), " "))
Next lngIndex

If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
For lngIndex = LBound(arrNotes) To UBound(arrNotes)
If Trim(arrNotes(lngIndex)) <> "" Then
Set oDoc = Documents.Add
oDoc.Range = arrNotes(lngIndex)
On Error Resume Next
strTitle = arrTitles(lngIndex)
If Err.Number <> 0 Then
oDoc.SaveAs ThisDocument.Path & "\" & strTitle
Else
lngNum = lngNum + 1
oDoc.SaveAs ThisDocument.Path & "\" & "Misc " & Format(lngNum, "000")
End If
oDoc.Close True
End If
Next lngIndex
End Sub

It works well, prompting me for split file names. But, as mentioned above, it doesn't preserve formatting.

To address this issue, another, final code was provided. This lengthier code, unfortunately, doesn't prompt for file names (in fact, running it, I don't even know where the split files landed).

It would be great to have both : a code that prompts for split file names and preserve formatting.

Thank you in advance,

Jean