Consulting

Results 1 to 15 of 15

Thread: Split Document with delimiter and Name new file

  1. #1
    VBAX Regular
    Joined
    Nov 2014
    Posts
    7
    Location

    Split Document with delimiter and Name new file

    I have been researching this issue and ran into a wall. I want to split a Word document, based on a delimeter, and then name the new file a specific name as set in the actual delimeter. I found this post which provides a very good start. I will post the code below because I can't post links yet.

    What that code does is finds the delimeter "///" and splits the file into separate files with the name, "Notes 001", "Notes 002", . . .

    That's good, but ideally I would like to have the files names something more descriptive. So what I would like to do is actually name the file something in the delimeter itself. For example, let's say I have one (1) document and want to break it into three (3) separate files. Here is an example:

    This would be my introduction. "///","Introduction"
    This would be my body."///","body"
    This would be my conclusion."///","conclusion"

    When I run the macro as it is currently I would get three (3) files named: Notes 001, Notes 002, and Notes 003. But, ideally, what I would want is 3 separate files named: Introduction, body, and conclusion.

    Here is the current code I am using:


    Sub SplitNotes(delim As String, strFilename As String)
    Dim doc As Document
    Dim arrNotes
    Dim I As Long
    Dim X As Long
    Dim Response As Integer
    arrNotes = Split(ActiveDocument.Range, delim)
    Response = MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections.Do you wish to proceed?", 4)
    If Response = 7 Then Exit Sub
    For I = LBound(arrNotes) To UBound(arrNotes)
    If Trim(arrNotes(I)) <> "" Then
    X = X + 1
    Set doc = Documents.Add
    doc.Range = arrNotes(I)
    doc.SaveAs ThisDocument.Path & "\" & strFilename & Format(X, "000")
    doc.Close True
    End If
    Next I
    End Sub
    Sub test()
    'delimiter & filename
    SplitNotes "///", "Notes "
    End Sub

  2. #2
    Microsoft Word MVP 2003-2009 VBAX Guru gmaxey's Avatar
    Joined
    Sep 2005
    Posts
    3,334
    Location
    Try:

    Sub SplitNotes(strDelimiter As String)
    Dim oDoc As Document
    Dim arrNotes() As String, arrTitles() As String
    Dim lngIndex As Long, lngNum As Long
    Dim strTitle As String
      arrNotes = Split(ActiveDocument.Range, strDelimiter)
      arrTitles = Split("Intro,Body,Conclusion", ",")
      If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
        For lngIndex = LBound(arrNotes) To UBound(arrNotes)
          If Trim(arrNotes(lngIndex)) <> "" Then
            Set oDoc = Documents.Add
            oDoc.Range = arrNotes(lngIndex)
            On Error Resume Next
            strTitle = arrTitles(lngIndex)
            If Err.Number <> 0 Then
              oDoc.SaveAs ThisDocument.Path & "\" & strTitle & Format(lngNum, "000")
            Else
               lngNum = lngNum + 1
              oDoc.SaveAs ThisDocument.Path & "\" & "Misc " & Format(lngNum, "000")
            End If
            oDoc.Close True
          End If
       Next lngIndex
     End Sub
    Sub test()
      SplitNotes "///"
    End Sub
    Greg

    Visit my website: http://gregmaxey.com

  3. #3
    VBAX Regular
    Joined
    Nov 2014
    Posts
    7
    Location
    Quote Originally Posted by gmaxey View Post
    Try:

    Sub SplitNotes(strDelimiter As String)
    Dim oDoc As Document
    Dim arrNotes() As String, arrTitles() As String
    Dim lngIndex As Long, lngNum As Long
    Dim strTitle As String
      arrNotes = Split(ActiveDocument.Range, strDelimiter)
      arrTitles = Split("Intro,Body,Conclusion", ",")
      If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
        For lngIndex = LBound(arrNotes) To UBound(arrNotes)
          If Trim(arrNotes(lngIndex)) <> "" Then
            Set oDoc = Documents.Add
            oDoc.Range = arrNotes(lngIndex)
            On Error Resume Next
            strTitle = arrTitles(lngIndex)
            If Err.Number <> 0 Then
              oDoc.SaveAs ThisDocument.Path & "\" & strTitle & Format(lngNum, "000")
            Else
               lngNum = lngNum + 1
              oDoc.SaveAs ThisDocument.Path & "\" & "Misc " & Format(lngNum, "000")
            End If
            oDoc.Close True
          End If
       Next lngIndex
     End Sub
    Sub test()
      SplitNotes "///"
    End Sub
    Thank you for the quick answer. I see what you did and perhaps I was not particularly clear in my original post. You hardcoded the names into the VBA...my names would change based on the delimeter in the document.

    So, in my original example, your code would work...

    This would be my introduction. "///","Introduction"
    This would be my body."///","body"
    This would be my conclusion."///","conclusion"

    but

    This would be blue. "///","blue"
    This would be green. "///","green"
    This would be red. "///","red"

    would still be the wrong names. What I would be looking for would be "blue.docx", "green.docx", and "red.docx"

    So if I ran the macro on the 1st document it would give me 3 separate files: Intoduction.docx, body.docx, and conclusion.docx
    and if I ran the macro on the 2nd document it would give me 3 separate files: blue.docx, green.docx, and red.docx

    Does this explain a little better?

    The idea would be to get the actual name of the document to be split from the delimeter itself, or in some other fashion right from the document.

  4. #4
    Microsoft Word MVP 2003-2009 VBAX Guru gmaxey's Avatar
    Joined
    Sep 2005
    Posts
    3,334
    Location
    Thr33_d

    In both examples, the name of the file you wanted to use was the last word of the file text? Is that always the case or are you wanting to define the file name of each part in the calling procedure i.e, Sub Test?

    The delimiter in the Split function has to be an entity. You can't have more than one delimiter.
    Greg

    Visit my website: http://gregmaxey.com

  5. #5
    VBAX Regular
    Joined
    Nov 2014
    Posts
    7
    Location
    Quote Originally Posted by gmaxey View Post
    Thr33_d

    In both examples, the name of the file you wanted to use was the last word of the file text? Is that always the case or are you wanting to define the file name of each part in the calling procedure i.e, Sub Test?

    The delimiter in the Split function has to be an entity. You can't have more than one delimiter.
    Let me provide a few examples:

    1) I have a .docm file named "MasterFile.docm" it looks like this:

    This would be my introduction. "///","Introduction"
    This would be my body."///","body"
    This would be my conclusion."///","conclusion

    What I am looking for is a macro that will split the MasterFile.docm into three (3) separate files as follows:

    File "Introduction.docx" would consist of a single document containing:
    This would be my introduction.

    File "body.docx" would consist of a single document containing:
    This would be my body.

    File "conclusion.docx" would consist of a single document containing:
    This would be my conclusion.

    The goal is to have a global type macro that would split the document at the "///" delimiter and name the document whatever is designated in the actual document, NOT the macro.

    So if I were to use: "///","Hawaii" in the actual .docm file the new file would be created at the "///" and would be named "Hawaii.docx" I guess the way to look at it is: "///","{CreatedFileName}" would be the line in the original document where {CreatedFileName} is what the new split file will be named.

  6. #6
    Microsoft Word MVP 2003-2009 VBAX Guru gmaxey's Avatar
    Joined
    Sep 2005
    Posts
    3,334
    Location
    Like I said, the delimiter uses in the split function has to be a single entity e.g., "///"

    If you use a format like this:



    This is the introduction. Introduction///
    This is the body. Body///
    This is the conclusion. Conclusion///

    Then I think this will work:


    Sub test()
      SplitNotes "///"
    End Sub
    Sub SplitNotes(strDelimiter As String)
    Dim oDoc As Document
    Dim arrNotes() As String, arrTitles() As String
    Dim lngIndex As Long, lngNum As Long
    Dim strTitle As String
      arrNotes = Split(ActiveDocument.Range, strDelimiter)
      For lngIndex = 0 To UBound(arrNotes)
        ReDim Preserve arrTitles(lngIndex)
        arrTitles(lngIndex) = Right(arrNotes(lngIndex), Len(arrNotes(lngIndex)) - InStrRev(arrNotes(lngIndex), " "))
      Next lngIndex
        
      If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
        For lngIndex = LBound(arrNotes) To UBound(arrNotes)
          If Trim(arrNotes(lngIndex)) <> "" Then
            Set oDoc = Documents.Add
            oDoc.Range = arrNotes(lngIndex)
            On Error Resume Next
            strTitle = arrTitles(lngIndex)
            If Err.Number <> 0 Then
              oDoc.SaveAs ThisDocument.Path & "\" & strTitle
            Else
               lngNum = lngNum + 1
              oDoc.SaveAs ThisDocument.Path & "\" & "Misc " & Format(lngNum, "000")
            End If
            oDoc.Close True
          End If
       Next lngIndex
     End Sub
    Greg

    Visit my website: http://gregmaxey.com

  7. #7
    VBAX Regular
    Joined
    Nov 2014
    Posts
    7
    Location
    Quote Originally Posted by gmaxey View Post
    Like I said, the delimiter uses in the split function has to be a single entity e.g., "///"

    If you use a format like this:



    This is the introduction. Introduction///
    This is the body. Body///
    This is the conclusion. Conclusion///

    Then I think this will work:


    Sub test()
      SplitNotes "///"
    End Sub
    Sub SplitNotes(strDelimiter As String)
    Dim oDoc As Document
    Dim arrNotes() As String, arrTitles() As String
    Dim lngIndex As Long, lngNum As Long
    Dim strTitle As String
      arrNotes = Split(ActiveDocument.Range, strDelimiter)
      For lngIndex = 0 To UBound(arrNotes)
        ReDim Preserve arrTitles(lngIndex)
        arrTitles(lngIndex) = Right(arrNotes(lngIndex), Len(arrNotes(lngIndex)) - InStrRev(arrNotes(lngIndex), " "))
      Next lngIndex
        
      If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
        For lngIndex = LBound(arrNotes) To UBound(arrNotes)
          If Trim(arrNotes(lngIndex)) <> "" Then
            Set oDoc = Documents.Add
            oDoc.Range = arrNotes(lngIndex)
            On Error Resume Next
            strTitle = arrTitles(lngIndex)
            If Err.Number <> 0 Then
              oDoc.SaveAs ThisDocument.Path & "\" & strTitle
            Else
               lngNum = lngNum + 1
              oDoc.SaveAs ThisDocument.Path & "\" & "Misc " & Format(lngNum, "000")
            End If
            oDoc.Close True
          End If
       Next lngIndex
     End Sub
    When I run this code, every document is named, "Misc 001" "Misc 002" "Misc 003" etc. The code you provided works to split the documents into separate documents, but the naming is the problem. Naming the file Misc 001 doesn't work because I have no way of knowing which file is which.

    I really appreciate you help with this...thank you.

  8. #8
    Microsoft Word MVP 2003-2009 VBAX Guru gmaxey's Avatar
    Joined
    Sep 2005
    Posts
    3,334
    Location
    My error comparison should have been = not <> Try:

    Sub test()
      SplitNotes "///"
    End Sub
    Sub SplitNotes(strDelimiter As String)
    Dim oDoc As Document
    Dim arrNotes() As String, arrTitles() As String
    Dim lngIndex As Long, lngNum As Long
    Dim strTitle As String
      Dim oRng As Word.Range
      arrNotes = Split(ActiveDocument.Range, strDelimiter)
      For lngIndex = 0 To UBound(arrNotes)
        ReDim Preserve arrTitles(lngIndex)
        arrTitles(lngIndex) = Right(arrNotes(lngIndex), Len(arrNotes(lngIndex)) - InStrRev(arrNotes(lngIndex), " "))
      Next lngIndex
        
      If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
        For lngIndex = LBound(arrNotes) To UBound(arrNotes)
          If Trim(arrNotes(lngIndex)) <> "" Then
            Set oDoc = Documents.Add
            oDoc.Range = arrNotes(lngIndex)
            On Error Resume Next
            strTitle = arrTitles(lngIndex)
            If Err.Number = 0 Then
              Set oRng = ActiveDocument.Range
              oRng.Collapse wdCollapseEnd
              oRng.MoveStart wdWord, -1
              oRng.Delete
              oDoc.SaveAs ThisDocument.Path & "\" & strTitle
            Else
               lngNum = lngNum + 1
              oDoc.SaveAs ThisDocument.Path & "\" & "Misc " & Format(lngNum, "000")
            End If
            oDoc.Close True
          End If
       Next lngIndex
     End Sub
    Greg

    Visit my website: http://gregmaxey.com

  9. #9
    VBAX Regular
    Joined
    Nov 2014
    Posts
    7
    Location
    EXCELLENT! Thank you...just did a little tweaking to get it to work. Only issue now is that it want's to save a "blank" file as the last document...and since it doesn't have a delimeter it prompts for the file to be named and saved. No big deal, I just have to debug a bit.

    But, THANK YOU!

    Quote Originally Posted by gmaxey View Post
    My error comparison should have been = not <> Try:

    Sub test()
      SplitNotes "///"
    End Sub
    Sub SplitNotes(strDelimiter As String)
    Dim oDoc As Document
    Dim arrNotes() As String, arrTitles() As String
    Dim lngIndex As Long, lngNum As Long
    Dim strTitle As String
      Dim oRng As Word.Range
      arrNotes = Split(ActiveDocument.Range, strDelimiter)
      For lngIndex = 0 To UBound(arrNotes)
        ReDim Preserve arrTitles(lngIndex)
        arrTitles(lngIndex) = Right(arrNotes(lngIndex), Len(arrNotes(lngIndex)) - InStrRev(arrNotes(lngIndex), " "))
      Next lngIndex
        
      If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
        For lngIndex = LBound(arrNotes) To UBound(arrNotes)
          If Trim(arrNotes(lngIndex)) <> "" Then
            Set oDoc = Documents.Add
            oDoc.Range = arrNotes(lngIndex)
            On Error Resume Next
            strTitle = arrTitles(lngIndex)
            If Err.Number = 0 Then
              Set oRng = ActiveDocument.Range
              oRng.Collapse wdCollapseEnd
              oRng.MoveStart wdWord, -1
              oRng.Delete
              oDoc.SaveAs ThisDocument.Path & "\" & strTitle
            Else
               lngNum = lngNum + 1
              oDoc.SaveAs ThisDocument.Path & "\" & "Misc " & Format(lngNum, "000")
            End If
            oDoc.Close True
          End If
       Next lngIndex
     End Sub

  10. #10
    Microsoft Word MVP 2003-2009 VBAX Guru gmaxey's Avatar
    Joined
    Sep 2005
    Posts
    3,334
    Location
    Good. Glad I could help and sorry for being so slow about grasping what you were after.
    Greg

    Visit my website: http://gregmaxey.com

  11. #11
    VBAX Regular
    Joined
    Nov 2014
    Posts
    7
    Location
    Quote Originally Posted by gmaxey View Post
    Good. Glad I could help and sorry for being so slow about grasping what you were after.
    RRRGGGHHH!!! Not sure if I should start a new thread or continue on this one...BUT...

    I have the documents splitting correctly and refined the VBA a bit, but now when the document is split, the new/created document doesn't have any of the formatting in the original document. Sigh...

    For example, if the original document has:
    This is centered
    This is bold
    This is underlined


    the new/split document will have the text, but none of the formatting. It won't be centered, bold or underlined.

    I've spent hours researching and testing but just cannot seem to get the formatting to transfer to the new/created document.

  12. #12
    Microsoft Word MVP 2003-2009 VBAX Guru gmaxey's Avatar
    Joined
    Sep 2005
    Posts
    3,334
    Location
    Thr33_d

    Do you have a name? Referring to someone as Th33_d doesn't seem natural ;-).

    The reason you are losing the format is because you are putting the text of your document sections into a string array. String arrays don't hold formatting.
    What you need to do is copy the text and paste it into the new document. While I can't completely write your code for you, this may help. It is taken from the document splitter add-in on my site:

    Case Me.OptionUD
        Selection.HomeKey Unit:=wdStory
        Set oRng1 = ActiveDocument.Range
        Set oRng2 = ActiveDocument.Range
        i = 1
        While i <= lngParts
          With oRng1.Find
            .Text = pStr
            If .Execute Then
              If i = 1 Then
                Set oRng3 = oRng1
                oRng3.End = oRng1.Start
                oRng3.Start = ActiveDocument.Range.Start
              Else
                oRng2.Start = oRng1.End
                With oRng2.Find
                  .Text = pStr
                  If .Execute Then
                    Set oRng3 = oRng1
                    oRng3.Start = oRng1.End '+ 1
                    oRng3.End = oRng2.Start
                  Else
                    oRng3.Start = oRng1.End '+ 1
                    oRng3.End = ActiveDocument.Range.End
                  End If
                End With
              End If
              oRng3.Copy
              oRng1.Collapse wdCollapseEnd
              oRng2.Collapse wdCollapseEnd
            End If
          End With
          Set oDoc = Documents.Add(ActiveDocument.AttachedTemplate.FullName)
          oDoc.Content.Delete 'Strip template boiler plate text
          With Selection
            .PasteAndFormat (wdFormatOriginalFormatting)
            .EndKey Unit:=wdStory
            .MoveLeft Unit:=wdCharacter, Count:=1
            .Delete Unit:=wdCharacter, Count:=1
          End With
          oDoc.UpdateStylesOnOpen = False
          oDoc.SaveAs FileName:=pPath & pFileName & Format(i, "000") & pExt, FileFormat:=lngType
          oDoc.Close wdDoNotSaveChanges
          i = i + 1
        Wend
    Greg

    Visit my website: http://gregmaxey.com

  13. #13
    VBAX Regular
    Joined
    Nov 2014
    Posts
    7
    Location
    Hah...Thr33_d is D.D. Diaz or DeeDee as my friends call me.

    I had no idea about the string arrays...even after you told me about the formatting issue, I searched and still didn't find much. I will get to it and see if I can get the formatting to transfer.

    Again, thank you so much for sharing your knowledge and experience.

  14. #14
    Microsoft Word MVP 2003-2009 VBAX Guru gmaxey's Avatar
    Joined
    Sep 2005
    Posts
    3,334
    Location
    Diaz,

    I'll let you choose your friends :-),

    I was bored and thought I would try to work this out.

    Using this format:

    This is the beginning. Beginning///
    This is the body. Body///
    This is the end. End

    The following should work without extra empty files and copy formatting:

    Sub test()
      SplitNotes "///"
    End Sub
    Sub SplitNotes(strDelimiter As String)
    Dim oThisDoc As Document
    Dim oDoc As Document
    Dim arrNotes() As String, arrTitles() As String
    Dim lngIndex As Long, lngNum As Long, lngStart As Long
    Dim strTitle As String
    Dim oRng As Word.Range
      Set oThisDoc = ActiveDocument
      arrNotes = Split(ActiveDocument.Range, strDelimiter)
      For lngIndex = 0 To UBound(arrNotes)
        ReDim Preserve arrTitles(lngIndex)
        arrTitles(lngIndex) = Right(arrNotes(lngIndex), Len(arrNotes(lngIndex)) - InStrRev(arrNotes(lngIndex), " ")) & "///"
      Next lngIndex
      If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
      For lngIndex = LBound(arrTitles) To UBound(arrTitles)
        Select Case True
          Case lngIndex = 0
            Set oRng = oThisDoc.Range
            With oRng.Find
              .Text = arrTitles(lngIndex)
              If .Execute Then
                oRng.Start = oThisDoc.Range.Start
                oRng.Select
                lngStart = oRng.End
                For lngNum = 1 To Len(arrTitles(lngIndex)) + 1
                  oRng.End = oRng.End - 1
                Next
              End If
            End With
          Case lngIndex = UBound(arrTitles)
            Set oRng = oThisDoc.Range
            oRng.Start = lngStart
            lngStart = oRng.End
            For lngNum = 1 To Len(arrTitles(lngIndex)) - 2
              oRng.End = oRng.End - 1
            Next
            oRng.Select
            If oRng.Characters.First = vbCr Then oRng.Start = oRng.Start + 1
            oRng.Select
          Case Else
            Set oRng = oThisDoc.Range
            oRng.Start = lngStart
            With oRng.Find
              .Text = arrTitles(lngIndex)
              If .Execute Then
                oRng.Start = lngStart
                lngStart = oRng.End
                For lngNum = 1 To Len(arrTitles(lngIndex)) + 1
                  oRng.End = oRng.End - 1
                Next
                If oRng.Characters.First = vbCr Then oRng.Start = oRng.Start + 1
              End If
            End With
        End Select
        oRng.Copy
        Set oDoc = Documents.Add(ActiveDocument.AttachedTemplate.FullName)
        oDoc.Content.Delete 'Strip template boiler plate text
        oDoc.Range.PasteAndFormat (wdFormatOriginalFormatting)
        strTitle = Replace(arrTitles(lngIndex), "///", "")
        strTitle = Replace(strTitle, vbCr, "")
        On Error GoTo Err_Save
        oDoc.SaveAs ThisDocument.Path & "\" & strTitle
        oDoc.Close True
      Next lngIndex
    lbl_Exit:
      Exit Sub
    Err_Save:
      MsgBox Err.Number & " " & Err.Description
      Resume lbl_Exit
    End Sub
    Greg

    Visit my website: http://gregmaxey.com

  15. #15
    VBAX Regular
    Joined
    Aug 2018
    Posts
    8
    Location
    Greetings. I'm using the above code :

    Sub test()   SplitNotes "///"
    End Sub
    Sub SplitNotes(strDelimiter As String)
    Dim oDoc As Document
    Dim arrNotes() As String, arrTitles() As String
    Dim lngIndex As Long, lngNum As Long
    Dim strTitle As String
      arrNotes = Split(ActiveDocument.Range, strDelimiter)
      For lngIndex = 0 To UBound(arrNotes)
        ReDim Preserve arrTitles(lngIndex)
        arrTitles(lngIndex) = Right(arrNotes(lngIndex), Len(arrNotes(lngIndex)) - InStrRev(arrNotes(lngIndex), " "))
      Next lngIndex
        
      If MsgBox("This will split the document into " & UBound(arrNotes) + 1 & " sections. Do you wish to proceed?", vbYesNo) = vbNo Then Exit Sub
        For lngIndex = LBound(arrNotes) To UBound(arrNotes)
          If Trim(arrNotes(lngIndex)) <> "" Then
            Set oDoc = Documents.Add
            oDoc.Range = arrNotes(lngIndex)
            On Error Resume Next
            strTitle = arrTitles(lngIndex)
            If Err.Number <> 0 Then
              oDoc.SaveAs ThisDocument.Path & "\" & strTitle
            Else
               lngNum = lngNum + 1
              oDoc.SaveAs ThisDocument.Path & "\" & "Misc " & Format(lngNum, "000")
            End If
            oDoc.Close True
          End If
       Next lngIndex
     End Sub
    It works well, prompting me for split file names. But, as mentioned above, it doesn't preserve formatting.

    To address this issue, another, final code was provided. This lengthier code, unfortunately, doesn't prompt for file names (in fact, running it, I don't even know where the split files landed).

    It would be great to have both : a code that prompts for split file names and preserve formatting.

    Thank you in advance,

    Jean

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •