PDA

View Full Version : [SOLVED:] Split Text File Into Multiple Text Files keeping the header row



dodonohoe
09-20-2023, 08:41 AM
Hi everyone, the following Macro works really well. You select the text file you want to split into smaller files, you then tell it how many rows max per new file you want, 10,000 for example. I would like for the macro to also keep the headings (first row) at the start of every new file created. Any help would be appreciated.


Public Sub Split_Text_File()


Dim FSO As Object
Dim TSRead As Object, TSWrite As Object
Dim inputFile As Variant, outputFile As String
Dim part As Long, i As Long, n As Long, p As Long
Dim maxRows As Long

inputFile = Application.GetOpenFilename(Title:="Select a text file to be split into separate parts")
If inputFile = False Then Exit Sub


maxRows = Application.InputBox("Max number of lines/rows?", Type:=1)
If maxRows = 0 Then Exit Sub

ReDim outputLines(maxRows - 1) As String
p = InStrRev(inputFile, ".")
part = 0
n = 0

Set FSO = CreateObject("Scripting.FileSystemObject")

Set TSRead = FSO.OpenTextFile(inputFile)

While Not TSRead.AtEndOfStream
outputLines(n) = TSRead.ReadLine
n = n + 1
If n = maxRows Then
part = part + 1
outputFile = Left(inputFile, p - 1) & " PART" & part & Mid(inputFile, p)
Set TSWrite = FSO.CreateTextFile(outputFile, True)
TSWrite.Write Join(outputLines, vbCrLf)
TSWrite.Close
ReDim outputLines(maxRows - 1) As String
n = 0
End If
Wend

TSRead.Close


If n > 0 Then
ReDim outputlines2(n - 1) As String
For i = 0 To n - 1
outputlines2(i) = outputLines(i)
Next
part = part + 1
outputFile = Left(inputFile, p - 1) & " PART" & part & Mid(inputFile, p)
Set TSWrite = FSO.CreateTextFile(outputFile, True)
TSWrite.Write Join(outputlines2, vbCrLf)
TSWrite.Close
End If

MsgBox "Done"

End Sub

June7
09-20-2023, 08:54 AM
Save the header row to a variable that can be called to build the output text.

Could you attach sample text file?

dodonohoe
09-20-2023, 12:22 PM
I can't attach the sample file, it keeps saying "File Invalid" even though it is a plain .txt file. The heading that I want repeated on each new page is actually line two on the page. Here is a mock up of the data.

H|2022_246|METLIFE|Ash|2020829||202928|4|200705 10:45
C|Project Code|Record Identifier|Client Id|NIPR #|Salutation|First Name|Middle Name|Last Name|Name Suffix|Address Line1|Address Line2|Address Line3|Address Line4|City|State|Zip|Country Name|Method Of Delivery|Postage Rate Code|Email Address|Email Template Code|Number of Additional Copies|Printing Location|Document List|Product List
D|2022_246|1|20000000002011|||Mark|M|DAUNT||261 STREET||||NORTh|PA|7661637|US|P|3|||0|ML_DPL|36074|PREFPREMIERR_MET
D|2022_246|1|20000000002012|||Mary|M|DAUNT||261 STREET||||NORTh|PA|7661637|US|P|3|||0|ML_DPL|36074|PREFPREMIERR_MET
D|2022_246|1|20000000002013|||May|M|DAUNT||261 STREET||||NORTh|PA|7661637|US|P|3|||0|ML_DPL|36074|PREFPREMIERR_MET
D|2022_246|1|20000000002014|||Mitch|M|DAUNT||261 STREET||||NORTh|PA|7661637|US|P|3|||0|ML_DPL|36074|PREFPREMIERR_MET
D|2022_246|1|20000000002015|||Mark|M|DAUNT||261 STREET||||NORTh|PA|7661637|US|P|3|||0|ML_DPL|36074|PREFPREMIERR_MET
D|2022_246|1|20000000002011|||Maven|M|DAUNT||261 STREET||||NORTh|PA|7661637|US|P|3|||0|ML_DPL|36074|PREFPREMIERR_MET
D|2022_246|1|20000000002016|||Tim|M|DAUNT||261 STREET||||NORTh|PA|7661637|US|P|3|||0|ML_DPL|36074|PREFPREMIERR_MET
D|2022_246|1|20000000002011|||Tom|M|DAUNT||261 STREET||||NORTh|PA|7661637|US|P|3|||0|ML_DPL|36074|PREFPREMIERR_MET
D|2022_246|1|20000000002011|||Tiny|M|DAUNT||261 STREET||||NORTh|PA|7661637|US|P|3|||0|ML_DPL|36074|PREFPREMIERR_MET
D|2022_246|1|20000000002011|||Trish|M|DAUNT||261 STREET||||NORTh|PA|7661637|US|P|3|||0|ML_DPL|36074|PREFPREMIERR_MET
D|2022_246|1|20000000002011|||Trevor|M|DAUNT||261 STREET||||NORTh|PA|7661637|US|P|3|||0|ML_DPL|36074|PREFPREMIERR_MET
D|2022_246|1|20000000002011|||Tina|M|DAUNT||261 STREET||||NORTh|PA|7661637|US|P|3|||0|ML_DPL|36074|PREFPREMIERR_MET
D|2022_246|1|20000000002011|||Terry|M|DAUNT||261 STREET||||NORTh|PA|7661637|US|P|3|||0|ML_DPL|36074|PREFPREMIERR_MET
D|2022_246|1|20000000002011|||Trev|M|DAUNT||261 STREET||||NORTh|PA|7661637|US|P|3|||0|ML_DPL|36074|PREFPREMIERR_MET
D|2022_246|1|20000000002011|||Chad|M|DAUNT||261 STREET||||NORTh|PA|7661637|US|P|3|||0|ML_DPL|36074|PREFPREMIERR_MET
D|2022_246|1|20000000002011|||Chuck|M|DAUNT||261 STREET||||NORTh|PA|7661637|US|P|3|||0|ML_DPL|36074|PREFPREMIERR_MET

June7
09-20-2023, 02:10 PM
Should be able to attach up to 2mb zip file. However, posted lines are enough to test with.

Consider:


...
Dim strH As String
...
Set TSRead = FSO.OpenTextFile(inputFile)
strH = TSRead.readline & vbCrLf
strH = strH & TSRead.readline & vbCrLf
...
TSWrite.Write strH & Join(outputLines, vbCrLf)
...
TSWrite.Write strH & Join(outputlines2, vbCrLf)
...

If you really don't want to include line 1 in subsequent pages, that gets a bit more complicated. Consider:


...
Dim strH1 As String, strH2 As String
...
Set TSRead = FSO.OpenTextFile(inputFile)
strH1 = TSRead.readline & vbCrLf
strH2 = TSRead.readline & vbCrLf
...
TSWrite.Write IIf(part = 1, strH1, "") & strH2 & Join(outputLines, vbCrLf)
...
TSWrite.Write strH2 & Join(outputlines2, vbCrLf)
...

dodonohoe
09-20-2023, 02:44 PM
Hi June7,

In this instance, line 1 can be deleted immediately and then line 2 (which has the headings) appears at the top of very new file created, if that makes sense?

Cheers,
Des

Aussiebear
09-20-2023, 02:56 PM
Here is the file. I have no idea why it wouldn't accept the file.

June7
09-20-2023, 03:55 PM
If you don't want line 1 even on the first part, then just modify code to not use strH1 in the Write line


TSWrite.Write strH2 & Join(outputLines, vbCrLf)

arnelgp
09-20-2023, 11:01 PM
you may also try this:


Public Sub Split_Text_File()




Dim FSO As Object
Dim TSRead As Object, TSWrite As Object
Dim inputFile As Variant, outputFile As String
Dim part As Long, i As Long, n As Long, p As Long
Dim maxRows As Long
Dim outputlines() As String

'arnelgp
Dim hdr(1) As String, sValue As String

inputFile = Application.GetOpenFilename(Title:="Select a text file to be split into separate parts")
If inputFile = False Then Exit Sub




maxRows = Application.InputBox("Max number of lines/rows?", Type:=1)
If maxRows = 0 Then Exit Sub

ReDim outputlines(maxRows + 2 - 1) As String
p = InStrRev(inputFile, ".")
part = 0
n = 0

Set FSO = CreateObject("Scripting.FileSystemObject")

Set TSRead = FSO.OpenTextFile(inputFile)

'get the header
Do While Not TSRead.AtEndOfStream
sValue = TSRead.ReadLine
hdr(n) = sValue
n = n + 1
If n > 1 Then
Exit Do
End If
Loop
n = 0
'loop till end of file
While Not TSRead.AtEndOfStream
outputlines(n + 2) = TSRead.ReadLine
n = n + 1
If n = maxRows Then
part = part + 1
outputlines(0) = hdr(0): outputlines(1) = hdr(1)
outputFile = Left(inputFile, p - 1) & " PART" & part & Mid(inputFile, p)
Set TSWrite = FSO.CreateTextFile(outputFile, True)
TSWrite.Write Join(outputlines, vbCrLf)
TSWrite.Close
ReDim outputlines(maxRows + 2 - 1) As String
n = 0
End If
Wend

TSRead.Close




If n > 0 And Len(Trim$(outputlines(n - 1) & "")) <> 0 Then
ReDim outputlines2(n + 2 - 1) As String
outputlines2(0) = hdr(0): outputlines2(1) = hdr(1)
For i = 0 To n - 1
outputlines2(i + 2) = outputlines(i)
Next
part = part + 1
outputFile = Left(inputFile, p - 1) & " PART" & part & Mid(inputFile, p)
Set TSWrite = FSO.CreateTextFile(outputFile, True)
TSWrite.Write Join(outputlines2, vbCrLf)
TSWrite.Close
End If

MsgBox "Done"

End Sub

Paul_Hossler
09-21-2023, 07:23 AM
Here is the file. I have no idea why it wouldn't accept the file.

I made a TXT file and when I tried to upload it as a TXT file

31060


it fails

31059

You have to fool the system and call it a CSV

dodonohoe
09-25-2023, 03:40 AM
Hi June 7, I tried testing there and I might not have been clear on what I am looking to do. I want the first line completely ignored. The 2nd line is the header and I want that repeated at the top of every new file created if that makes sense?

dodonohoe
09-25-2023, 03:44 AM
Hi Arnelgp, I tested this and it leaves out one record. It also leaves in the first line which I want ignored as it is not a heading (I probably didn't explain that properly. What would also be super useful to the user is if a 2nd pop up box prompted them to tell the macro at what line the heading was as this will be variable depending on which file is being processed e.g Some files may have the heading on line 10 and lines 1-9 can be ignored. Thanks



you may also try this:


Public Sub Split_Text_File()




Dim FSO As Object
Dim TSRead As Object, TSWrite As Object
Dim inputFile As Variant, outputFile As String
Dim part As Long, i As Long, n As Long, p As Long
Dim maxRows As Long
Dim outputlines() As String

'arnelgp
Dim hdr(1) As String, sValue As String

inputFile = Application.GetOpenFilename(Title:="Select a text file to be split into separate parts")
If inputFile = False Then Exit Sub




maxRows = Application.InputBox("Max number of lines/rows?", Type:=1)
If maxRows = 0 Then Exit Sub

ReDim outputlines(maxRows + 2 - 1) As String
p = InStrRev(inputFile, ".")
part = 0
n = 0

Set FSO = CreateObject("Scripting.FileSystemObject")

Set TSRead = FSO.OpenTextFile(inputFile)

'get the header
Do While Not TSRead.AtEndOfStream
sValue = TSRead.ReadLine
hdr(n) = sValue
n = n + 1
If n > 1 Then
Exit Do
End If
Loop
n = 0
'loop till end of file
While Not TSRead.AtEndOfStream
outputlines(n + 2) = TSRead.ReadLine
n = n + 1
If n = maxRows Then
part = part + 1
outputlines(0) = hdr(0): outputlines(1) = hdr(1)
outputFile = Left(inputFile, p - 1) & " PART" & part & Mid(inputFile, p)
Set TSWrite = FSO.CreateTextFile(outputFile, True)
TSWrite.Write Join(outputlines, vbCrLf)
TSWrite.Close
ReDim outputlines(maxRows + 2 - 1) As String
n = 0
End If
Wend

TSRead.Close




If n > 0 And Len(Trim$(outputlines(n - 1) & "")) <> 0 Then
ReDim outputlines2(n + 2 - 1) As String
outputlines2(0) = hdr(0): outputlines2(1) = hdr(1)
For i = 0 To n - 1
outputlines2(i + 2) = outputlines(i)
Next
part = part + 1
outputFile = Left(inputFile, p - 1) & " PART" & part & Mid(inputFile, p)
Set TSWrite = FSO.CreateTextFile(outputFile, True)
TSWrite.Write Join(outputlines2, vbCrLf)
TSWrite.Close
End If

MsgBox "Done"

End Sub

June7
09-25-2023, 10:41 AM
Yes, I understand and my code allows line 1 to be bypassed and is not used in the output as instructed in post 7. I am sure there is more than one way to accomplish 'bypassing' line 1.

Header not located on line 2 is another issue altogether.

Why would you expect or depend on user to know which line header is on? This just allows for human error and corrupted output. Will header always be same text? Arnel has right idea to pull header row. Here is my version - tested and works.


'get the header
Do While Not TSRead.AtEndOfStream
strH = TSRead.readline & vbCrLf
If Left(strH, 1) = "C" Then Exit Do
Loop

dodonohoe
09-28-2023, 08:59 AM
Yes, I understand and my code allows line 1 to be bypassed and is not used in the output as instructed in post 7. I am sure there is more than one way to accomplish 'bypassing' line 1.

Header not located on line 2 is another issue altogether.

Why would you expect or depend on user to know which line header is on? This just allows for human error and corrupted output. Will header always be same text? Arnel has right idea to pull header row. Here is my version - tested and works.


'get the header
Do While Not TSRead.AtEndOfStream
strH = TSRead.readline & vbCrLf
If Left(strH, 1) = "C" Then Exit Do
Loop



Hi June 7, thats perfect and thanks for the help.

Aussiebear
09-28-2023, 01:25 PM
For future reference, please note that the Admin for the site has not allowed txt files to be added.