[SOLVED:] ORDER OF PDF MERGING - PLEASE HELP



branston
07-14-2021, 03:24 AM
Hi

I have several folders which all contain some pdf files. When running the script (thanks to KenB) to merge all the files inside each folder (and output all merged files to a different folder) the script works fine. The issue I have is that the merge is not merging the pdf files in each folder IN ORDER.

For e.g.

Folder 1 contains files 1.pdf, 2.pdf, 3.pdf, 4.pdf
Merged pdf file (called merged.pdf for folder 1) has files in the order 4.pdf, 2.pdf, 1.pdf, 3.pdf (as one pdf file)

Why is that? I need the merged.pdf output file in the same order as the pdfs in the folders. Can anyone help?

TIA

SamT
07-14-2021, 07:28 AM
The order of files on the Hard Drive is not the order that Explorer shows them in.

You will have to get all the file names, sort them in the order you want, then merge them from the list of Sorted Names.
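As a rough sketch of that idea in VBA: the helper below (SortNames is a made-up name, not part of the posted code) sorts an array of file names in place with a plain case-insensitive text comparison. It assumes a one-dimensional array such as the one aFiles returns for a single folder.

```vb
'Hypothetical helper: sorts a 1-D array of strings in place, ascending,
'using a case-insensitive text comparison (simple insertion sort).
Sub SortNames(ByRef names As Variant)
    Dim i As Long, j As Long, tmp As String
    For i = LBound(names) + 1 To UBound(names)
        tmp = names(i)
        j = i - 1
        'VBA does not short-circuit And, so the bounds check and the
        'comparison are kept in separate statements.
        Do While j >= LBound(names)
            If StrComp(names(j), tmp, vbTextCompare) <= 0 Then Exit Do
            names(j + 1) = names(j)
            j = j - 1
        Loop
        names(j + 1) = tmp
    Next i
End Sub

Sub DemoSortNames()
    Dim a As Variant
    a = Array("4.pdf", "2.pdf", "1.pdf", "3.pdf")
    SortNames a
    Debug.Print Join(a, ", ")   'prints: 1.pdf, 2.pdf, 3.pdf, 4.pdf
End Sub
```

In a merge routine, the call would go between building the array of pdf names and handing it to whatever does the merge.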

Kenneth Hobs
07-14-2021, 10:28 AM
What thread has the code? There are several factors to consider.

1. The method used to do the merge.
2. The order of files sent to merge.
3. Both 1 and 2.

When I do a whole folder, depending on (1), I sometimes sort the array of files. Sorting can be tricky if you have characters and numbers in the filename. Basically, if your order in File Explorer is right, one can use how it was set to set the same order in the sorted array. e.g. Descending filename, Ascending filename, Ascending file modification date, etc...
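To illustrate why mixed numbers and characters are tricky: a plain text comparison looks at one character at a time, so "10.pdf" sorts before "2.pdf". One possible workaround, sketched below, is to compare the leading number first and fall back to text only on a tie. LeadingNumber and CompareNatural are hypothetical names, and the sketch assumes bare file names (no folder path) that start with digits.

```vb
'Hypothetical helper: returns the run of digits at the start of s as a number,
'or 0 if s does not start with a digit.
Function LeadingNumber(ByVal s As String) As Double
    Dim i As Long
    Do While i < Len(s)
        If Not Mid$(s, i + 1, 1) Like "#" Then Exit Do
        i = i + 1
    Loop
    If i > 0 Then LeadingNumber = CDbl(Left$(s, i))
End Function

'Hypothetical helper: compares by leading number first, then by text.
'Returns -1, 0 or 1, like StrComp.
Function CompareNatural(ByVal a As String, ByVal b As String) As Long
    Dim na As Double, nb As Double
    na = LeadingNumber(a)
    nb = LeadingNumber(b)
    If na < nb Then
        CompareNatural = -1
    ElseIf na > nb Then
        CompareNatural = 1
    Else
        CompareNatural = StrComp(a, b, vbTextCompare)
    End If
End Function

Sub DemoCompareNatural()
    Debug.Print StrComp("10.pdf", "2.pdf", vbTextCompare)   'prints -1: text order puts 10 first
    Debug.Print CompareNatural("10.pdf", "2.pdf")           'prints 1: numeric order puts 2 first
End Sub
```

A sort routine would call CompareNatural wherever it would otherwise call StrComp. Another option that avoids custom comparison altogether is to zero-pad the names (01.pdf, 02.pdf, ... 10.pdf) so that text order and numeric order agree.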

branston
07-14-2021, 01:04 PM
Hi Ken

Not 100% sure to be honest as like I said the merge works fine, it's the output pdf file that has the order different to the order of the files in the folder. I suspect it's something to do with filesize, date modified etc.?? but regardless I need the merged files to just be merged in the same order as how they are listed inside each folder.


I've put the merge code below. Had to install some 3rd party software to get it to work.



Sub iSubfolders()
Dim a, f, i As Long, p As String
Dim p2 As String, r As String, fso As Object

'Parent folder
p = ThisWorkbook.Path & "\"
'p = "C:\Users\lenovo1\Dropbox\_Excel\pdf\Acrobat"

'Folder to copy merged pdfs in subfolders to, p2 initial, and r actual.
p2 = p & "MergedPDFs"
If Dir(p2, vbDirectory) = "" Then MkDir p2
'Make a new folder in p2 to store this run's merged pdf files.

'SubFolders Array
a = aFiles(p, "/ad", True)

Do
i = i + 1
r = p2 & "\Run" & i & "\"
Loop Until Dir(r, vbDirectory) = ""
MkDir r

Set fso = CreateObject("Scripting.FileSystemObject")

'SubFolders Array
f = Split(CreateObject("Wscript.Shell").Exec("cmd /c dir " & _
"""" & p & """" & " /ad/b/s").StdOut.ReadAll, vbCrLf)
'Add parent folder to f:
f(UBound(f)) = Left(p, Len(p) - 1)

'Merge pdfs in subfolders, save merged file in r folder with subfolder's name.pdf.
For i = 0 To UBound(f)
a = aFiles(f(i) & "\", "*.pdf", False)
If a(1) <> "" And InStr(f(i), p2) = 0 Then
pdftkMerge a, r & fso.GetFolder(f(i)).Name & ".pdf" 'pdftk
'PDFCreatorCombine a, r & fso.GetFolder(f(i)).Name & ".pdf" 'PDFCreator
'pdftkMerge a, r & fso.GetFolder(f(i)).Name & ".pdf" 'pdftk
End If
Next i
Set fso = Nothing
MsgBox "PDF files merged to folder: " & r
End Sub
'https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
Sub pdftkMerge(arrayPDFs, pdfOut As String)
Dim a, i As Long
a = arrayPDFs
For i = LBound(a) To UBound(a)
a(i) = """" & a(i) & """"
Next i
'Command line options, https://www.pdflabs.com/docs/pdftk-man-page/
'8191 character limit length for command line string.
'Not sure what limit pdftk has, same probably.
Shell "pdftk " & Join(a, " ") & " cat output " & """" & pdfOut & """", vbHide
End Sub
'Similar to: NateO's code, http://www.mrexcel.com/forum/showpost.php?p=1228168&postcount=2
Function aFiles(strDir As String, searchTerm As String, _
Optional SubFolders As Boolean = True)
Dim fso As Object
Dim strName As String
Dim i As Long
ReDim strArr(1 To Rows.Count)

'strDir must not have a trailing \ for subFolders=True
If Right(strDir, 1) <> "\" Then strDir = strDir & "\"

'Exit if strDir does not exist
If Dir(strDir, vbDirectory) = "" Then Exit Function

Let strName = Dir$(strDir & searchTerm)
Do While strName <> vbNullString
Let i = i + 1
Let strArr(i) = strDir & strName
Let strName = Dir$()
Loop
Set fso = CreateObject("Scripting.FileSystemObject")
'Strip trailing \ if subFolders=False
If SubFolders = False Then strDir = Left(strDir, Len(strDir) - 1)
Call recurseSubFolders(fso.GetFolder(strDir), strArr, i, searchTerm)
Set fso = Nothing
If i = 0 Then i = 1 'Returns one empty array element in strArr
ReDim Preserve strArr(1 To i)
aFiles = strArr
End Function




Private Sub recurseSubFolders(ByRef Folder As Object, _
ByRef strArr, _
ByRef i As Long, _
ByRef searchTerm As String)
Dim SubFolder As Object
Dim strName As String
For Each SubFolder In Folder.SubFolders
Let strName = Dir$(SubFolder.Path & "\" & searchTerm)
Do While strName <> vbNullString
Let i = i + 1
Let strArr(i) = SubFolder.Path & "\" & strName
Let strName = Dir$()
Loop
recurseSubFolders SubFolder, strArr, i, searchTerm
Next
End Sub


Thanks for your help

Kenneth Hobs
07-14-2021, 07:15 PM
An easy way to do it is by adding another command line switch to this line:

"""" & p & """" & " /ad/b/s").StdOut.ReadAll, vbCrLf)

to order by date, we use /od. e.g.

"""" & p & """" & " /ad/b/s/od").StdOut.ReadAll, vbCrLf)

to reverse order by date, we use /o-d. e.g.

"""" & p & """" & " /ad/b/s/o-d").StdOut.ReadAll, vbCrLf)

to order by name, we use /on. etc.

You can see those Command Line Interface (CLI) switches/options for the shell's DIR by:
1. Win+R
2. CMD, Enter key
3. help dir, enter key
4. exit, enter key
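The same switches can also be applied to the pdf files themselves rather than to the subfolder list. A minimal sketch, assuming a folder path with no trailing backslash; SortedPdfNames is a hypothetical name, not part of the code posted above:

```vb
'Hypothetical helper: returns the bare .pdf file names in one folder,
'already ordered by name by DIR (/b = bare names, /a-d = files only, /on = order by name).
Function SortedPdfNames(ByVal folderPath As String) As Variant
    Dim out As String
    out = CreateObject("WScript.Shell").Exec( _
        "cmd /c dir """ & folderPath & "\*.pdf"" /b /a-d /on").StdOut.ReadAll
    'DIR ends its output with a line break; drop it so Split
    'does not return an empty last element.
    If Right$(out, 2) = vbCrLf Then out = Left$(out, Len(out) - 2)
    'If no pdf files are found, out is empty and Split returns an
    'empty array (UBound = -1).
    SortedPdfNames = Split(out, vbCrLf)
End Function
```

The result is a zero-based array of names without the folder path, so the path has to be prefixed again before the names are passed to a merge routine. Swapping /on for /od or /o-d gives the date orders described above.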

branston
07-15-2021, 05:03 AM
Ordered by name after adding a 1.,2.,3.,4. etc before each file name so they are kept in the order I need. Then did the merge and all working now. Thanks a lot !

branston
07-15-2021, 11:56 AM
Ok so seem to have a small issue.

If the files are named for e.g. 1.pdf, 4.pdf, 5.pdf, 8.pdf etc. the merge works fine.

If a pdf is named 10.pdf (or 100.pdf) and is included in a batch of pdfs then the order messes up again when the merge is done. I suspect it's something to do with the way 10 and 100 are being read? Any ideas?

TIA