PDA

View Full Version : [SOLVED] Match FileNames and then Merge those PDFs



branston
11-21-2019, 11:28 AM
Hi

I am trying to merge pdf files that have the same name in two different folders and output the merged files into a 3rd folder. I can merge files in VBA but my issue is merging the matching name pdfs so it's the matching part I am struggling with. I have looked online and there's lots of 'merging pdf files' but can't quite find what I need for the matching bit. Ken's post (Similar to Ken's post here : http://www.vbaexpress.com/forum/archive/index.php/t-51366.html ) is the closest I have found to this but still cannot merge only matching file names.

So current folders are

FolderA
(contains) File1.pdf, File2.pdf, File3.pdf, File4.pdf, File5.pdf

FolderB
(contains) File1.pdf, File2.pdf, File3.pdf, File4.pdf, File5.pdf



Desired output should be

FolderC (create if doesn't exist)
File1merged.pdf, File2merged.pdf, File3merged.pdf, File4merged.pdf, File5merged.pdf

I have pdftk installed and am using that at the moment.
https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/


Thanks for any help.

paulked
11-21-2019, 12:10 PM
I can merge files in VBA


Post your code that you do this with and I'll see if I can adapt it.

branston
11-21-2019, 01:52 PM
Thanks paulked.

This will merge pdf files in various folders and output the results in a new MergedPDFs folder (thanks to Ken for the help). Hope it helps you somewhat.

E.g. run of code below

Folder1(contains File1.pdf, File2.pdf, File3.pdf, File4.pdf, File5.pdf)
Folder2(contains File6.pdf, File7.pdf, File8.pdf, File9.pdf, File10.pdf)
Folder3(contains File11.pdf, File12.pdf, File13.pdf, File14.pdf, File15.pdf)
etc.

Output is a:
MergedPDFs Folder(containing Folder1MergedFiles.pdf, Folder2MergedFiles.pdf,
Folder3MergedFiles.pdf) etc.







Sub CombineFiles()
Dim a, f, i As Long, p As String
Dim p2 As String, r As String, fso As Object

'Parent folder
p = ThisWorkbook.Path & ""

'Folder to copy merged pdfs in subfolders to, p2 initional, and r actual.
p2 = p & "MergedPDFs"
If Dir(p2, vbDirectory) = "" Then MkDir p2
'Make a new folder in p2 to store this run's merged pdf files.

'SubFolders Array
a = aFiles(p, "/ad", True)

Do
i = i + 1
r = p2 & "\Run" & i & ""
Loop Until Dir(r, vbDirectory) = ""
MkDir r

Set fso = CreateObject("Scripting.FileSystemObject")

'SubFolders Array
f = Split(CreateObject("Wscript.Shell").Exec("cmd /c dir " & _
"""" & p & """" & " /ad/b/s").StdOut.ReadAll, vbCrLf)
'Add parent folder to f:
f(UBound(f)) = Left(p, Len(p) - 1)

'Merge pdfs in subfolders, save merged file in r folder with subfolder's name.pdf.
For i = 0 To UBound(f)
a = aFiles(f(i) & "", "*.pdf", False)
If a(1) <> "" And InStr(f(i), p2) = 0 Then
pdftkMerge a, r & fso.GetFolder(f(i)).Name & ".pdf" 'Acrobat
'PDFCreatorCombine a, r & fso.GetFolder(f(i)).Name & ".pdf" 'PDFCreator
'pdftkMerge a, r & fso.GetFolder(f(i)).Name & ".pdf" 'pdftk
End If
Next i
Set fso = Nothing
MsgBox "PDF files merged to folder: " & r
End Sub
'https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
Sub pdftkMerge(arrayPDFs, pdfOut As String)
Dim a, i As Long
a = arrayPDFs
For i = LBound(a) To UBound(a)
a(i) = """" & a(i) & """"
Next i

Shell "pdftk " & Join(a, " ") & " cat output " & """" & pdfOut & """", vbHide
End Sub

Function aFiles(strDir As String, searchTerm As String, _
Optional SubFolders As Boolean = True)
Dim fso As Object
Dim strName As String
Dim i As Long
ReDim strArr(1 To Rows.Count)

'strDir must not have a trailing \ for subFolders=True
If Right(strDir, 1) <> "" Then strDir = strDir & ""

'Exit if strDir does not exist
If Dir(strDir, vbDirectory) = "" Then Exit Function

Let strName = Dir$(strDir & searchTerm)
Do While strName <> vbNullString
Let i = i + 1
Let strArr(i) = strDir & strName
Let strName = Dir$()
Loop
Set fso = CreateObject("Scripting.FileSystemObject")
'Strip trailing \ if subFolders=False
If SubFolders = False Then strDir = Left(strDir, Len(strDir) - 1)
Call recurseSubFolders(fso.GetFolder(strDir), strArr, i, searchTerm)
Set fso = Nothing
If i = 0 Then i = 1 'Returns one empty array element in strArr
ReDim Preserve strArr(1 To i)
aFiles = strArr
End Function

Private Sub recurseSubFolders(ByRef Folder As Object, _
ByRef strArr, _
ByRef i As Long, _
ByRef searchTerm As String)
Dim SubFolder As Object
Dim strName As String
For Each SubFolder In Folder.SubFolders
Let strName = Dir$(SubFolder.Path & "" & searchTerm)
Do While strName <> vbNullString
Let i = i + 1
Let strArr(i) = SubFolder.Path & "" & strName
Let strName = Dir$()
Loop
recurseSubFolders SubFolder, strArr, i, searchTerm
Next
End Sub

paulked
11-21-2019, 03:54 PM
Sub MergePDFs()
Dim dA$, dB$, dC$, PDFa$, PDFb$, strDir$, i&, j&, in1, in2
dA = "C:\Temp\A\"
dB = "C:\Temp\B\"
dC = "C:\Temp\C\"
If Dir(dC, vbDirectory) = "" Then MkDir (dC)
strDir = Dir(dA & "*.pdf")
i = 1
j = 1
Do While Len(strDir) > 0
If i = 1 Then
PDFa = dA & strDir
Else
PDFa = PDFa & " " & dA & strDir
End If
strDir = Dir
i = i + 1
Loop
strDir = Dir(dB & "*.pdf")
Do While Len(strDir) > 0
If j = 1 Then
PDFb = dB & strDir
Else
PDFb = PDFb & " " & dB & strDir
End If
strDir = Dir
j = j + 1
Loop
If i <> j Then MsgBox "The number of PDF's in A & B don't match!": Exit Sub
in1 = Split(PDFa)
in2 = Split(PDFb)
For i = LBound(in1) To UBound(in1)
'From the website
'pdftk in1.pdf in2.pdf cat output out1.pdf
Shell "c:\pdftk " & in1(i) & " " & in2(i) & " cat output " & dC & i + 1 & ".pdf", vbHide
Next
End Sub


I haven't been able to test the merging because i don't have any pdf tools!

Since posting I've downloaded the PDFtk and tested it... it does work!

branston
11-22-2019, 04:17 PM
Hi paulked

The code runs but I am getting nothing as an output

Is my line correct here? I'm assuming this is the location of pdftk.exe?



For i = LBound(in1) To UBound(in1)
'From the website
'pdftk in1.pdf in2.pdf cat output out1.pdf
Shell "C:\Program Files (x86)\PDFtk\bin\Pdftk.exe " & in1(i) & " " & in2(i) & " cat output " & dC & i + 1 & ".pdf", vbHide
Next

paulked
11-23-2019, 04:05 AM
It is correct. PDFtk needs reference to the libiconv2.dll file... I had to copy that file from the PDFtk installation directory to C:\Temp\ to make it work.

branston
11-23-2019, 07:21 AM
Hi

Have tried that but no joy.

For some reason, and I don't know why, but I think there is an issue with the files in




dA = "C:\Temp\A\"
dB = "C:\Temp\B\"



I think the code thinks that there are no files in these directories hence no output. Of course there is and the pdf files are in A and B folders.

Is it possible to check here




Do While Len(strDir) > 0
If i = 1 Then
PDFa = dA & strDir
Else
PDFa = PDFa & " " & dA & strDir
End If
strDir = Dir
i = i + 1
Loop



if pdf files exist in Folder A and if not get a message "no pdf files exist in this directory". That way I can narrow down where the issue is and if no message pops up then we know the issue is something else.

Thanks for your help.

paulked
11-23-2019, 07:44 AM
Sub MergePDFs() Dim dA$, dB$, dC$, PDFa$, PDFb$, strDir$, i&, j&, in1, in2
dA = "C:\Temp\A\"
dB = "C:\Temp\B\"
dC = "C:\Temp\C\"
If Dir(dC, vbDirectory) = "" Then MkDir (dC)
strDir = Dir(dA & "*.pdf")
i = 1
j = 1
If Len(strDir) = 0 Then MsgBox "No files in " & dA
Do While Len(strDir) > 0
If Right(strDir, 3) = "pdf" Or Right(strDir, 3) = "PDF" Then
If i = 1 Then
PDFa = dA & strDir
Else
PDFa = PDFa & " " & dA & strDir
End If
strDir = Dir
i = i + 1
End If
Loop
strDir = Dir(dB & "*.pdf")
If Len(strDir) = 0 Then MsgBox "No files in " & dB
Do While Len(strDir) > 0
If Right(strDir, 3) = "pdf" Or Right(strDir, 3) = "PDF" Then
If j = 1 Then
PDFb = dB & strDir
Else
PDFb = PDFb & " " & dB & strDir
End If
strDir = Dir
j = j + 1
End If
Loop
If i <> j Then MsgBox "The number of PDF's in A & B don't match!": Exit Sub
in1 = Split(PDFa)
in2 = Split(PDFb)
For i = LBound(in1) To UBound(in1)
'pdftk in1.pdf in2.pdf cat output out1.pdf
Shell "c:\pdftk " & in1(i) & " " & in2(i) & " cat output " & dC & i + 1 & ".pdf", vbHide
Next
End Sub

Use F8 to step through the code and hover over PDFa or PDFb to see the file name.

I've also put a check in for pdf files.

branston
11-23-2019, 08:54 AM
Thanks paulked.

Get no error message and if I remove pdf files from one of the folders I do get the error message "no files in.."

But what's really strange is that when stepping through the code … hovering over for e.g. dA … I get dA = "". Same for dB (dB ="").

Totally baffled now!

paulked
11-23-2019, 09:07 AM
Example:

If you hover over dA when the row is highlighted in yellow, it will be "", once you hit F8 and highlight the next line then hover over dA it will show the value ("C:\Temp\A" unless you have changed these to suit your folders).

Are you using the code as supplied, or have you modified anything? If you've made changes then post all the code and I'll run through it.

branston
11-23-2019, 09:24 AM
I've attached a pic to show you the hovering output. Also one of folders.

code is below. I moved all of my files to C:\Temp\ (created a Temp directory) just to keep things consistent with your code. Didn't change much apart from where my pdftk.exe sits .... near the end of the code.




Sub MergePDFs()
Dim dA$, dB$, dC$, PDFa$, PDFb$, strDir$, i&, j&, in1, in2
dA = "C:\Temp\A\"
dB = "C:\Temp\B\"
dC = "C:\Temp\C\"
If Dir(dC, vbDirectory) = "" Then MkDir (dC)
strDir = Dir(dA & "*.pdf")
i = 1
j = 1
If Len(strDir) = 0 Then MsgBox "No files in " & dA
Do While Len(strDir) > 0
If Right(strDir, 3) = "pdf" Or Right(strDir, 3) = "PDF" Then
If i = 1 Then
PDFa = dA & strDir
Else
PDFa = PDFa & " " & dA & strDir
End If
strDir = Dir
i = i + 1
End If
Loop
strDir = Dir(dB & "*.pdf")
If Len(strDir) = 0 Then MsgBox "No files in " & dB
Do While Len(strDir) > 0
If Right(strDir, 3) = "pdf" Or Right(strDir, 3) = "PDF" Then
If j = 1 Then
PDFb = dB & strDir
Else
PDFb = PDFb & " " & dB & strDir
End If
strDir = Dir
j = j + 1
End If
Loop
If i <> j Then MsgBox "The number of PDF's in A & B don't match!": Exit Sub
in1 = Split(PDFa)
in2 = Split(PDFb)
For i = LBound(in1) To UBound(in1)
'pdftk in1.pdf in2.pdf cat output out1.pdf
Shell "C:\Program Files (x86)\PDFtk\bin\Pdftk.exe " & in1(i) & " " & in2(i) & " cat output " & dC & i + 1 & ".pdf", vbHide
Next
End Sub

paulked
11-23-2019, 09:37 AM
25480
As I said above, hit F8 to go to next line, then look at the value... that line of code hasn't executed yet!

Step through the code with F8 to make sure your files are being picked up.

branston
11-23-2019, 09:52 AM
Sorry ... was stuck on the yellow line and hadn't realised the code hadn't executed (!) :banghead:

So looping through, the files are being picked apart so any ideas why the merge is not happening?

Please see attached pic. as I am not sure but maybe the issue is here?


Thanks for all your help so far.

paulked
11-23-2019, 10:09 AM
1. Try copying the dll file to C:\Windows\System32
2. What version of windows are you using? It could be something to do with they way Shell is set up (may need to add more quotation marks)

branston
11-23-2019, 10:13 AM
Copying to system32 didn't fix it.

Windows 10 Home edition.

Think you are right about the Shell setup.

Kenneth Hobs
11-23-2019, 10:41 AM
Here is a method that I used.


Sub pdftkMerge(arrayPDFs, pdfOut As String)
Dim a, i As Long
a = arrayPDFs
For i = LBound(a) To UBound(a)
a(i) = """" & a(i) & """"
Next i
Shell "pdftk " & Join(a, " ") & " cat output " & """" & pdfOut & """", vbHide
End Sub

I don't know why DLL file would be an issue.

pdftk installs without the need for a the full path in Shell(). I am not sure what c:\pdftk does. I guess that would work if pdftk was installed in the root folder of c:\. I would not recommend that.

The filename inputs should be delimited with quotes. One could probably use ChDir() for one of the paths but there could still be a space character in the filename.

Maybe:

Dim q As String
q = """"

'Shell "c:\pdftk " & in1(i) & " " & in2(i) & " cat output " & dC & i + 1 & ".pdf", vbHide
Shell "pdftk " & q & in1(i) & q & " " & q & in2(i) & q & " cat output " & q & dC & i + 1 & ".pdf" & q, vbHide

paulked
11-23-2019, 10:42 AM
I'm using Win 10 too, and it works for me.

Try:

Change the Shell line to
Shell "cmd /k " & "c:\pdftk " & in1(i) & " " & in2(i) & " cat output " & dC & i + 1 & ".pdf", vbNormalFocus

When the window opens copy and paste this after the command prompt

pdftk C:\Temp\A\1.pdf C:\Temp\B\1.pdf cat output C:\Temp\C\1.pdf

Hit enter. Does the merge happen?

paulked
11-23-2019, 10:55 AM
Hi Ken, I put the pdftk file in C:\ just temporarily so I could remember where it is!

branston
11-23-2019, 11:08 AM
Hi

Yes it does! .... after entering
pdftk C:\Temp\A\1.pdf C:\Temp\B\1.pdf cat output C:\Temp\C\1.pdf in command prompt.

But of course this relies on actual files names being entered in cmd prompt?

branston
11-23-2019, 11:23 AM
Hi paulked

Was just playing around with this and think may have stumbled across something. As silly as it sounds, when I removed any spaces in the filenames of files in folder A and B, the merge worked. When the full file name is used (with spaces e.g. "File 1") the merge doesn't work. Could it be the spaces !?

paulked
11-23-2019, 11:32 AM
Yes. Try Ken's suggestion, that will enclose the file names in quotes or


Shell "c:\pdftk " & """" & in1(i) & """" & " " & """" & in2(i) & """" & " cat output " & dC & i + 1 & ".pdf"

branston
11-23-2019, 11:57 AM
Hi paulked

Sorry I didn't see Ken's post 16. (Thanks Ken.)

Could you check the code below? I'm not sure how to incorporate Ken's sub routine within your code. So far I have the below (and it's still not working with spaces). Thanks for your patience as I am still making lots of errors but still learning loads.


Sub MergePDFs()
Dim dA$, dB$, dC$, PDFa$, PDFb$, strDir$, i&, j&, in1, in2, q As String
dA = "C:\Temp\A\"
dB = "C:\Temp\B\"
dC = "C:\Temp\C\"
q = """"

If Dir(dC, vbDirectory) = "" Then MkDir (dC)
strDir = Dir(dA & "*.pdf")
i = 1
j = 1
If Len(strDir) = 0 Then MsgBox "No files in " & dA
Do While Len(strDir) > 0
If Right(strDir, 3) = "pdf" Or Right(strDir, 3) = "PDF" Then
If i = 1 Then
PDFa = dA & strDir
Else
PDFa = PDFa & " " & dA & strDir
End If
strDir = Dir
i = i + 1
End If
Loop

strDir = Dir(dB & "*.pdf")
If Len(strDir) = 0 Then MsgBox "No files in " & dB
Do While Len(strDir) > 0
If Right(strDir, 3) = "pdf" Or Right(strDir, 3) = "PDF" Then
If j = 1 Then
PDFb = dB & strDir
Else
PDFb = PDFb & " " & dB & strDir
End If
strDir = Dir
j = j + 1
End If
Loop

If i <> j Then MsgBox "The number of PDF's in A & B don't match!": Exit Sub
in1 = Split(PDFa)
in2 = Split(PDFb)
For i = LBound(in1) To UBound(in1)
'pdftk in1.pdf in2.pdf cat output out1.pdf
'Shell "C:\Program Files (x86)\PDFtk\bin\Pdftk " & """" & in1(i) & """" & " " & """" & in2(i) & """" & " cat output " & dC & i + 1 & ".pdf"
Shell "C:\Program Files (x86)\PDFtk\bin\Pdftk " & q & in1(i) & q & " " & q & in2(i) & q & " cat output " & q & dC & i + 1 & ".pdf" & q, vbHide
Next
End Sub

paulked
11-23-2019, 12:05 PM
Sub MergePDFs()
Dim dA$, dB$, dC$, PDFa$, PDFb$, strDir$, i&, j&, in1, in2
dA = "C:\Temp\A\"
dB = "C:\Temp\B\"
dC = "C:\Temp\C\"
If Dir(dC, vbDirectory) = "" Then MkDir (dC)
strDir = Dir(dA & "*.pdf")
i = 1
j = 1
If Len(strDir) = 0 Then MsgBox "No files in " & dA
Do While Len(strDir) > 0
If Right(strDir, 3) = "pdf" Or Right(strDir, 3) = "PDF" Then
If i = 1 Then
PDFa = dA & strDir
Else
PDFa = PDFa & "," & dA & strDir
End If
strDir = Dir
i = i + 1
End If
Loop
strDir = Dir(dB & "*.pdf")
If Len(strDir) = 0 Then MsgBox "No files in " & dB
Do While Len(strDir) > 0
If Right(strDir, 3) = "pdf" Or Right(strDir, 3) = "PDF" Then
If j = 1 Then
PDFb = dB & strDir
Else
PDFb = PDFb & "," & dB & strDir
End If
strDir = Dir
j = j + 1
End If
Loop
If i <> j Then MsgBox "The number of PDF's in A & B don't match!": Exit Sub
in1 = Split(PDFa, ",")
in2 = Split(PDFb, ",")
Dim strExe
For i = LBound(in1) To UBound(in1)
Shell "pdftk " & """" & in1(i) & """" & " " & """" & in2(i) & """" & " cat output " & dC & i + 1 & ".pdf", 0
Next
End Sub

This code works with files with spaces.

branston
11-23-2019, 12:22 PM
Thanks paulked - that's a great help and yes it does work now.

Last question - ideally I would like the output files to be the same as the name of the two files merged. At the moment I am having to open up the file to see who it belongs to if you know what I mean. Could I add an alternative to 'PDFa' … like 'FileName' at the end of this line here :




Shell "pdftk " & """" & in1(i) & """" & " " & """" & in2(i) & """" & " cat output " & dC & i + 1 & ".pdf", 0


Once again thanks for all your help

paulked
11-23-2019, 03:29 PM
Sub MergePDFs()
Dim dA$, dB$, dC$, PDFa$, PDFb$, strDir$, i&, j&, in1, in2, op$
dA = "C:\Temp\A\"
dB = "C:\Temp\B\"
dC = "C:\Temp\C\"
If Dir(dC, vbDirectory) = "" Then MkDir (dC)
strDir = Dir(dA & "*.pdf")
i = 1
j = 1
If Len(strDir) = 0 Then MsgBox "No files in " & dA
Do While Len(strDir) > 0
If Right(strDir, 3) = "pdf" Or Right(strDir, 3) = "PDF" Then
If i = 1 Then
PDFa = dA & strDir
Else
PDFa = PDFa & "," & dA & strDir
End If
strDir = Dir
i = i + 1
End If
Loop
strDir = Dir(dB & "*.pdf")
If Len(strDir) = 0 Then MsgBox "No files in " & dB
Do While Len(strDir) > 0
If Right(strDir, 3) = "pdf" Or Right(strDir, 3) = "PDF" Then
If j = 1 Then
PDFb = dB & strDir
Else
PDFb = PDFb & "," & dB & strDir
End If
strDir = Dir
j = j + 1
End If
Loop
If i <> j Then MsgBox "The number of PDF's in A & B don't match!": Exit Sub
in1 = Split(PDFa, ",")
in2 = Split(PDFb, ",")
Dim strExe
For i = LBound(in1) To UBound(in1)
op = dC & Mid(in1(i), Len(dA) + 1, Len(in1(i)) - Len(dA) - 4) & "-Merged.pdf"
Shell "pdftk " & """" & in1(i) & """" & " " & """" & in2(i) & """" & " cat output " & """" & op & """", 0
Next
End Sub


This will give the output pdf the filename from A plus -Merged.pdf

branston
11-23-2019, 05:12 PM
Thanks paulked. That's great !

paulked
11-23-2019, 08:18 PM
Glad to have helped. You can mark it Solved now using the Thread Tools (top right).:thumb