Consulting

Results 1 to 4 of 4

Thread: Data consolidation from multiple PDFs to excel

  1. #1

    Data consolidation from multiple PDFs to excel

    Dear readers,

    I created the profile here today because I’m completely lost with power query. I have some beginner knowledge when it comes to VBA, but this problem just stumped me…

    I would appreciate any help or an advice on where can I learn more about this topic

    Essentially, what I’m trying to do is: I have a folder with PDF documents, that every morning I name with todays date and giving them a unique identifier (9 files a day) so they would look something like this 04.17.2022 – 0001, 04.17.2022 – 0002, and so on, what I’m trying to do is have these 9 documents import under 1 consolidated table.

    To be honest I don’t know if its even possible to have excel find the 9 documents of a given date in a large folder and consolidate them…

    Thank you for reading,
    Sincerely,

  2. #2
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,875
    Say a bit more about what you data you want to get from the .pdf files, how you want it arranged (mock up an Excel file of what you want to see and attach it here?), but most important of all is to attach 2 or 3 .pdf files here so we can see how consistent they are and how easy/difficult it is to extract data from.
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

  3. #3
    Hey, thank you for your reply,

    I'm sorry I haven't responded earlier, I was trying to figure it out myself, but I kept failing...
    I found a few YouTube videos that pointed me in the right direction

    So, first probably with format, on average I receive 5-9 PDF's a day they are exactly the same in terms of format: they have 9 columns, number of rows vary from 50 to 300. PDFs have a varying number of pages and each page has the first row with the names of each column. I've attached a sample of the combined PDF document which consists of 3 PDF's of the sample report.

    My main problem is that on daily basis I need to open each one, convert it into excel, format it, and paste it into a consolidated excel document.

    My thought process (which may be completely wrong) is:

    1) In VBA I use an option to open a folder and select the needed PDF documents.
    2) Use power query to add one new column that will have the name of the file from which the data was extracted in each row (each row will have the date/unique identifier)
    3) Paste the created table into a different excel document where all the other data has been posted from the previous dates.

    My biggest challenge is that I don't understand how to make power query to analyze multiple PDF's simultaneously and post the data extracted one after the other

    Again, thank you for your reply.
    Attached Images Attached Images
    Attached Files Attached Files

  4. #4
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,875
    Are you sure that pdf file you sent is correct? The 3rd page is a duplicate of most of the transactions on the 2nd page.

    Send 2 or 3 different pdfs so we can experiment with multiple files.
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •