Consulting

Results 1 to 10 of 10

Thread: macro to extract specific data from text file to excel sheet

Hybrid View

Previous Post Previous Post   Next Post Next Post
  1. #1
    VBAX Regular
    Joined
    May 2017
    Posts
    10
    Location

    macro to extract specific data from text file to excel sheet

    I have a pdf file which has been converted to txt .

    My issue is to extract specific information as per attached which are numbered from 25 to 42. It consists to extract numbered headers with corresponding data .

    I don't know if this achievable with vba code

    note the text file was unable to load , rename as csv , thus rename as txt before opening
    Attached Files Attached Files

  2. #2
    VBAX Expert Leith Ross's Avatar
    Joined
    Oct 2012
    Location
    San Francisco, California
    Posts
    552
    Location
    Hello learn72,

    I do not see any headers that match in these two files. Did you post the correct files?
    Sincerely,
    Leith Ross

    "1N73LL1G3NC3 15 7H3 4B1L17Y 70 4D4P7 70 CH4NG3 - 573PH3N H4WK1NG"

  3. #3
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,844
    This is a horrible task! Surely you can get the data another way?
    In the attached there is a flaky solution; loads of assumptions are made, a few guesses too. The likelihood that it will go wrong and/or give wrong results is high.
    I'm not even sure it's giving you what you need.
    Anyway, I've only done some very cursory checking - I leave that to you.
    On Sheet2 of the attached is a button in the vicinity of cell C8 for you to click. Initially, select the same .txt file you renamed .csv and attached here, in order to duplicate what I was working with and hopefully it won't fall over. It places the results starting at cell A10 of the same sheet.
    Sheet5 of the attached is the output I got using the text file you attached.

    The code's in the file but here it is too:
    Sub blah()
      Set Destn = Sheets("Sheet2").Range("A10")
      fpath = Application.GetOpenFilename("Text Files (*.txt),*.txt")
      If Not fpath = False Then
        zzz = CreateObject("scripting.filesystemobject").opentextfile(fpath).readall
        'zzz = CreateObject("scripting.filesystemobject").opentextfile("C:\Users\mmmmmm\AppData\Local\Temp\data.txt").readall
        yyy = Split(zzz, vbCrLf & vbCrLf & "25")
        For Each rcd In yyy
          If InStr(rcd, "Marks & Numbers") > 0 Then
            'Stop
            *** = Split(rcd, vbCrLf & vbCrLf)
            For i = LBound(***) To UBound(***)
              Select Case True
                Case InStr(***(i), "Number and Kind") > 0
                  www = ShortArray(Split(***(i + 1), "  "))
                  B = www(0)
                Case InStr(***(i), "34  FOB Ncy") > 0
                  www = ShortArray(Split(***(i + 1), "  "))
                  L = www(1)
                  M = Application.Trim(Split(www(2), "|")(0))
                  N = Application.Trim(Split(www(3), "|")(0))
                Case InStr(***(i), "37  Other Costs") > 0
                  www = ShortArray(Split(***(i + 1), "  "))
                  O = Application.Trim(Split(www(0), "|")(0))
                  On Error Resume Next
                  P = www(2)
                  Q = www(3)
                  On Error GoTo 0
                Case InStr(***(i), "Description of goods") > 0
                  www = ShortArray(Split(***(i + 1), "  "))
                  C = www(0)
                Case InStr(***(i), "31  Gross mass") > 0
                  www = ShortArray(Split(***(i + 1), "  "))
                  ColmI = www(0)
                  J = www(1)
                  K = Application.Trim(Split(www(2), "|")(0))
                Case InStr(***(i), "28  Cty. Org") > 0
                  www = ShortArray(Split(***(i + 1), "  "))
                  F = www(1)
                  G = www(2)
                  H = www(3)
                Case InStr(***(i), "Marks & Numbers") > 0
                  www = ShortArray(Split(***(i + 1), "  "))
                  D = www(1)
                  ' Debug.Assert D <> "098"
                  E = www(2)
                Case InStr(***(i), "40  Tax") > 0
                  www = ShortArray(Split(***(i + 1), "  "))
                  X = www(UBound(www))
                  'Stop
                  WriteToSheet Destn, Array(A, B, C, D, E, F, G, H, ColmI, J, K, L, M, N, O, P, Q, , , , , , , X)
                  ii = i + 1
                  Do
                    R = Empty: S = Empty: T = Empty: U = Empty: V = Empty: W = Empty:
                    ccc = ShortArray(Split(***(ii), "  "))
                    If UBound(ccc) > 2 Then
                      'Stop
                      R = ccc(0)
                      S = ccc(1)
                      T = ccc(2)
                      U = ccc(3)
                      V = ccc(4)
                      W = ccc(5)
                      WriteToSheet Destn.Offset(, 17), Array(R, S, T, U, V, W)
                      Set Destn = Destn.Offset(1)
                    End If
                    ii = ii + 1
                  Loop Until InStr(***(ii), "Total") > 0 Or UBound(ccc) < 3
              End Select
              'www = Split(thing, "  ")
            Next i
          End If
        Next rcd
      End If
    End Sub
    
    Function ShortArray(myArr)
    ReDim NewArr(LBound(myArr) To UBound(myArr))
    J = 0
    For i = LBound(myArr) To UBound(myArr)
      Z = Application.Trim(myArr(i))
      If Z <> "" Then
        NewArr(J) = Z
        J = J + 1
      End If
    Next i
    ReDim Preserve NewArr(LBound(myArr) To J - 1)
    ShortArray = NewArr
    End Function
    
    Sub WriteToSheet(Dest, myArr)
    Dest.Resize(, UBound(myArr) + 1).Value = myArr
    End Sub
    I note there are instances of *** in the code above, that's where I had three x's!

    There are things to do still; eg. I haven't cleared the variables for each loop iteration (the variable names are the column letters of their destination, except for ColmI because I used i for something else) and this could lead to wrong data/results.
    Actually, after posting I found it does produce wrong data in colums P and Q starting at item no. 99. I'll post an update tomorrow - too tired now.
    Attached Files Attached Files
    Last edited by p45cal; 02-11-2018 at 08:37 PM.
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

  4. #4
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,844
    Quote Originally Posted by p45cal View Post
    Actually, after posting I found it does produce wrong data in colums P and Q starting at item no. 99. I'll post an update tomorrow - too tired now.
    Update attached.
    Attached Files Attached Files
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

  5. #5
    VBAX Regular
    Joined
    May 2017
    Posts
    10
    Location
    Thank you a lot p45cal for the time dedicated for this,it is not an easy task but very tough , I never thought it would be possible. The original file is a pdf file .

    I will make some test ,if any issues I will revert back

    Otherwise can you suggest is there is another technique to extract data from a pdf file .

  6. #6
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,844
    Quote Originally Posted by learn72 View Post
    Otherwise can you suggest is there is another technique to extract data from a pdf file .
    Where has the pdf file come from? How has it been made?
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

  7. #7
    VBAX Regular
    Joined
    May 2017
    Posts
    10
    Location
    I obtained the pdf file from a system but can't do much more .

    Sent message , Hope it helps

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •