Consulting

Results 1 to 5 of 5

Thread: Loop in .txt file to extract recurring strings

  1. #1
    VBAX Regular
    Joined
    Apr 2017
    Posts
    66
    Location

    Loop in .txt file to extract recurring strings

    I'm trying to extract two text strings that will have varying numbers of occurrences in a text file (uploaded a sample with two occurrences - sorry, could not upload a .txt file so put in excel text box).

    The strings I want will always follow certain key words so thinking I will use variations of the code below to select the line contents after the following key words:
    1) "Interest Holder Name:" - I want the bank name that follows this
    2) "Document Reference:" - I want the date that follows this

    strIntHlder = Trim(Split(Split(newFileContent, "Interest Holder Name:")(1), " " & vbCrLf)(0))
         
     strDocRef = Trim(Split(Split(newFileContent, "Document Reference:")(1), " " & vbCrLf)(0))
    A couple of issues:

    1) I want to store all occurrences of the text strings (there can be several), can I use a loop to do this in a text file?

    2) The key words will also occur in other areas in the document that I do not want to reference (not shown in the attachment). So, can I isolate a loop range - maybe to a range defined by the first and last text lines shown in the attachment?

    Sorry, could not upload a .txt file so put in excel text box.

    Thanks for any assistance
    Attached Files Attached Files
    Last edited by BrI; 07-11-2017 at 01:36 PM.

  2. #2
    Administrator
    VP-Knowledge Base
    VBAX Grand Master mdmackillop's Avatar
    Joined
    May 2004
    Location
    Scotland
    Posts
    14,489
    Location
    FYI, You can attach a zip file if you can't post a specific file type.

    Give this a try.
    Sub Test()
    Dim objFSO As Object
    Dim objTF As Object
    Dim strIn 'As String
    Dim txt
    Dim Pth
    
    Pth = "C:\VBAX\Bank.txt"   'Change to suit
    
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objTF = objFSO.OpenTextFile(Pth, 1)
    strIn = objTF.readall
    
    
    strt = InStr(1, strIn, "RECORDED INTERESTS AND INSTRUMENTS")
    endd = InStr(1, strIn, "NON-ENABLING INSTRUMENTS")
    txt = Mid(strIn, strt, endd - strt)
    
    strIntHlder = Split(txt, "Interest Holder Name:")
    For i = LBound(strIntHlder) To UBound(strIntHlder) - 1
    Cells(i + 1, 1) = Trim(Split(Split(txt, "Interest Holder Name:")(i + 1), " " & vbCrLf)(0))
    Next i
    
    strDocRef = Split(txt, "Document Reference:")
    For i = LBound(strDocRef) To UBound(strDocRef) - 1
    Cells(i + 1, 2) = Split(Trim(Split(txt, "Document Reference:")(i + 1)), " ")(0)
    Next i
    
    Cells.Replace What:="Qualifier:", Replacement:="", LookAt:=xlPart
    
    End Sub
    MVP (Excel 2008-2010)

    Post a workbook with sample data and layout if you want a quicker solution.


    To help indent your macros try Smart Indent

    Please remember to mark threads 'Solved'

  3. #3
    VBAX Regular
    Joined
    Apr 2017
    Posts
    66
    Location
    Excellent! Tested on several docs and working perfectly - will also be very helpful in other situations. Really appreciate your help.

    One question, is there a reasonably easy way to extract the date from the "Document Reference:" line rather than the document number?

    For example, your code currently extracts "75680398987" (note, number of digits will vary in this number) from the line below - could it be modified to extract "2004-06-15" instead?

    Document Reference: 75680398987 2004-06-15 12:27:39

  4. #4
    Administrator
    VP-Knowledge Base VBAX Grand Master mdmackillop's Avatar
    Joined
    May 2004
    Location
    Scotland
    Posts
    14,489
    Location
    Sub Test()
    Dim objFSO As Object
    Dim objTF As Object
    Dim strIn 'As String
    Dim txt
    
    
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objTF = objFSO.OpenTextFile("C:\VBAX\Bank.txt", 1)
    strIn = objTF.readall
    
    
    strt = InStr(1, strIn, "RECORDED INTERESTS AND INSTRUMENTS")
    endd = InStr(1, strIn, "NON-ENABLING INSTRUMENTS")
    
    
    txt = Mid(strIn, strt, endd - strt)
    
    
    Do Until InStr(txt, "  ") = 0       ' Loop until there are no more double spaces
            txt = Replace(txt, "  ", " ")   ' Replace 2 spaces with 1 space
    Loop
    
    
    strIntHlder = Split(txt, "Interest Holder Name:")
    For i = LBound(strIntHlder) To UBound(strIntHlder) - 1
    Cells(i + 1, 1) = Trim(Split(Split(txt, "Interest Holder Name:")(i + 1), " " & vbCrLf)(0))
    Next i
    
    
    strDocRef = Split(txt, "Document Reference:")
    For i = LBound(strDocRef) To UBound(strDocRef) - 1
    Cells(i + 1, 2) = Split(Trim(Split(txt, "Document Reference:")(i + 1)), " ")(0)
    Cells(i + 1, 3) = Split(Trim(Split(Trim(Split(txt, "Document Reference:")(i + 1)), vbCrLf)(0)), " ")(1)
    
    
    Next i
    
    
    Cells.Replace What:="Qualifier:", Replacement:="", LookAt:=xlPart
    
    
    End Sub
    MVP (Excel 2008-2010)

    Post a workbook with sample data and layout if you want a quicker solution.


    To help indent your macros try Smart Indent

    Please remember to mark threads 'Solved'

  5. #5
    VBAX Regular
    Joined
    Apr 2017
    Posts
    66
    Location
    Yes! Working on all my tests. Can't thank you enough for applying your expertise to this, a great help. Really outstanding!!!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •