Consulting

Results 1 to 13 of 13

Thread: extract only specific words/lines from a txt file

  1. #1

    extract only specific words/lines from a txt file

    For this thread I would like to know once I opened a txt file instead of opening the entire contents from the file, I would like my macro to

    1) Look for “Sub Pre_Shorts” and “subend”. “Sub Pre_Shorts” is the starting point and “subend” is ending point. I need to extract information data that are between these two points
    (I managed to do this, take a look at my attached file)

    Once locate these two points, need to do the following steps:

    2) Look for lines that start with “!”, if the next word is something like, test "analog/….” Or test "Digital/….” Get their information onto the spreadsheet. For e.g. if the line is

    !#@ test "analog/l1" on boards BoardSet_boards_1_to_2(*)

    - “!” should be column A, “#@” not needed
    - “analog ” in column B
    - “l1 ” in column C
    - The rest of the words in the line not needed

    3) Look for lines that start with “test” , if the next word is something like, "analog/….” Or "Digital/….” Get their information onto the spreadsheet. For e.g. if the line is

    test "digital/u3" on boards BoardSet_boards_1_to_2(*) !ok

    - “Digital ” in column B
    - “u3 ” in column C
    - The rest of the words in the line not needed

    This is only extracting data for Sub Pre_Short s. There is still few other “Sub …. “. So once the macro complete the task for “Sub Pre_Shorts” . It should look for the other “Sub …” and repeat step 2 and 3. Below is the name of the other “Sub .. “ names which need to look for,

    Start Point End Point

    Start Point --------------------------------- End Point
    Sub Analog_Tests --------------------------- subend
    sub BScan_Powered_Shorts_Tests ----------- subend
    sub BScan_Interconnect_Tests -------------- subend
    sub BScan_Incircuit_Tests ------------------- subend
    sub BScan_Silicon_Nails_Tests --------------- subend
    sub Digital_Tests ---------------------------- subend
    sub Functional_Tests ------------------------ subend
    sub Analog_Functional_Tests ----------------- subend

    Inside the txt file (attached) there is other names starting with “Sub..", e.g. “sub Connect_Check” those are not needed to extract, so can ignore.

    Important: If one of the start point cannot be found in the txt file, the macro should ignore that start point and should continue with the other start points, (e.g. if “Sub Analog_Tests” is not in the txt file, just ignore it and continue with the rest of the start points).

    Right now, I am stuck at step 2 and 3, and also not sure how to loop so that once it complete one start point it goes on with another start point.
    I have attached some pictures to illustrate more clearly what i want to achieve.

    http://www.iimmgg.com/image/374f1100...74d58449af0f9f
    http://www.iimmgg.com/image/63416a9a...723a5f266d09fa
    http://www.iimmgg.com/image/53dfadfb...a4178f460e7bba
    http://www.iimmgg.com/image/025563ee...24c79f3661ce97

    I have attached sample txt file
    I have also attached a workbook, Sheet1 consist a button with my initial codes and sheet2 is what my desired output should look like
    Attachment 4853

  2. #2
    VBAX Expert Tinbendr's Avatar
    Joined
    Jun 2005
    Location
    North Central Mississippi (The Pines)
    Posts
    993
    Location
    I think you should have continued this in the other thread, or at least provided a link as this thread is very much related.

    This is certainly becoming more challenging.

  3. #3
    VBAX Expert Tinbendr's Avatar
    Joined
    Jun 2005
    Location
    North Central Mississippi (The Pines)
    Posts
    993
    Location
    Give this a try.

    This assumes that Column C data will always start with "/" and end with a quote.

    [vba]Option Explicit
    Option Compare Text
    Private Sub CommandButton1_Click()

    Dim fs As Object, a As Object
    Dim t As String
    Dim MyFile As String
    Dim i As Long
    Dim Sub2Srch() As String
    Dim lCount As Long
    Dim Col3 As Long
    Dim EndQuote As Long

    Columns("A:A").Select
    Selection.ClearContents
    Range("A1").Select

    Sub2Srch = Split("sub analog_tests,sub bscan_powered_shorts_tests,sub bscan_interconnect_tests," & _
    "sub bscan_incircuit_tests,sub bscan_silicone_nails_tests,sub digital_tests," & _
    "sub functional_tests,sub analog_functional_Tests", ",", , vbTextCompare)
    MyFile = Application.GetOpenFilename("Text Files,*.txt")
    If MyFile = "False" Then Exit Sub

    Set fs = CreateObject("Scripting.FileSystemObject")
    For lCount = 0 To UBound(Sub2Srch)

    On Error GoTo Skip
    Set a = fs.openTextFile(MyFile)

    Do
    t = a.readline
    Loop Until InStr(1, t, Sub2Srch(lCount)) > 0

    i = 1

    Do
    t = a.readline
    If Left(t, 1) = "!" Then
    If InStr(t, "test") Then
    If InStr(t, "analog") Then
    Cells(i, 1) = "!"
    Cells(i, 2) = "Analog"

    Col3 = InStr(t, "/")
    If Col3 <> 0 Then
    EndQuote = InStr(Col3, t, Chr(34))
    If EndQuote <> 0 Then
    Cells(i, 3) = Mid(t, Col3, EndQuote - Col3)
    End If
    End If
    i = i + 1
    End If

    If InStr(t, "digital") Then
    Cells(i, 1) = "!"
    Cells(i, 2) = "Digital"

    Col3 = InStr(t, "/") + 1
    If Col3 <> 0 Then
    EndQuote = InStr(Col3, t, Chr(34))
    If EndQuote <> 0 Then
    Cells(i, 3) = Mid(t, Col3, EndQuote - Col3)
    End If
    End If
    i = i + 1
    End If
    End If
    End If

    Loop Until InStr(1, t, "subend") > 0
    Skip:
    a.Close
    Next
    End Sub
    [/vba]

    David

  4. #4
    hi Tinbendr, thanks for your help, but your codes didn't do all the steps i mentioned. If all the steps were are done successfully, there should be over 500 of lines extracted into the spreadsheet but right now there is only 50 plus of lines
    I have attached the workbook wit your codes. Do take a look.Thank you.
    Attachment 4855

  5. #5
    Administrator
    VP-Knowledge Base
    VBAX Grand Master mdmackillop's Avatar
    Joined
    May 2004
    Location
    Scotland
    Posts
    14,489
    Location
    If you cannot find the problem, I suggest you post the text file. Without this, it will be difficult to assist.

    Edit
    I see it is in the initial zip file.
    MVP (Excel 2008-2010)

    Post a workbook with sample data and layout if you want a quicker solution.


    To help indent your macros try Smart Indent

    Please remember to mark threads 'Solved'

  6. #6
    Administrator
    VP-Knowledge Base VBAX Grand Master mdmackillop's Avatar
    Joined
    May 2004
    Location
    Scotland
    Posts
    14,489
    Location
    If you copy and paste your text file in sheet 3, this version of David's code will show used lines and highlight processed lines on the text file.

    Note that the i=1 line was moves to prevent it resetting.

    [vba]
    Option Explicit
    Option Compare Text
    Private Sub CommandButton1_Click()

    Dim fs As Object, a As Object
    Dim t As String
    Dim MyFile As String
    Dim i As Long
    Dim Sub2Srch() As String
    Dim lCount As Long
    Dim Col3 As Long
    Dim EndQuote As Long
    Dim x As Long

    Columns("A:A").Select
    Selection.ClearContents
    Range("A1").Select

    Sub2Srch = Split("sub analog_tests,sub bscan_powered_shorts_tests,sub bscan_interconnect_tests," & _
    "sub bscan_incircuit_tests,sub bscan_silicone_nails_tests,sub digital_tests," & _
    "sub functional_tests,sub analog_functional_Tests", ",", , vbTextCompare)
    MyFile = Application.GetOpenFilename("Text Files,*.txt")
    If MyFile = "False" Then Exit Sub

    Set fs = CreateObject("Scripting.FileSystemObject")
    i = 1
    For lCount = 0 To UBound(Sub2Srch)
    x = 0
    On Error GoTo Skip
    Set a = fs.openTextFile(MyFile)

    Do
    t = a.readline
    x = x + 1
    Loop Until InStr(1, t, Sub2Srch(lCount)) > 0



    Do
    t = a.readline
    x = x + 1
    If Left(t, 1) = "!" Then
    If InStr(t, "test") Then
    If InStr(t, "analog") Then

    Cells(i, 1) = "!"
    Cells(i, 2) = "Analog"
    Cells(i, 4) = t
    Sheets(3).Cells(x, 1).Interior.ColorIndex = 6
    Sheets(3).Cells(x, 2) = "Analog"
    Col3 = InStr(t, "/")
    If Col3 <> 0 Then
    EndQuote = InStr(Col3, t, Chr(34))
    If EndQuote <> 0 Then
    Cells(i, 3) = Mid(t, Col3, EndQuote - Col3)
    End If
    End If
    i = i + 1
    End If

    If InStr(t, "digital") Then
    Cells(i, 1) = "!"
    Cells(i, 2) = "Digital"
    Cells(i, 4) = t
    Sheets(3).Cells(x, 1).Interior.ColorIndex = 7
    Sheets(3).Cells(x, 2) = "Digital"

    Col3 = InStr(t, "/") + 1
    If Col3 <> 0 Then
    EndQuote = InStr(Col3, t, Chr(34))
    If EndQuote <> 0 Then
    Cells(i, 3) = Mid(t, Col3, EndQuote - Col3)
    End If
    End If
    i = i + 1
    End If
    End If
    End If

    Loop Until InStr(1, t, "subend") > 0
    Skip:
    a.Close
    Next
    End Sub



    [/vba]
    MVP (Excel 2008-2010)

    Post a workbook with sample data and layout if you want a quicker solution.


    To help indent your macros try Smart Indent

    Please remember to mark threads 'Solved'

  7. #7
    VBAX Expert Tinbendr's Avatar
    Joined
    Jun 2005
    Location
    North Central Mississippi (The Pines)
    Posts
    993
    Location
    OK, I see where I goofed.

    Back to the chalk board.

    David

  8. #8
    Thanks MD, your codes did work but u didn't include this part:

    3) Look for lines that start with “test” , if the next word is something like, "analog/….” Or "Digital/….” Get their information onto the spreadsheet. For e.g. if the line is

    test "digital/u3" on boards BoardSet_boards_1_to_2(*) !ok

    - “Digital ” in column B
    - “u3 ” in column C
    - The rest of the words in the line not needed

    I tried to modify your codes and added something like this
    [vba]

    If Left(t, 1) = "test" Then
    'If InStr(t, "test") Then
    If InStr(t, "analog") Then

    Cells(i, 1) = "!"
    Cells(i, 2) = "Analog"
    Cells(i, 4) = t
    Sheets(3).Cells(x, 1).Interior.ColorIndex = 6
    Sheets(3).Cells(x, 2) = "Analog"
    Col3 = InStr(t, "/")
    If Col3 <> 0 Then
    EndQuote = InStr(Col3, t, Chr(34))
    If EndQuote <> 0 Then
    Cells(i, 3) = Mid(t, Col3, EndQuote - Col3)
    End If
    End If
    i = i + 1
    End If
    [/vba]

    But it didn't work. Can u take a look.
    I have attached the workbook with your codes
    Attachment 4858

  9. #9
    VBAX Expert Tinbendr's Avatar
    Joined
    Jun 2005
    Location
    North Central Mississippi (The Pines)
    Posts
    993
    Location
    OK, started over from scratch.

    A couple of observations.

    1) The only difference in 2 & 3 is whether there is a exclamation point in the left column. So, I removed that from the drill down logic and simply added after 'test' was found.

    2) There was one 'Mixed/u31' in the data, so I changed the logic for writing the phrase to the column to include it as well.

    I added columns D/E to check to see if the data was coming from the correct Sub.

    I did not check the accuracy of the data extracted. I'll leave that to you.

    Also, you said in your earlier post that there should be about 500 entries, but this version extracts 743. Is it pulling too many?

    Drop this in a module.

    Hope this helps!
    David

    [vba]Sub ParseFlie2()
    Dim WB As Workbook
    Dim WS As Worksheet
    Dim LastRow As Long

    Dim FName As String
    Dim FNum As Long
    Dim Sub2Srch() As String
    Dim LineStr As String
    Dim lCount As Long
    Dim RowCount As Long
    Dim DataCount As Long
    Dim Phrase() As String
    Dim StartQuote As Long
    Dim EndQuote As Long
    Dim B As Long

    FName = Application.GetOpenFilename("Text Files,*.txt")
    If FName = "False" Then Exit Sub

    Set WB = ActiveWorkbook
    Set WS = WB.Worksheets(1)

    LastRow = Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
    WS.Range("A1:E" & LastRow).Clear

    Sub2Srch = Split("sub Analog_Tests,sub bscan_powered_shorts_tests,sub bscan_interconnect_tests," & _
    "sub bscan_incircuit_tests,sub bscan_silicone_nails_tests,sub digital_tests," & _
    "sub functional_tests,sub analog_functional_Tests", ",", , vbTextCompare)

    RowCount = 1: DataCount = 1

    FNum = FreeFile

    For lCount = 0 To UBound(Sub2Srch)
    On Error GoTo skip
    Open FName For Input As FNum

    Line Input #FNum, LineStr

    Do Until EOF(FNum)
    DataCount = DataCount + 1
    If InStr(1, UCase(LineStr), UCase(Sub2Srch(lCount))) > 0 Then
    Do Until InStr(LCase(LineStr), "subend") > 0
    ' test "analog/c1"
    If InStr(1, LineStr, "test") Then
    If Left(LineStr, 1) = "!" Then
    WS.Range("A" & RowCount) = "!"
    End If
    StartQuote = InStr(1, LineStr, Chr(34)) + 1
    EndQuote = InStr(StartQuote, LineStr, Chr(34))
    Phrase = Split(Mid(LineStr, StartQuote, EndQuote - StartQuote), "/")
    WS.Range("B" & RowCount) = Phrase(0)
    WS.Range("C" & RowCount) = Phrase(1)
    'Extra columns for testing
    WS.Range("d" & RowCount) = Sub2Srch(lCount)
    DataCount = DataCount + 1
    WS.Range("E" & RowCount) = DataCount
    RowCount = RowCount + 1

    End If
    Line Input #FNum, LineStr
    Loop
    End If
    Line Input #FNum, LineStr
    Loop
    skip:
    Close #FNum
    Next

    End Sub

    [/vba]

  10. #10
    hi Tinbendr, thanks for your help. But this time round i can't even open the file.
    I getting an error called: object variable or With block is not set
    and when i press debug, it highlighted

    [VBA]
    LastRow = Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
    [/VBA]

    I have attached th workbook with your codes in it.

    Attachment 4859

  11. #11
    VBAX Expert Tinbendr's Avatar
    Joined
    Jun 2005
    Location
    North Central Mississippi (The Pines)
    Posts
    993
    Location
    Agh!

    Sorry about that.

    Change the line to
    [vba]LastRow = WS.Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row[/vba]

  12. #12
    Thanks Tinbendr, it worked. One last question, is it possible, if for e.g. "Sub Analog_Tests" is missing in my txt file, will the codes able to ignore it and continue with the rest of sub and extract theie details.
    I tried on error resume next but it gives error message.

  13. #13
    VBAX Expert Tinbendr's Avatar
    Joined
    Jun 2005
    Location
    North Central Mississippi (The Pines)
    Posts
    993
    Location
    Quote Originally Posted by rafi_07max
    is it possible, if for e.g. "Sub Analog_Tests" is missing in my txt file, will the codes able to ignore it and continue with the rest of sub and extract the details.
    No. If the sub title is missing, it will not extract any details.
    I tried on error resume next but it gives error message.
    So are you telling me that it errors on your end when a Sub that is listed in the Sub2Srch is missing? I'd have to see the spreadsheet, because there are other Sub titles in the list that aren't in the text file either, and they don't error.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •