PDA

View Full Version : extract only specific words/lines from a txt file



rafi_07max
11-06-2010, 11:21 PM
For this thread I would like to know once I opened a txt file instead of opening the entire contents from the file, I would like my macro to

1) Look for “Sub Pre_Shorts” and “subend”. “Sub Pre_Shorts” is the starting point and “subend” is ending point. I need to extract information data that are between these two points (I managed to do this, take a look at my attached file)

Once locate these two points, need to do the following steps:

2) Look for lines that start with “!”, if the next word is something like, test "analog/….” Or test "Digital/….” Get their information onto the spreadsheet. For e.g. if the line is

!#@ test "analog/l1" on boards BoardSet_boards_1_to_2(*)

- “!” should be column A, “#@” not needed
- “analog ” in column B
- “l1 ” in column C
- The rest of the words in the line not needed

3) Look for lines that start with “test” , if the next word is something like, "analog/….” Or "Digital/….” Get their information onto the spreadsheet. For e.g. if the line is

test "digital/u3" on boards BoardSet_boards_1_to_2(*) !ok

- “Digital ” in column B
- “u3 ” in column C
- The rest of the words in the line not needed

This is only extracting data for Sub Pre_Short s. There is still few other “Sub …. “. So once the macro complete the task for “Sub Pre_Shorts” . It should look for the other “Sub …” and repeat step 2 and 3. Below is the name of the other “Sub .. “ names which need to look for,

Start Point End Point


Start Point --------------------------------- End Point

Sub Analog_Tests --------------------------- subend
sub BScan_Powered_Shorts_Tests ----------- subend
sub BScan_Interconnect_Tests -------------- subend
sub BScan_Incircuit_Tests ------------------- subend
sub BScan_Silicon_Nails_Tests --------------- subend
sub Digital_Tests ---------------------------- subend
sub Functional_Tests ------------------------ subend
sub Analog_Functional_Tests ----------------- subend

Inside the txt file (attached) there is other names starting with “Sub..", e.g. “sub Connect_Check” those are not needed to extract, so can ignore.

Important: If one of the start point cannot be found in the txt file, the macro should ignore that start point and should continue with the other start points, (e.g. if “Sub Analog_Tests” is not in the txt file, just ignore it and continue with the rest of the start points).

Right now, I am stuck at step 2 and 3, and also not sure how to loop so that once it complete one start point it goes on with another start point.
I have attached some pictures to illustrate more clearly what i want to achieve.

http://www.iimmgg.com/image/374f110056092dd1ac74d58449af0f9f
http://www.iimmgg.com/image/63416a9afcc8b567e8723a5f266d09fa
http://www.iimmgg.com/image/53dfadfb1859afb814a4178f460e7bba
http://www.iimmgg.com/image/025563eecc2759afac24c79f3661ce97

I have attached sample txt file
I have also attached a workbook, Sheet1 consist a button with my initial codes and sheet2 is what my desired output should look like
4853

Tinbendr
11-07-2010, 05:13 AM
I think you should have continued this in the other thread (http://www.vbaexpress.com/forum/showthread.php?t=34781), or at least provided a link as this thread is very much related.

This is certainly becoming more challenging.

Tinbendr
11-07-2010, 06:35 AM
Give this a try.

This assumes that Column C data will always start with "/" and end with a quote.

Option Explicit
Option Compare Text
Private Sub CommandButton1_Click()

Dim fs As Object, a As Object
Dim t As String
Dim MyFile As String
Dim i As Long
Dim Sub2Srch() As String
Dim lCount As Long
Dim Col3 As Long
Dim EndQuote As Long

Columns("A:A").Select
Selection.ClearContents
Range("A1").Select

Sub2Srch = Split("sub analog_tests,sub bscan_powered_shorts_tests,sub bscan_interconnect_tests," & _
"sub bscan_incircuit_tests,sub bscan_silicone_nails_tests,sub digital_tests," & _
"sub functional_tests,sub analog_functional_Tests", ",", , vbTextCompare)
MyFile = Application.GetOpenFilename("Text Files,*.txt")
If MyFile = "False" Then Exit Sub

Set fs = CreateObject("Scripting.FileSystemObject")
For lCount = 0 To UBound(Sub2Srch)

On Error GoTo Skip
Set a = fs.openTextFile(MyFile)

Do
t = a.readline
Loop Until InStr(1, t, Sub2Srch(lCount)) > 0

i = 1

Do
t = a.readline
If Left(t, 1) = "!" Then
If InStr(t, "test") Then
If InStr(t, "analog") Then
Cells(i, 1) = "!"
Cells(i, 2) = "Analog"

Col3 = InStr(t, "/")
If Col3 <> 0 Then
EndQuote = InStr(Col3, t, Chr(34))
If EndQuote <> 0 Then
Cells(i, 3) = Mid(t, Col3, EndQuote - Col3)
End If
End If
i = i + 1
End If

If InStr(t, "digital") Then
Cells(i, 1) = "!"
Cells(i, 2) = "Digital"

Col3 = InStr(t, "/") + 1
If Col3 <> 0 Then
EndQuote = InStr(Col3, t, Chr(34))
If EndQuote <> 0 Then
Cells(i, 3) = Mid(t, Col3, EndQuote - Col3)
End If
End If
i = i + 1
End If
End If
End If

Loop Until InStr(1, t, "subend") > 0
Skip:
a.Close
Next
End Sub


David

rafi_07max
11-07-2010, 07:35 AM
hi Tinbendr, thanks for your help, but your codes didn't do all the steps i mentioned. If all the steps were are done successfully, there should be over 500 of lines extracted into the spreadsheet but right now there is only 50 plus of lines
I have attached the workbook wit your codes. Do take a look.Thank you.
4855

mdmackillop
11-07-2010, 07:53 AM
If you cannot find the problem, I suggest you post the text file. Without this, it will be difficult to assist.

Edit
I see it is in the initial zip file.

mdmackillop
11-07-2010, 08:32 AM
If you copy and paste your text file in sheet 3, this version of David's code will show used lines and highlight processed lines on the text file.

Note that the i=1 line was moves to prevent it resetting.


Option Explicit
Option Compare Text
Private Sub CommandButton1_Click()

Dim fs As Object, a As Object
Dim t As String
Dim MyFile As String
Dim i As Long
Dim Sub2Srch() As String
Dim lCount As Long
Dim Col3 As Long
Dim EndQuote As Long
Dim x As Long

Columns("A:A").Select
Selection.ClearContents
Range("A1").Select

Sub2Srch = Split("sub analog_tests,sub bscan_powered_shorts_tests,sub bscan_interconnect_tests," & _
"sub bscan_incircuit_tests,sub bscan_silicone_nails_tests,sub digital_tests," & _
"sub functional_tests,sub analog_functional_Tests", ",", , vbTextCompare)
MyFile = Application.GetOpenFilename("Text Files,*.txt")
If MyFile = "False" Then Exit Sub

Set fs = CreateObject("Scripting.FileSystemObject")
i = 1
For lCount = 0 To UBound(Sub2Srch)
x = 0
On Error GoTo Skip
Set a = fs.openTextFile(MyFile)

Do
t = a.readline
x = x + 1
Loop Until InStr(1, t, Sub2Srch(lCount)) > 0



Do
t = a.readline
x = x + 1
If Left(t, 1) = "!" Then
If InStr(t, "test") Then
If InStr(t, "analog") Then

Cells(i, 1) = "!"
Cells(i, 2) = "Analog"
Cells(i, 4) = t
Sheets(3).Cells(x, 1).Interior.ColorIndex = 6
Sheets(3).Cells(x, 2) = "Analog"
Col3 = InStr(t, "/")
If Col3 <> 0 Then
EndQuote = InStr(Col3, t, Chr(34))
If EndQuote <> 0 Then
Cells(i, 3) = Mid(t, Col3, EndQuote - Col3)
End If
End If
i = i + 1
End If

If InStr(t, "digital") Then
Cells(i, 1) = "!"
Cells(i, 2) = "Digital"
Cells(i, 4) = t
Sheets(3).Cells(x, 1).Interior.ColorIndex = 7
Sheets(3).Cells(x, 2) = "Digital"

Col3 = InStr(t, "/") + 1
If Col3 <> 0 Then
EndQuote = InStr(Col3, t, Chr(34))
If EndQuote <> 0 Then
Cells(i, 3) = Mid(t, Col3, EndQuote - Col3)
End If
End If
i = i + 1
End If
End If
End If

Loop Until InStr(1, t, "subend") > 0
Skip:
a.Close
Next
End Sub

Tinbendr
11-07-2010, 09:42 AM
OK, I see where I goofed.

Back to the chalk board.

David

rafi_07max
11-07-2010, 06:20 PM
Thanks MD, your codes did work but u didn't include this part:

3) Look for lines that start with “test” , if the next word is something like, "analog/….” Or "Digital/….” Get their information onto the spreadsheet. For e.g. if the line is

test "digital/u3" on boards BoardSet_boards_1_to_2(*) !ok

- “Digital ” in column B
- “u3 ” in column C
- The rest of the words in the line not needed

I tried to modify your codes and added something like this


If Left(t, 1) = "test" Then
'If InStr(t, "test") Then
If InStr(t, "analog") Then

Cells(i, 1) = "!"
Cells(i, 2) = "Analog"
Cells(i, 4) = t
Sheets(3).Cells(x, 1).Interior.ColorIndex = 6
Sheets(3).Cells(x, 2) = "Analog"
Col3 = InStr(t, "/")
If Col3 <> 0 Then
EndQuote = InStr(Col3, t, Chr(34))
If EndQuote <> 0 Then
Cells(i, 3) = Mid(t, Col3, EndQuote - Col3)
End If
End If
i = i + 1
End If


But it didn't work. Can u take a look.
I have attached the workbook with your codes
4858

Tinbendr
11-07-2010, 07:32 PM
OK, started over from scratch.

A couple of observations.

1) The only difference in 2 & 3 is whether there is a exclamation point in the left column. So, I removed that from the drill down logic and simply added after 'test' was found.

2) There was one 'Mixed/u31' in the data, so I changed the logic for writing the phrase to the column to include it as well.

I added columns D/E to check to see if the data was coming from the correct Sub.

I did not check the accuracy of the data extracted. I'll leave that to you.

Also, you said in your earlier post that there should be about 500 entries, but this version extracts 743. Is it pulling too many?

Drop this in a module.

Hope this helps!
David

Sub ParseFlie2()
Dim WB As Workbook
Dim WS As Worksheet
Dim LastRow As Long

Dim FName As String
Dim FNum As Long
Dim Sub2Srch() As String
Dim LineStr As String
Dim lCount As Long
Dim RowCount As Long
Dim DataCount As Long
Dim Phrase() As String
Dim StartQuote As Long
Dim EndQuote As Long
Dim B As Long

FName = Application.GetOpenFilename("Text Files,*.txt")
If FName = "False" Then Exit Sub

Set WB = ActiveWorkbook
Set WS = WB.Worksheets(1)

LastRow = Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
WS.Range("A1:E" & LastRow).Clear

Sub2Srch = Split("sub Analog_Tests,sub bscan_powered_shorts_tests,sub bscan_interconnect_tests," & _
"sub bscan_incircuit_tests,sub bscan_silicone_nails_tests,sub digital_tests," & _
"sub functional_tests,sub analog_functional_Tests", ",", , vbTextCompare)

RowCount = 1: DataCount = 1

FNum = FreeFile

For lCount = 0 To UBound(Sub2Srch)
On Error GoTo skip
Open FName For Input As FNum

Line Input #FNum, LineStr

Do Until EOF(FNum)
DataCount = DataCount + 1
If InStr(1, UCase(LineStr), UCase(Sub2Srch(lCount))) > 0 Then
Do Until InStr(LCase(LineStr), "subend") > 0
' test "analog/c1"
If InStr(1, LineStr, "test") Then
If Left(LineStr, 1) = "!" Then
WS.Range("A" & RowCount) = "!"
End If
StartQuote = InStr(1, LineStr, Chr(34)) + 1
EndQuote = InStr(StartQuote, LineStr, Chr(34))
Phrase = Split(Mid(LineStr, StartQuote, EndQuote - StartQuote), "/")
WS.Range("B" & RowCount) = Phrase(0)
WS.Range("C" & RowCount) = Phrase(1)
'Extra columns for testing
WS.Range("d" & RowCount) = Sub2Srch(lCount)
DataCount = DataCount + 1
WS.Range("E" & RowCount) = DataCount
RowCount = RowCount + 1

End If
Line Input #FNum, LineStr
Loop
End If
Line Input #FNum, LineStr
Loop
skip:
Close #FNum
Next

End Sub

rafi_07max
11-07-2010, 08:25 PM
hi Tinbendr, thanks for your help. But this time round i can't even open the file.
I getting an error called: object variable or With block is not set
and when i press debug, it highlighted


LastRow = Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row


I have attached th workbook with your codes in it.

4859

Tinbendr
11-07-2010, 08:32 PM
Agh!

Sorry about that.

Change the line to
LastRow = WS.Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row

rafi_07max
11-08-2010, 12:13 AM
Thanks Tinbendr, it worked. One last question, is it possible, if for e.g. "Sub Analog_Tests" is missing in my txt file, will the codes able to ignore it and continue with the rest of sub and extract theie details.
I tried on error resume next but it gives error message.

Tinbendr
11-08-2010, 03:29 AM
is it possible, if for e.g. "Sub Analog_Tests" is missing in my txt file, will the codes able to ignore it and continue with the rest of sub and extract the details.No. If the sub title is missing, it will not extract any details.

I tried on error resume next but it gives error message. So are you telling me that it errors on your end when a Sub that is listed in the Sub2Srch is missing? I'd have to see the spreadsheet, because there are other Sub titles in the list that aren't in the text file either, and they don't error.