View Full Version : extract specific data from a txt file
rafi_07max
01-19-2011, 11:20 PM
I would like to extract specific words from a txt file into the spreadsheet
 
Look for the line that start with “test”, and once detected 
Extract the next word besides “test” and place it onto column A of sheet 1
Take a look at the image below:
http://www.iimmgg.com/image/5008a60a4c5f11897beaba709f2806bf
 
The final outcome will look something like the image below:
http://www.iimmgg.com/image/3057f59b3bd2952f3c2aff3a17c81e58
 
I have attached a sample txt file to extract
I have also attached a sample workbook 
Sheet 1 contains a button to insert the codes and it also shows how the data looks like once extracted.
5271
Unfortunately, many systems block personal storage or image storage sites.  Could you tell us what specific words we are looking for; are we ripping the entire line if we find a certain word, etc?
rafi_07max
01-20-2011, 05:20 AM
Hi GTO, thanks for the reply.
I have uploaded the two pics in zip file below. 
Do take a look.
5272
Kenneth Hobs
01-20-2011, 06:19 AM
Private Sub CommandButton1_Click()
  Dim inFile As String, fni As Integer, str As String
  Dim sTest As String, s() As String, sTab() As String
  Dim r As Range
  
  inFile = ThisWorkbook.Path & "\test.txt"
  sTest = "test"
  
  Set r = Range("A" & Rows.Count).End(xlUp).Offset(1)
  
  'Iterate through the infile by each line.
  fni = FreeFile
  Open inFile For Input As #fni
  'Line Input #fni, str
  Do While Not EOF(fni)
    Line Input #fni, str
    s() = Split(str)
    If UBound(s) = -1 Then GoTo skip
    If s(0) = sTest Then
      sTab() = Split(s(1), vbTab)
      r.Value = Replace(sTab(0), """", "")
      Set r = r.Offset(1)
    End If
skip:
  Loop
  Close #fni
End Sub
rafi_07max
01-20-2011, 07:09 AM
Thanks Kenneth Hobs for your help.
It works, but it is not extracting certain line
e.g.
 
"   test "VOUT1_Low""
Eventhough “test” is the first word of the line, there are empty spaces before the word “test”, I think your codes assume the empty spaces to be the first word of the line and thus will ignore this line.
 
Is it possible to solve this problem?
Okay, is that the only exception?  Disregard non-printing characters, but 'test' must be the first word>
Kenneth Hobs
01-20-2011, 07:55 AM
Change the line after Line Input to:
s() = Split(Trim(str))
I would certainly try Kenneth's first.  Just as I was too obstinate to give up on a RegExp... (Oy vey! I really need to break down and buy a book!)
 
Option Explicit
    
Sub exa()
Dim REX                         As Object ' RegExp
Dim FSO                         As Object ' FileSystemObject
Dim fsoStream                   As Object ' TextStream
Dim sLineText                   As String
Dim i                           As Long
Dim x                           As Long
Dim y                           As Long
Dim aryTemp(1 To 1, 1 To 40000) As String '<oversize last dimension
Dim aryOutput                   As Variant
    
    Set FSO = CreateObject("Scripting.FileSystemObject")
    If FSO.FileExists(ThisWorkbook.Path & "\test.txt") Then
        Set fsoStream = FSO.OpenTextFile(ThisWorkbook.Path & "\test.txt", 1, False, -2)
    Else
        Exit Sub
    End If
    
    Set REX = CreateObject("VBScript.RegExp")
    With REX
        .Global = False
        .IgnoreCase = True
        .MultiLine = False
        .Pattern = "^(\s*\btest\b\s+)(\S+)"
    End With
     
     With fsoStream
        Do While Not .AtEndOfStream
            sLineText = .ReadLine
            If REX.Test(sLineText) Then
                i = i + 1
                aryTemp(1, i) = REX.Execute(sLineText)(0).SubMatches(1)
            End If
        Loop
        .Close
    End With
    
    ReDim aryOutput(1 To i, 1 To 1)
    For x = 1 To i
        aryOutput(x, 1) = aryTemp(1, x)
    Next
    
    Range("A1").Resize(i).Value = aryOutput
End Sub
 
Hi Ken :-)
 
Mark
rafi_07max
01-22-2011, 02:01 AM
Thanks kenneth Hobs and GTO. Thanks for your time and help. Both the codes works :)
Powered by vBulletin® Version 4.2.5 Copyright © 2025 vBulletin Solutions Inc. All rights reserved.