PDA

View Full Version : Extract data from text file to Excel sheet



reddragon477
10-05-2017, 06:03 AM
Dear All,


I am new to excel VBA and have been taking some interest to automate few tasks using VBA. I have some pdfs where I need to re-enter the data into the excel sheet manually and later create pdf reports. So first I successfully created a way to convert pdf to text file and now I would like to parse specific data from this text file to a blank excel sheet.


My sample code:



Option Explicit


Private Sub CommandButton1_Click()


Dim myFile As String, text As String, textline As String, posLat As Integer, posLong As Integer
myFile = "C:\test\sample.txt"


Open myFile For Input As #1
Do Until EOF(1)
Line Input #1, textline
text = text & textline
Loop
Close #1


posLat = InStr(text, "Inst. name")
posLong = InStr(text, "Country")


Range("A1").Value = Mid(text, posLat + 49, 22)
Range("A2").Value = Mid(text, posLong + 30, 40)


End Sub



Basically its a tabulated pdf converted to text file and the information on the 1st column is the same for all, but the form/pdf information could vary and I would like to extract this data to the excel sheet. So I tried using



But I have two problems
1) The information could be bigger that the range limit I used in "InStr()".
2) Secondly, certain information like address could be spanning in other lines of the text file.


So I guess I need to use a loop or if statement method. Which I am not sure of. Could someone kindly direct me in the right way for me to extract the data to the excel sheet.

I have attached the sample text file for reference.20582


Thank you all

Kenneth Hobs
10-05-2017, 08:55 AM
Welcome to the forum!

You are bound to run into problems. Address and names can be problems enough. Without a set standard structure, there is no sure way to parse such.

Your Instr() method should work. You would do it similar to what I did using Split(). Another method would be to use Regular Expression object. I am not that good with the patterns or I would show you that. The concept is similar to the other two methods.


Sub Main()
Dim s As String, fn As String, ad As String, a, i As Integer

fn = ThisWorkbook.Path & "\ParseTxtFile.txt"

'https://msdn.microsoft.com/en-us/library/aa265347(v=vs.60).aspx
s = CreateObject("Scripting.FileSystemObject") _
.OpenTextFile(fn, 1, 0).ReadAll

'1 line address
ad = Trim(Split(Split(s, "Address")(1), vbCrLf)(0))
'Debug.Print ad, vbCrLf, vbCrLf

'Address to Contact Phone
ad = Split(Split(s, "Address")(1), "Contact Phone")(0)
a = Split(ad, vbCrLf)
For i = 0 To UBound(a)
a(i) = Trim(a(i))
Next i
Debug.Print Join(a, vbCrLf)
End Sub

Debug.Print shows a Run result in the VBE Immediate window. Click the VBE View menu to enable it if needed. I generally put it below the Code window.

Here is a regex method.

Sub Main2()
Dim s As String, fn As String, ad As String, a, i As Integer
Dim rx As Object, tf As Boolean, m As Object

fn = ThisWorkbook.Path & "\ParseTxtFile.txt"
'https://msdn.microsoft.com/en-us/library/aa265347(v=vs.60).aspx

s = CreateObject("Scripting.FileSystemObject") _
.OpenTextFile(fn, 1, 0).ReadAll

Set rx = CreateObject("VBScript.RegExp")
With rx
'Includes Address and Contact Phone
.Pattern = "Address([\s\S]*?)Contact Phone"
tf = .Test(s)
If tf Then
Set m = .Execute(s)
ad = m(0)
ad = Right(ad, Len(ad) - Len("Address"))
ad = Left(ad, Len(ad) - Len("Contact Phone"))
a = Split(ad, vbCrLf)
For i = 0 To UBound(a)
a(i) = Trim(a(i))
Next i
Debug.Print Join(a, vbCrLf)
End If
End With
End Sub