Log in

View Full Version : Parsing textfile



arm
08-21-2009, 03:23 AM
Hi,

Is there any easy way to read and parse file which contains information delimited (by ~) this way:

---
111111~X~Some text. Some text. Some text.~
222222~X~Some text. Some text. Some text.
Some text. Some text. Some text.~
333333~X~~
444444~X~Some text. Some text.
~
---

So, I need to get that number, X string and text for variables, for example:

varA = 111111
varB = X
varC = Some text. Some text. Some text.

Thanks!

OBP
08-21-2009, 09:26 AM
Yes you use 3 for next loops, 1 for looping through the Files Lines (records) and the other 2 for parsing the String.
The first loop is use to find the first position of the "~" and the second to find the next "~" starting from the position of the first one.
You can however use the Instr() function to find the first one and then use that with Left() to obtain the varA.
The data between that "~" and the next will be varB using the mid() function.

geekgirlau
08-24-2009, 07:14 PM
Dim strStart As String
Dim strVar() As String


strStart = "111111~X~Some text. Some text. Some text.~"
strVar = Split(strStart, "~")

OBP
08-25-2009, 04:43 AM
geekgirlau, thanks, I forgot about the split() function

arm
08-25-2009, 01:14 PM
Thanks for help!

Especially split() function looks handy.

I'm no a programmer, so I still need help. It would be easy to read string "111111~X~Some text. Some text. Some text.~" line by line, and put it to a string and use split().

But, I'm bit confused how to read file if sometimes text string (varC in example) is divided into several lines?

geekgirlau
08-25-2009, 09:19 PM
I'm assuming that you are reading the text from a text file currently - is this the case? Can you attach a sample file?

arm
08-25-2009, 10:36 PM
Hi geekgirlau,

I have a text file (over 2MB) which I use to update Access database. I try to read it, parse fields and put them to database.

From this file, I need to parse records which consists of three fields as (in example):
-> 111111 (6 number, key value)
-> X (1 char)
-> Some text. Some text. Some text. Bla bla bla. (Memo)

In the file, each records are delimited by "~" just as described in first message:
---
111111~X~Some text. Some text. Some text.~
222222~X~Some text. Some text. Some text. Line feed.
Some text. Some text. Some text.~
333333~X~~
444444~X~Some text. Some text. Line feed.
~
555555~X~Some text. Some text. Bla bla blaa. Line feed.
Bla bla blaa. Line feed.
Bla bla blaa. Line feed.
Bla bla blaa. Like this way.~
666666~X~Some text. Some text. Bla bla blaa.~
---


New record is started always in the beginning of line as "111111~X~", but my problem is, that third string (for memo-field in database) may be divided into several lines including line feeds (or may be empty "~~").

For my brain, it's too complicated, how to read this file, parse each three field records and put them to a database (this is OK). And which is fastest way, file consists thousands of lines. (Comparising old fields in the database and adding only changed fields from the file may be too complicated?)

Thanks for help!

OBP
08-26-2009, 04:38 AM
That is not too complicated for VBA, just a case of looping trhough the records and putting the data in to the Table using a VBA Recordset.
I am sure if you email geekgirlau of myself a copy of the file and your table layout we can creat something for you.

geekgirlau
08-26-2009, 05:09 PM
Why don't you post a small sample text file? Just 100 or so records, with some containing a line feed as in your example.

arm
08-26-2009, 10:16 PM
That file is formed as described before.

geekgirlau
08-27-2009, 05:57 PM
There can be a difference in the character used for the line feed, which is why I requested a sample.

geekgirlau
08-27-2009, 06:36 PM
Sub ReadFile()
Dim strLine As String
Dim strConcat As String
Dim strSQL As String
Dim strVar() As String
Dim i As Integer

Const cstrText = "C:\Data Extract\SampleFile.txt"


On Error GoTo ErrHandler

If Dir(cstrText, vbNormal) = "" Then
MsgBox "Text file not found", vbInformation, "File Not Found"
Else
' read text file, line by line
Open cstrText For Input As #1

Do While Not EOF(1)
' need to clear variables in case of blank fields
ReDim strVar(0 To 2) As String

Input #1, strLine

' cope with line breaks in text file
If Right(strLine, 1) <> "~" Then
strConcat = strConcat & strLine & Chr(10)
Else
strConcat = strConcat & strLine
strVar = Split(strConcat, "~")
strConcat = ""

strSQL = "INSERT INTO MyTable (Field1, Field2, Field3 ) " & _
"SELECT " & strVar(0) & " AS Fld1, " & _
Chr(34) & strVar(1) & Chr(34) & " AS Fld2, " & _
Chr(34) & strVar(2) & Chr(34) & " AS Fld3"

DoCmd.SetWarnings False
DoCmd.RunSQL strSQL
DoCmd.SetWarnings True
End If
Loop
End If

ExitHere:
On Error Resume Next
Close #1
Exit Sub
ErrHandler:
MsgBox Err.Description, vbCritical, "Unexpected error (" & Err.Number & ")"
Resume ExitHere
End Sub

arm
08-31-2009, 03:25 AM
geekgirlau, many thanks for code! It helps a lot.

But, I have a problem with strVar(2) because it contains bad characters for SQL as ", gives error code 3075.

Unfortunately I can't send a sample from the file.

geekgirlau
09-01-2009, 12:28 AM
Try changing strVar to a variant