PDA

View Full Version : dealing with text files



BrianMH
03-28-2011, 04:37 AM
Hi,

I currently work for a company that receives files in multiple formats. Our team converts these files so that they can be read by our main database and the appropriate action taken on the accounts. Conversion of these files is easy; however, effectively capturing them can sometimes prove difficult.

Each file comes in as a CSV. It has a header with file details and a footer advising how many 'flows' are included in the file.

Each action that needs to be taken in our system is identified by its flow type, and each flow carries the information required for that action.

An example of this is:

"T01",1177,"SR","P",20110325,201304,1949
"T09",7793168,"7065406546","1322092","EH32","9GL",20110420,20110325,201304
"T09",7793201,"7065406912","999454","LE67","3BB",20110419,20110325,201304
"T09",7793204,"7456718602","521167","LN4","2UQ",20110419,20110325,201304
"T17",7716326,"","7481238806","718409","LE11","5DY",20110406,"GUK"
"T17",7716363,"","7489999306","448977","NG16","",20110406,"GUK"
"T17",7719301,"","7487155304","434805","MK19","7EG",20110415,"BGT"
"T17",7719302,"","7486688211","460905","NG24","4JT",20110415,"BGT"
"T16",7727512,"15695509","31547","DUMM","7485299102",1339046,"ATC0008637","E9701","1BB","1","5","S80","2LQ","736404",2000,5,"CRED","M","32","U6","UGI","N",20110406,"",20110325,201304,"N","SOP","6734"
"T16",7727513,"15696777","31490","MORR","7486281604",1339416,"ATC0032402","E9701","4BS","1","21","DE11","0QB","3499",2003,5,"CRED","M","32","G4","ACT","N",20110406,"",20110325,201304,"N","BGT","20245"
"T99",11

Currently, to work with this data, our team runs a macro to import these files into a spreadsheet; we then copy each line into an Access database table for each different flow or action type.

Sometimes an action has further information in new lines with a related code.

For instance
T44, data, data, data
T01, reason for t44
T01, further reason for t44
T01, further reason for t44

This is industry data, so I cannot get the incoming format redesigned, but I would like to handle it programmatically. I have tried importing this data directly into Access in the past, but because the columns can hold different data types (the flows are of different lengths), the import often errors or silently drops some of the data. Another issue is that any automatic sorting of a table destroys the structure and relationships. For example, the T44 above needs the T01 rows directly after it, as there is nothing to discern their relationship other than their position in the file.
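If the file is read line by line in code, the ordering problem can also be tackled by stamping each row with its position in the file and remembering the last parent flow seen. The following is only a rough sketch (it assumes, as in the second example above, that T01 rows are continuation rows for the flow above them; the file path and the Debug.Print action are placeholders):

```vba
Dim LineData As String
Dim FlowType As String
Dim LineNum As Long       ' running position in the file
Dim ParentLine As Long    ' line number of the last parent flow seen

Open "C:\TextFile.txt" For Input As #1
Do While Not EOF(1)
    Line Input #1, LineData
    LineNum = LineNum + 1
    ' First field is the flow type; strip its surrounding quotes
    FlowType = Replace(Left(LineData, InStr(LineData, ",") - 1), """", "")
    If FlowType = "T01" Then
        ' Continuation row: link it back to the preceding parent flow
        Debug.Print "Line " & LineNum & " belongs to parent at line " & ParentLine
    Else
        ParentLine = LineNum   ' remember this flow as the current parent
    End If
Loop
Close #1
```

Stored alongside each imported row, LineNum and ParentLine act as keys that preserve the original file order, so sorting the table can no longer break the relationship.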

I am looking for some suggestions as to how I can read these text files directly and how I can break them down to fields.

Thanks for any advice, links, etc in advance.

kenhigg
03-29-2011, 09:04 AM
Brian,

You should be able to attach to the text file, loop through with recordsets and parse them out with logic code...

Hope that makes sense :)

ken

dicepackage
03-30-2011, 10:11 AM
Here is a rough example. This code will need to be tweaked a bit to work on your files. It runs through the file line by line; each line is read into the String variable LineData, and from LineData each specific field is pulled into its own variable. Every line is then added to a table by using DoCmd.RunSQL to insert it.


Dim LineData As String ' Variable to take in one full line at a time
Dim cncurrent As ADODB.Connection
Set cncurrent = CurrentProject.Connection
Open "C:\TextFile.txt" For Input As #1 ' Open the text file for reading
Dim FIELD1 As String
Dim FIELD2 As String
Dim FIELD3 As String
Dim FIELD4 As String
Do While Not EOF(1)
    Line Input #1, LineData ' Read one line of data
    FIELD1 = Left(LineData, InStr(LineData, ",") - 1)  ' Text up to (not including) the first comma
    LineData = Mid(LineData, InStr(LineData, ",") + 1) ' Drop that field and its trailing comma
    FIELD2 = Left(LineData, InStr(LineData, ",") - 1)
    LineData = Mid(LineData, InStr(LineData, ",") + 1)
    FIELD3 = Left(LineData, InStr(LineData, ",") - 1)
    FIELD4 = Mid(LineData, InStr(LineData, ",") + 1)   ' Everything after the third comma
    DoCmd.RunSQL "INSERT INTO TABLENAME (FIELD1, FIELD2, FIELD3, FIELD4) " & _
        "VALUES (""" & FIELD1 & """, """ & FIELD2 & """, """ & FIELD3 & """, """ & FIELD4 & """);"
Loop
Close #1 ' Release the file handle

BrianMH
04-11-2011, 02:21 AM
Hi,

Thanks for the above code. I was wondering if there is a prebuilt way to treat the data as CSV, read each line directly into an array of undefined dimensions, and work with the data from there, with the quotes recognised as text qualifiers.

So for instance "T09",7793168,"7065406546","1322092","EH32","9GL",20110420,20110325,201304
would come out as an array of
T09,7793168,7065406546,1322092,EH32,9GL,20110420,20110325,201304
with a UBound of 8, and for a shorter or longer line the array would be sized automatically. Something like
array = Split(string, ",") but also ignoring the quotes.

I see data like this all the time so I would think there is some prebuilt functions to deal with it but perhaps not.

dicepackage
04-11-2011, 06:00 AM
Yes, you can accomplish this by looping while InStr(LineData, ",") > 0 and putting each field into an array. The problem, however, is that when uploading the data into SQL you need to know which fields go where. If the order is inconsistent it will break the uploading process.

BrianMH
04-11-2011, 06:15 AM
Hi,

Thanks for the reply. This will be fine, as I won't upload the whole file into SQL immediately; I will work with each line as I need to and code around that. Thanks for the suggestions so far. Any other ideas on efficiency would be appreciated.

hansup
04-11-2011, 09:54 AM
I think you were heading in the right direction with the Split() function. If the remaining issue is discarding the quotes, you can use the Replace() function to replace quotes with empty strings.

Replace(LineData, """", "")

That was 4 double quotes as the second argument and 2 as the third.

You could do that immediately before calling the Split() function on LineData, or combine Split() and Replace() in one command line.

VarArray = Split(Replace(LineData, """", ""), ",")
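One caveat with the Replace-then-Split approach: it assumes no quoted field ever contains a comma (or an embedded quote). If the feeds can include that, a small hand-rolled splitter that tracks whether it is currently inside quotes is safer. A sketch follows (the function name ParseCsvLine is made up, and doubled "" escape sequences inside fields are not handled):

```vba
Function ParseCsvLine(ByVal LineData As String) As String()
    Dim Fields() As String
    Dim Buf As String
    Dim i As Long, n As Long
    Dim InQuotes As Boolean
    ReDim Fields(0)
    For i = 1 To Len(LineData)
        Select Case Mid(LineData, i, 1)
            Case """"
                InQuotes = Not InQuotes   ' Toggle quoted state; drop the quote itself
            Case ","
                If InQuotes Then
                    Buf = Buf & ","       ' A comma inside quotes is data, not a delimiter
                Else
                    Fields(n) = Buf       ' End of field
                    n = n + 1
                    ReDim Preserve Fields(n)
                    Buf = ""
                End If
            Case Else
                Buf = Buf & Mid(LineData, i, 1)
        End Select
    Next i
    Fields(n) = Buf                       ' Last field on the line
    ParseCsvLine = Fields
End Function
```

It returns a dynamically sized array, so UBound on the result varies with the number of fields on each line, as asked for above.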