PDA

View Full Version : Solved: Dynamic field parser/importer



nsaint
04-01-2009, 06:48 AM
I am asking for help from the experts on this forum.

I need to import a CSV file into Excel and do a number of things with the values imported (that is not important right now). The problem I am having is with the import - with the number of fields and the order in which the fields appear in the CSV. In the CSV I need to import, there are 512 fields – more than excel can handle. That is o.k., because I don’t want all of them initially and not the problem I am having. I know how to browse to the file and import the data, skipping the fields I don’t need. The problem is the fields sometimes change position on the line. For instance, Field1 is sometimes first and other times 10th or something else, but it is always called Field1.

So I would like to have the user provide a list of fields to get and then figure out where they are on the line so I can know which to import and which to skip.

It seems like I need to read the first line into an array, search for the items I need, figure out the position they hold and then map that to my ActiveSheet.QueryTables.Add(…..).TextFileColumnDataTypes = Array(1, 9, 9,….) to know which ones I have to skip and which ones I have to read in.

When using two files, Test1.csv and Test2 I want to be able to dynamically find Field10 and get the data in that column (0.92 in this case)


Any help is greatly appreciated.

File One: Test1.csv
Field1,Field2,Field3,Field4,Field5,Field6,Field7,Field8,Field9,Field10
0.92,1.08,0.18,0.17,1.80,0.26,1.91,0.49,1.47,0.92

File One: Test2.csv
Field10,Field4,Field6,Field2,Field3,Field5,Field9,Field8,Field7,Field1
0.92,2.08,0.38,0.11,3.60,1.2,0.91,0.5,1.7,1.2

nsaint
04-01-2009, 07:02 AM
I tried posting this on Mr. Excel, but didn't get any feedback. I prefer this forum anyway as the help has been more responsive.

CreganTur
04-01-2009, 07:14 AM
How are you currently importing the records? Are you using an ADO connection? If so, then you can explicitly set the import order by refering to the field names in the recordset. If rst is the recordset object, then you can refer to the field named 'Field1' like this:
rst!Field1
This will call the value of the field named 'Field1' regardless of its physical position within the recordset.

It's a lot better than referring to a field's index position, which can change.

HTH:thumb

nsaint
04-01-2009, 08:02 AM
I am not using ADO with this import. Not sure how I would do that right now.

Below is a portion of the code where I browse for the file and then use the QuerytablesAdd..

I have used ADO to connect to a db before, but not to import a .csv file.


Dim sh As Worksheet
Dim CompressorFile As String

CompressorFile = Application.GetOpenFilename(FileFilter:="CSV Files (*.csv), *.csv", MultiSelect:=False)
If CompressorFile = "False" Then
Dim Response, Msg, Style, Title
Msg = "Nothing has been selected." & CR & _
"No import will be done" & CRLF & _
"Re-Run Compressor Import!"
Style = vbOK + vbExclamation
Title = "Compressor Log File Import Failure"
Response = MsgBox(Msg, Style, Title)
If Response = vbOK Then
End
End If
End
End If


With ActiveSheet.QueryTables.Add(Connection:= _
"TEXT;" & CompressorFile _
, Destination:=Range("A1"))
.Name = " & CompressorFile & "
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = False
.RefreshOnFileOpen = False
.RefreshStyle = xlOverwriteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = False
.RefreshPeriod = 0
.TextFilePromptOnRefresh = False
.TextFilePlatform = 437
.TextFileStartRow = 1
.TextFileParseType = xlDelimited
.TextFileTextQualifier = xlTextQualifierDoubleQuote
.TextFileConsecutiveDelimiter = False
.TextFileTabDelimiter = True
.TextFileSemicolonDelimiter = False
.TextFileCommaDelimiter = True
.TextFileSpaceDelimiter = False
.TextFileColumnDataTypes = Array(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, _
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 _
, 1, 1, 1, 1, 1, 1, 1, 1)
.TextFileTrailingMinusNumbers = True
.Refresh BackgroundQuery:=False
End With

nsaint
04-01-2009, 08:19 AM
If I used ADO for this, would I construct an SQL "Select" statement?

When I used this with Access my recordset was "Select something From TableName Where match criteria

How would I use this to import a .csv file rst!Field1??

If you could give a brief example of ADO in this context it would help. Also would want to have multiple fields as my recordset for import.

CreganTur
04-01-2009, 08:34 AM
Here's the ADO you would use to connect to a .csv file or other delimited text file:

conn.Open "Provider=Microsoft.Jet.OLEDB.4.0;" _
& "Data Source=" & strFilepath & ";" _
& "Extended Properties=""text;HDR=Yes;FMT=Delimited(,)"";"

Set rst = New ADODB.Recordset
rst.Open "SELECT * FROM [" & strFileName & "];", conn

Now, the strFilePath variable needs to hold the filepath to the Folder you want to look in, not the file itself. strFileName is the actual name of the file you want to open. The above will create a recordset for every record in the file. The HDR=Yes extended property means that the first line of the file contains field headers. This is how fields are created within a recordset- it does it itself. It uses the same names in the file to create the recordset's field names.

You could refine the rst.Open SQL statement by adding in a WHERE clause, or any other standard SQL keywords.

HTH:thumb

nsaint
04-01-2009, 09:11 AM
This is great and I will start playing with this. The FilePath and FileName information. I need this to be dynamic as it may be in different places depending on when it is used or called upon. That is the reason I used the "Application.GetOpenFilename()" call above, but it always returns the fully qualified name, including path.

What is the quickest way to parse out the path and filename using the last "\" as the seperator?

CreganTur
04-01-2009, 11:03 AM
What is the quickest way to parse out the path and filename using the last "\" as the seperator?
Look into the InStr() and InStrRev functions; great resources in Access Help. You can use those along with other text Functions (Right(), Left(), etc) to manipulate and parse the filepath.

Let me know if you need more specific help.

nsaint
04-01-2009, 01:41 PM
I used the following;

Filename = Right(CompressorFile, (Len(CompressorFile) - InStrRev(CompressorFile, "\", , vbTextCompare)))
DataSource = Left(CompressorFile, (Len(CompressorFile) - Len(Filename)))


and it appears to have worked. Thanks for pointing me in the right direction. The String "CompressorFile" is derived from browsing to a file location as seen above.


Now off to the other work on this project.

Thanks for the help again!

CreganTur
04-02-2009, 05:16 AM
You're welcome!

If your issue is resolved, please remember to help us out by marking the thread as Solved- click Thread Tools-> Mark as Solved at the top of this thread :thumb

nsaint
04-02-2009, 01:17 PM
:banghead:
The code worked on sample data with sample Filed names etc. But when I used the actual data is where I ran into some issues. I think it may have to do with the filed separators. The fields are separated by a colon, but many of the fields have a period(.) in the name.

Here is a portion of the code that opens the ADO connection and gets the recordset. Worked fine for sample data, but actual data failed.


SQLStr1 = " SELECT samplingTimeSeconds, write-io.average-bytes, write-io.deviation-bytes " & _
" FROM [" & Filename & "]"

' Open connection using DataSource
Cn.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & DataSource & ";" & _
"Extended Properties=""text;HDR=YES;FMT=Delimited(,)"";"

' Open recordset and pass in results from SQLstr statement and dump results to worksheet location
rs.Open SQLStr1, Cn, adOpenStatic
With ActiveSheet.Range("B4")
.ClearContents
.CopyFromRecordset rs
End With
rs.Close

If I get just the first field only, it works, but when including the other fields it fails with "Method 'Open' of Object'_recordset' failed"

I can't change the source data so fields don't have a period in the middle of the name.....any thoughts?

CreganTur
04-02-2009, 01:23 PM
Whenever you have field names with illegal characters (spaces and symbols), you must wrap the field name with brackets. Adjust your SQL to this:

SQLStr1 = " SELECT samplingTimeSeconds, [write-io.average-bytes], [write-io.deviation-bytes] " & _
" FROM [" & Filename & "]"


To SQL, periods are an indication of Table.Field relationships. It expects the name to the left of the period to be a valid table name, and what is on the right to be a valid field name. That's why it was failing.

nsaint
04-02-2009, 01:45 PM
I tried what you have just given me and I received an error;
"Invalid bracketing of name [write-io.average-bytes]"

nsaint
04-02-2009, 01:53 PM
I also get errors when one of the fields are empty.

attached are the sample file and worksheet I am working from.

Remove the first line of the csv file. Unless I can specify the headers are on the second line.

CreganTur
04-03-2009, 05:13 AM
I tried what you have just given me and I received an error;
"Invalid bracketing of name [write-io.average-bytes]"

Okay, then ADO can't get past the fact that it thinks the period is defining a Table.Field Relationship.

nsaint
04-03-2009, 05:28 AM
That's too bad...this was so close to providing me with what I needed to do. Is there no other way to form the Select....and From statement to account for this?

If I have to start back at square one, any suggestions?