PDA

View Full Version : Reading a csv file using vba excel!



agarwaldvk
05-14-2007, 02:54 PM
Hi Everybody

I am trying to read a csv file using VBA Excel. No problem. I know how to do that.

The problem lies here :-
In the text file that I am trying to read, there are some text fields. One of those fields have free flowing text containing new line characters. In other words, the content of this text field sometimes is in more than one paragraph. So as it reads a record in this file which has the content of this file in more than one paragraph, when it comes to the new line character, it deems it to be the end of line and treats the contents beyond that new line character as the beginning of a new record.

Is there a way to get around this problem.


Best regards


Deepak

malik641
05-14-2007, 06:59 PM
Hi Deepak :)

Could you provide a small sample of code to show us the method you read the csv file?

Thanks

agarwaldvk
05-14-2007, 08:17 PM
Wernher von Braun

I shall do that once I get home today


Deepak

mvidas
05-15-2007, 07:48 AM
Deepak,

What character does your CSV file use to designate a new record? Does it use a Carriage Return & Line Feed (ascii 13 & ascii 10) for a new record, and then just Line Feed for the break in the text field? As long as they are not the same delimiter you can do what you're looking for (if they are, theres no way to do it as theres no way for the code to know what it should be).

You can read the entire file as one string, split it by the CRLF to get each record, then split each record by Comma to get each field.

For example, take a look at the following attachment (deepak.csv, or probably will end up being deepak.zip with deepak.csv inside), see if that is what you're referring to. I made a 2-record file from your original question text, with line feed characters throughout. Unzip it to your C:\ (or change the location within the following code). First try opening it directly in excel, or with your current code. Note how the data goes from A1:C8 (it should be A1:C2). Then run the following:

Sub DeepakCSV()
Dim FileContents As String, vFF As Long, vFile As String
Dim vRecords() As String, vTempArr() As Variant, vFileRecords() As String
Dim RecordCount As Long, FieldCount As Long, i As Long, j As Long

'pull entire text file into FileContents variable
vFile = "C:\deepak.csv"
vFF = FreeFile
Open vFile For Binary Access Read As #vFF
FileContents = Space$(LOF(vFF))
Get #vFF, , FileContents
Close #vFF

'split by record-delimiter
vRecords = Split(FileContents, vbCrLf)

'Create overall array to store entire text file's fields and records
' Note that this step is not necessary, just speeds up overall processing time

'Split each record by field delimiter
RecordCount = UBound(vRecords)
ReDim vTempArr(RecordCount)
For i = 0 To RecordCount
vTempArr(i) = Split(vRecords(i), ",")
Next

'determine max number of fields
FieldCount = 0
For i = 0 To RecordCount
If UBound(vTempArr(i)) > FieldCount Then FieldCount = UBound(vTempArr(i))
Next

'create 2 dimensional array to store all data
ReDim vFileRecords(RecordCount, FieldCount)
For i = 0 To RecordCount
For j = 0 To UBound(vTempArr(i))
vFileRecords(i, j) = vTempArr(i)(j)
Next
Next

'Transfer array data onto spreadsheet
Workbooks.Add
With Range("A1").Resize(RecordCount + 1, FieldCount + 1)
.NumberFormat = "@" 'make text format
.Value = vFileRecords
End With
End SubMatt

agarwaldvk
05-16-2007, 05:00 PM
Dear mvidas

I tried your code on the file that you attached. It works fine (absolute beauty)!

Ordinarily when you create a .txt file using Notepad, how do you create just a linefeed without a carriage return? Because in my understanding, whenever, you press the enter key, you are effectively creating a "CrLf" entry, are you not? So, how do you create a "lf" alone in the file that you sent me?

I have attached a file here - a very small file - as I was given. The only modification that I have made to it is that I have replaced some text characters only (due to privacy reasons) with meaningless characters. The format has been maintained as is.

Even with your code, it still terminates the entry (reproduced below as seen when opened in Notepad) at the end of the characters "l;d." when I can't even see a vbCrLf character at the end of these characters. What is happending here?


"sec ok.... alf l as jlasj asl asl sdfal as l;d.
have se copy in batch."" Can you please lo this"

But when I open this open is Excel, I do see a sort of square box (probably representing a vbCrLf character at the end of these characters. Why do I not see this box in Notepad that I see in Excel. And how can I convert this vbCrLf entry to vbLf only entry?

I tried to get the ASCII values of all the individual characters between "d" and "h" inclusive and I got values of 100, 46, 32, 13, 10, 104 respectively indicating that there is a linefeed character followed by a carriage return character.


I hope I have not confused the issue too much.


Best regards


Deepak

mvidas
05-16-2007, 06:21 PM
I see the square you're referring to, when opened in excel directly.

The reason you dont see it in the text file is because it is a normal newline (char13 & char10, or vbCrLf), which as you said is the regular return character.

The reason you do see it in excel is because a 'hard return' in a cell (to force text onto the next line, what happens when you press alt-end while in a cell) is simply a line feed (char10, or vbLf). The reason you see the block at the end of a line is because it is vbcrlf and not just lf (you're seeing the carriage return). Excel doesnt start a new line with this because it is between quotes, so excel is treating it as normal text. this is good though, since it is still part of the same field.

You can't just replace all vbcrlf's with vblf's in the raw text file, since you wont ever have multiple records.

The good news is, you won't need to use vba for this. You could manually delete the leftover square after turning Wrap Text on within that cell. You could also remove it using the substitute function on individual cells after opening, like:
=SUBSTITUTE(A1,CHAR(13),"")

Or, to use VBA and 'clean' the whole sheet, simply run the line:Cells.Replace Chr(13), ""
Your choice :)
Matt