PDA

View Full Version : Solved: Reading data from a .CSV file



Duncs
05-09-2008, 07:23 AM
Many thanks to all for providing this resource.

I have an Access database that loads several .CSV files however, I have a couple of issues with the loading of the files.

1. The file has one header record, numerous detail records and then a trailer record. Within this trailer record, there is a count of the number of detail records that are in the file. What I want to know is, is there a way to retrieve the trailer record--without reading through all records first--in order to obtain the count of records? I would like to include some kind of progress indicator within the load routine, but without knowing how many records will be imported, I have nothing to set the maximum value against. I don't want to read through all records, as there could be upwards of 60, 000 records in the file, and this just seems silly.

2. Within the files, there is a text field which can sometimes contain a CRLF combination. I'm using "Line Input #lFile, strLine" to read in the data, and the CRLF seems to be treated as an EOL and there is no more data being read form the line. This also causes problems with the remaining lines that are read in. So, is there a way to examine the line of text that is in the file, and identify these characters, whilst treating the real CRLF at the end of the line, as normal!

Hope you can help with these.

TIA

Duncs

Oorang
05-09-2008, 10:43 PM
Hi Duncs,
Welcome to the board :) Without seeing your code I am just taking a stab here, but it sounds like you using the Get methods to parse the csv file in conjunction with dao or ado to the import. This makes sense in some cases but you might want to look at the TransferText method to see if it is for you. However to speak to your original question of line count you can use the code below. It should also resolve your linebreak issue:

Option Explicit

Public Sub GetLines()
Dim lngFileNum As Long
Dim lngFileLen As Long
Dim strFilePath As String
Dim strFileText As String
Dim strLines() As String
'Set Path:
strFilePath = "c:\test\test.csv"
'Get size of file in bytes:
lngFileLen = FileLen(strFilePath)
'Get free file number.
lngFileNum = FreeFile
'Open file:
Open strFilePath For Binary Access Read Lock Write As #lngFileNum
'Set buffer to receive text.
strFileText = Space$(lngFileLen)
'Read in whole file:
Get #lngFileNum, , strFileText
'Release file:
Close #lngFileNum
'Count line breaks:
MsgBox CountInStr(strFileText, vbNewLine, vbBinaryCompare)
'Load lines into array for parsing:
strLines = Split(strFileText, vbNewLine, compare:=vbBinaryCompare)
'Do something with array:
MsgBox strLines(2)
End Sub
Private Function CountInStr(text As String, substring As String, Optional ByVal _
compare As VbCompareMethod = VbCompareMethod.vbBinaryCompare) As Long
CountInStr = (LenB(text) - LenB(Replace(text, substring, vbNullString, _
compare:=compare))) / LenB(substring)
End Function

Duncs
05-12-2008, 04:44 AM
Many thanks for your reply Aaron. The code you provided works a treat, and it actually allows me to re-do some sections of processing, which makes things better.

Once again, many thanks

Duncs

Duncs
05-13-2008, 06:50 AM
Aaron,

I thought I had it solved, but the problem is still there.

As an example, I have the following line in my .CSV file:

"D","4664225","0001105042709","S66X49777","2","1100015750060","20080321004810","1","JEAN HIGGINS 13-MAR-2008
","964","14"

After the date, there appears to be a hard carriage return, which is causing the problem. When the data is imported into the database, it chops off the last two field entries. This then causes a problem with the database insertion routine. For your reference, I have included the code I'm using below.

So, what I need to do is, read the text in and if there is a hard carriage return in the line, and the next line doesn't start with a "D", the next line should be appended to line that has been read. Does that make sense?

Any help you can provide will be appreciated.

Code below

TIA

Duncs

Private Function ImportFile(strFileName As String) As Long
Dim rs As Recordset, lRecordCount As Long
Dim lFile As Long
Dim strReportDate As String
Dim varValues As Variant
Dim strStatus As String
Dim booAdded As Boolean
Dim i As Byte
Dim intStartPos, intCount As Byte
Dim lngFileNum As Long
Dim lngFileLen As Long
Dim strFileText As String
Dim strLines() As String
Dim lngNumLines As Long
'Get size of file in bytes:
lngFileLen = FileLen(strFileName)

'Get free file number.
lngFileNum = FreeFile

'Open file:
Open strFileName For Binary Access Read Lock Write As #lngFileNum

'Set buffer to receive text.
strFileText = Space$(lngFileLen)

'Read in whole file:
Get #lngFileNum, , strFileText

'Release file:
Close #lngFileNum

'Count line breaks:
lngNumLines = CountInStr(strFileText, vbNewLine, vbBinaryCompare)

'Load lines into array for parsing:
strLines = Split(strFileText, vbNewLine, compare:=vbBinaryCompare)

'Do something with array:
Set rs = db.OpenRecordset("tblActarisDebtReport")

intStartPos = InStr(1, strFileName, "SA_", vbTextCompare) + 3
intCount = InStr(1, strFileName, "_01.CSV", vbTextCompare) - intStartPos

strStatus = Mid(strFileName, intStartPos, intCount)

SysCmd acSysCmdInitMeter, "Progress", lngNumLines

With rs
For lRecordCount = 0 To lngNumLines
booAdded = False

If Len(strLines(lRecordCount)) > 0 Then
strLines(lRecordCount) = Replace(strLines(lRecordCount), """", "")
'strLines(lRecordCount) = Replace(strLines(lRecordCount), vbCrLf, "")

varValues = Split(strLines(lRecordCount), ",")
Select Case Left(strLines(lRecordCount), 1)
Case "H"
strReportDate = varValues(1)
Case "T"
lngRecordsInFile = varValues(2)
Case "D"
.AddNew
Select Case strStatus
Case "Cancelled"
!fldSAID = varValues(1)
!fldCRN = varValues(2)
!fldMSN = varValues(3)
!fldMeterType = varValues(4)
!fldMPAN = varValues(5)
!fldStatus = strStatus
!fldStatusDate = MakeDate(varValues(6))
!fldClosureReason = varValues(7)
!fldClosureNotes = varValues(8)
!fldSATotalDebt = MakeCur(varValues(9))
!fldSADebtRecoveryRate = MakeCur(varValues(10))
booAdded = True
Case "Created"
!fldSAID = varValues(1)
!fldCRN = varValues(2)
!fldMSN = varValues(3)
!fldMeterType = varValues(4)
!fldMPAN = varValues(5)
!fldStatus = strStatus
!fldStatusDate = MakeDate(varValues(6))
!fldSATotalDebt = MakeCur(varValues(7))
!fldSADebtRecoveryRate = MakeCur(varValues(8))
!fldMeterLearningDate = MakeDate(varValues(9))
!fldQueuedSAID = varValues(10)
!fldLastVendDate = MakeDate(varValues(11))
!fldLastVendSite = varValues(12)
!fldLastVendDevice = varValues(13)
!fldLastVendNSP = varValues(14)
!fldMeterSettingsDate = MakeDate(varValues(15))
!fldMeterTotalDebt = MakeCur(varValues(16))
!fldMeterDebtRecoveryRate = MakeCur(varValues(17))
booAdded = True
Case "Distributed"
!fldSAID = varValues(1)
!fldCRN = varValues(2)
!fldMSN = varValues(3)
!fldMeterType = varValues(4)
!fldMPAN = varValues(5)
!fldStatus = strStatus
!fldStatusDate = MakeDate(varValues(6))
!fldDistributionLevel = varValues(7)
!fldSATotalDebt = MakeCur(varValues(8))
!fldSADebtRecoveryRate = MakeCur(varValues(9))
!fldLastVendDate = MakeDate(varValues(10))
!fldLastVendSite = varValues(11)
!fldLastVendDevice = varValues(12)
!fldLastVendNSP = varValues(13)
!fldMeterSettingsDate = MakeDate(varValues(14))
!fldMeterTotalDebt = MakeCur(varValues(15))
!fldMeterDebtRecoveryRate = MakeCur(varValues(16))
booAdded = True
Case "Failed"
!fldSAID = varValues(1)
!fldCRN = varValues(2)
!fldMSN = varValues(3)
!fldMeterType = varValues(4)
!fldMPAN = varValues(5)
!fldStatus = strStatus
!fldStatusDate = MakeDate(varValues(6))
!fldSATotalDebt = MakeCur(varValues(7))
!fldSADebtRecoveryRate = MakeCur(varValues(8))
!fldLastVendDate = MakeDate(varValues(9))
!fldLastVendSite = varValues(10)
!fldLastVendDevice = varValues(11)
!fldLastVendNSP = varValues(12)
!fldMeterSettingsDate = MakeDate(varValues(13))
!fldMeterTotalDebt = MakeCur(varValues(14))
!fldMeterDebtRecoveryRate = MakeCur(varValues(15))
booAdded = True
Case "On_Meter"
!fldSAID = varValues(1)
!fldCRN = varValues(2)
!fldMSN = varValues(3)
!fldMeterType = varValues(4)
!fldMPAN = varValues(5)
!fldStatus = strStatus
!fldStatusDate = MakeDate(varValues(6))
!fldForcedCompletion = varValues(7)
!fldSATotalDebt = MakeCur(varValues(8))
!fldSADebtRecoveryRate = MakeCur(varValues(9))
!fldLastVendDate = MakeDate(varValues(10))
!fldLastVendSite = varValues(11)
!fldLastVendDevice = varValues(12)
!fldLastVendNSP = varValues(13)
!fldMeterSettingsDate = MakeDate(varValues(14))
!fldMeterTotalDebt = MakeCur(varValues(15))
!fldMeterDebtRecoveryRate = MakeCur(varValues(16))
booAdded = True
Case "On_Token"
!fldSAID = varValues(1)
!fldCRN = varValues(2)
!fldMSN = varValues(3)
!fldMeterType = varValues(4)
!fldMPAN = varValues(5)
!fldStatus = strStatus
!fldStatusDate = MakeDate(varValues(6))
!fldSATotalDebt = MakeCur(varValues(7))
!fldSADebtRecoveryRate = MakeCur(varValues(8))
!fldLastVendDate = MakeDate(varValues(9))
!fldLastVendSite = varValues(10)
!fldLastVendDevice = varValues(11)
!fldLastVendNSP = varValues(12)
!fldMeterSettingsDate = MakeDate(varValues(13))
!fldMeterTotalDebt = MakeCur(varValues(14))
!fldMeterDebtRecoveryRate = MakeCur(varValues(15))
booAdded = True
Case "Creation", "Update"
!fldCRN = varValues(1)
!fldMSN = varValues(2)
!fldMPAN = varValues(3)
!fldEffectiveDate = MakeDate(varValues(4))
!fldCreationDate = MakeDate(varValues(5))
!fldEarliestMeterSettingsDate = MakeDate(varValues(6))
!fldEarliestTotalDebt = MakeCur(varValues(7))
!fldEarliestDRR = MakeCur(varValues(8))
!fldEarliestTotalCreditAccepted = MakeCur(varValues(9))
!fldEarliestCreditBalance = MakeCur(varValues(10))
!fldLastVendDate = MakeDate(varValues(11))
!fldLastVendSite = varValues(12)
!fldLastVendDevice = varValues(13)
!fldLastVendNSP = varValues(14)
!fldMeterSettingsDate = MakeDate(varValues(15))
!fldMeterTotalDebt = MakeCur(varValues(16))
!fldMeterDebtRecoveryRate = MakeCur(varValues(17))
!fldStatus = "NonDebtSA " & strStatus
booAdded = True
Case Else
Log "Invalid status for file " & strFileName
End Select

If booAdded Then
!fldReportDate = MakeDate(strReportDate)
.Update

SysCmd acSysCmdUpdateMeter, , lRecordCount
Else
Log "Problem recording record number " & lRecordCount & " from file " & strFileName
End If
Case "I"
' Do nothing
Case Else
If lRecordCount <> 0 Then
Log "Invalid row data identifier in file " & strFileName
End If
End Select
End If
Next lRecordCount
End With

rs.Close
Set rs = Nothing
SysCmd acSysCmdClearStatus
ImportFile = lngNumLines 'return number of records inserted
End Function

Duncs
05-15-2008, 02:07 PM
I hope I haven't broken any rules by posting such a large section of code. Aaron, I'd appreciate any input you can provide with my query.

Oorang
05-15-2008, 04:41 PM
Nope you're fine:)

This will return a csv file ina two dimensional array free of linebreaks:
Public Function GetLines(ByVal strFilePath As String, _
Optional ByVal blnRemoveLineBreaks As Boolean = False) As String()
Dim lngFileNum As Long
Dim lngFileLen As Long
Dim lngIndx As Long
Dim strFileText As String
Dim strLines() As String

'Get size of file in bytes:
lngFileLen = FileLen(strFilePath)
'Get free file number.
lngFileNum = FreeFile
'Open file:
Open strFilePath For Binary Access Read Lock Write As #lngFileNum
'Set buffer to receive text.
strFileText = Space$(lngFileLen)
'Read in whole file:
Get #lngFileNum, , strFileText
'Release file:
Close #lngFileNum
'Load lines into array for parsing:
strLines = Split(strFileText, vbNewLine, compare:=vbBinaryCompare)
'Remove Line Breaks
If blnRemoveLineBreaks Then
For lngIndx = LBound(strLines) To UBound(strLines)
Do Until Not CBool(InStrB(strLines(lngIndx), vbLf))
strLines(lngIndx) = Replace(strLines(lngIndx), vbLf, vbNullString)
Loop
Do Until Not CBool(InStrB(strLines(lngIndx), vbCr))
strLines(lngIndx) = Replace(strLines(lngIndx), vbCr, vbNullString)
Loop
Next
End If
GetLines = strLines
End Function

mickeydee
06-27-2008, 10:10 AM
Hi, I've attempted to use this code in an Access database to amend a csv file which I can then go on to use the transfertext method with. The csv file has embedded carriage returns which cause all sorts of problems like the previous poster encountered.

However, I cannot get the code to work properly, it errors at getlines = strlines stating type mismatch. I also don't understand how to use the data out of the csv file & this function in the transfertext method using this function.

Here is my code at the moment...

Function GetLines()
Dim lngFileNum As Long
Dim lngFileLen As Long
Dim lngIndx As Long
Dim strFileText As String
Dim strLines() As String
Dim strFilePath As String
Dim blnRemoveLineBreaks As Boolean

strFilePath = "M:\Tour Ops Finance\Controlling\Sports Division\Database\Data Files\test2.csv"
blnRemoveLineBreaks = True

'Get size of file in bytes:
lngFileLen = FileLen(strFilePath)
'Get free file number.
lngFileNum = FreeFile
'Open file:
Open strFilePath For Binary Access Read Lock Write As #lngFileNum
'Set buffer to receive text.
strFileText = Space$(lngFileLen)
'Read in whole file:
Get #lngFileNum, , strFileText
'Release file:
Close #lngFileNum
'Load lines into array for parsing:
strLines = Split(strFileText, vbNewLine, compare:=vbBinaryCompare)
'Remove Line Breaks
If blnRemoveLineBreaks Then
For lngIndx = LBound(strLines) To UBound(strLines)
Do Until Not CBool(InStrB(strLines(lngIndx), vbLf))
strLines(lngIndx) = Replace(strLines(lngIndx), vbLf, vbNullString)
Loop
Do Until Not CBool(InStrB(strLines(lngIndx), vbCr))
strLines(lngIndx) = Replace(strLines(lngIndx), vbCr, vbNullString)
Loop
Next
End If
GetLines = strLines
End Function

Transfertext code...

Function import()
DoCmd.SetWarnings (WarningsOff)

DoCmd.DeleteObject acTable, "Import_Bookings"
DoCmd.TransferText acImportDelim, "BookingsImport", "Import_Bookings", "M:\Tour Ops Finance\Controlling\Sports Division\Database\Data Files\test2.csv", -1

End Function

Appreciate any help

Thanks