Excel Hints

Results 1 to 8 of 8

Thread: Urgent :How to FAST write large recordset in to a text file?

  1. #1

    Urgent :How to FAST write large recordset in to a text file?

    Hi Everybody


    I have a large record set (about 3.5 - 4 million records with 4 fields in each record). I want to write this to a text file. The resulting text file is approximately just under 200 mB file.

    I tried to write record by record by string concatenating the fields and then using the Print # statement to write to a text file. Whilst it doesn't take awfully long time, it does take a little while to do so. I was wondering if there is a faster way to do the same thing.

    The other option that I am looking at (it works but haven't time tested for the same data set yet) is to use the GetString() function with the appropriate arguments for comma delimited columns and vbCrLf delimited rows and then store the whole record set in to a single string and then use the Print# statement to write this string to a text file. This works but I have to time test it - I am looking at doing that tomorrow at work.

    I am wondering if "TextStream.Write (string)" would be faster where 'string' would be the result from the GetString() function. Or is this much of a muchness in that I wouldn't be gaining a lot ether way?
    Any suggestions on this would be highly valued.

    Thanks in advance.



    Best regards



    Deepak

  2. #2

  3. #3
    [VBA]sub snb
    ' reference to Microsoft ActiveX DataObject 2.0 Library

    With New Recordset
    .Open "SELECT * FROM `Sheet1$`;", "Provider=Microsoft.Jet.OLEDB.4.0;Data Source='" & ThisWorkbook.FullName & "';Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1"";"
    CreateObject("scripting.filesystemobject").CreateTextFile("G:\OF\snb.txt"). Write .GetString(, , "|", vbCrlf)
    End With
    end sub[/VBA]

  4. #4
    Hi Kenneth
    Sorry but the link that you suggested only works with data in an Excel spreadsheet. My data that I want to write to a text file is in a recordset and not in an Excel worksheet - it anyway won't fit in a worksheet, it contains around 4 million records


    Hi Snb
    That was exactly what I have tried and it works. I was wondering if 'Textstream.Write String' would be any faster.


    I am going to time test the 'TextStream.Write String' now and see if its any faster.

    The other option is to get is to load the recordset in an array using GetRows() function but I don't know of any function that will dump the contents of the entire array in to a text file without having to go field by field and record by record.


    Any further suggestions?



    Deepak

  5. #5
    SNB is very good at coming up with short code examples.

    Have you tried just saving it as a CSV file? You can copy the sheet to one workbook and save that one.

    ADO is far more advanced that just reading the data from Excel. It can read and write data from various ADO sources. See ConnectionStrings.com. You just have to figure out the details. I probably won't have time to dig much into it until this weekend.

    I would imagine that you would want to read the Excel data and then update a CSV file. To do this, you need one step to read the data and another to add it to another data set which in this case is the CSV file. It is not that much different than one MDB table to another table. The concepts are similar.


    Here is an example going from a CSV to Excel. Of course reading the Excel data can be an easy step.

    [vba]Option Explicit

    Sub pull_data()
    Dim s1 As Worksheet, ML_Dir As String
    ML_Dir = ThisWorkbook.Path
    'Open_Sort_CSV ML_Dir, "ImportCSV.csv", ActiveSheet.Name, True, "Data Used"
    Open_Sort_CSV ML_Dir, "ImportCSV.csv", ActiveSheet.Name, False, "F1"
    End Sub

    'Add Tools > References... > Microsoft ActiveX Data Objects 2.8 Library
    Sub Open_Sort_CSV(CSV_Dir, CSV_name, Data_Sheet, Optional Header As Boolean = True, _
    Optional SortField As String = "", Optional SortASC As Boolean = True)

    Dim connectionString As String
    'Late binding:
    'Dim objConnection As Object, objRecordset As Object
    ' Early Binding:
    Dim objConnection As Connection, objrecordset As Recordset
    Dim A As Integer
    Dim Location As Range, Rw As Long, col As Integer, c As Integer, MyField As Variant

    'set record set variables
    Const adOpenStatic = 3
    Const adLockOptimistic = 3
    Const adCmdText = 1 '&H1

    'set connection and recordset
    Set objConnection = CreateObject("ADODB.Connection")
    Set objrecordset = CreateObject("ADODB.Recordset")

    'oopen connection (headers,Delimited style,mixed data taken as text(not sure imex works))
    connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
    "Data Source=" & CSV_Dir & ";" & _
    "Extended Properties=""text;HDR=" & Header & ";FMT=Delimited;IMEX=3"""
    objConnection.Open connectionString

    'get data from csv
    Select Case True
    Case SortField = ""
    objrecordset.Open "SELECT * FROM " & CSV_name, _
    objConnection, adOpenStatic, adLockOptimistic, adCmdText
    Case SortField <> "" And SortASC
    objrecordset.Open "SELECT * FROM " & CSV_name & " Order By `" & SortField & "` ASC", _
    objConnection, adOpenStatic, adLockOptimistic, adCmdText
    Case SortField <> "" And SortASC = False
    objrecordset.Open "SELECT * FROM " & CSV_name & " Order By `" & SortField & "` DESC", _
    objConnection, adOpenStatic, adLockOptimistic, adCmdText
    End Select

    'Loop across the fields
    If Header Then
    With objrecordset
    For A = 0 To .Fields.Count - 1
    ' Add field names to data sheet
    If Right(.Fields(A).Name, 7) <> ".NoName" Then _
    ThisWorkbook.Worksheets(Data_Sheet).Cells(1, 1).Offset(0, A).Value = .Fields(A).Name
    Next A
    End With
    'copy data into worksheet under headers
    ThisWorkbook.Worksheets(Data_Sheet).Cells(2, 1).CopyFromRecordset objrecordset
    'Write RecordSet to results area
    Set Location = ThisWorkbook.Worksheets(Data_Sheet).Range("A2")
    Rw = Location.Row
    col = Location.Column
    c = col
    With objrecordset
    Do Until .EOF
    For Each MyField In .Fields
    Cells(Rw, c) = MyField
    c = c + 1
    Next MyField
    .MoveNext
    Rw = Rw + 1
    c = col
    Loop
    End With


    Else
    'copy data into worksheet
    ThisWorkbook.Worksheets(Data_Sheet).Cells(1, 1).CopyFromRecordset objrecordset
    End If

    'end connection and recordset
    Set objConnection = Nothing
    Set objrecordset = Nothing

    End Sub


    'Pulls Data from CSV to Data sheet
    Sub Open_Sort_CSV_o(CSV_Dir, CSV_name, Data_Sheet, Optional Header As String = "No")

    Dim connectionString As String, objConnection As Object, objrecordset As Object
    Dim A As Integer
    Dim Location As Range, Rw As Long, col As Integer, c As Integer, MyField As Variant

    'set record set variables
    Const adOpenStatic = 3
    Const adLockOptimistic = 3
    Const adCmdText = &H1

    'ser connection and recordset
    Set objConnection = CreateObject("ADODB.Connection")
    Set objrecordset = CreateObject("ADODB.Recordset")

    'oopen connection (headers,Delimited style,mixed data taken as text(not sure imex works))
    'connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
    "Data Source=" & CSV_Dir & ";" & _
    "Extended Properties=""text;HDR=" & Header & ";FMT=Delimited;IMEX=1"""
    connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
    "Data Source=" & CSV_Dir & ";" & _
    "Extended Properties=""text;HDR=" & Header & ";FMT=Delimited(,);"""
    objConnection.Open connectionString

    'get data from csv
    objrecordset.Open "SELECT * FROM " & CSV_name, _
    objConnection, adOpenStatic, adLockOptimistic, adCmdText

    'Loop across the fields
    If Header = "Yes" Then
    With objrecordset
    For A = 0 To .Fields.Count - 1

    ' Add field names to data sheet
    ThisWorkbook.Worksheets(Data_Sheet).Cells(1, 1).Offset(0, A).Value = .Fields(A).Name
    Next A
    End With

    'this errors for no good reason so stop any errors
    'On Error Resume Next

    'copy data into worksheet under headers
    ThisWorkbook.Worksheets(Data_Sheet).Cells(2, 1).CopyFromRecordset objrecordset
    'Write RecordSet to results area


    Else
    'copy data into worksheet
    ThisWorkbook.Worksheets(Data_Sheet).Cells(1, 1).CopyFromRecordset objrecordset
    End If

    'end connection and recordset
    Set objConnection = Nothing
    Set objrecordset = Nothing

    End Sub


    [/vba]

    Here is an example for getrows.

    [VBA]'romperstomper, http://www.excelforum.com/excel-prog...-an-array.html
    Sub GetData()
    ' Sample demonstrating how to return a recordset from a workbook
    Dim cn As ADODB.Connection, strQuery As String, rst As ADODB.RecordSet, strConn As String
    Dim varData As Variant
    Set cn = New ADODB.Connection
    ' strConn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & ActiveWorkbook.FullName & ";" & _
    ' "Extended Properties=""Excel 8.0;HDR=Yes;"""
    With cn
    .Provider = "Microsoft.Jet.OLEDB.4.0"
    .ConnectionString = "Data Source=" & ActiveWorkbook.FullName & ";" & _
    "Extended Properties=""Excel 8.0;HDR=Yes;"""
    .Open
    End With
    strQuery = "SELECT * FROM [Sheet1$];"
    Set rst = New ADODB.RecordSet
    rst.Open strQuery, strConn, adOpenStatic, adLockReadOnly, adCmdText
    ' dump array of data into variable
    varData = rst.GetRows
    rst.Close
    Set rst = Nothing
    ' cn.Close
    Set cn = Nothing
    End Sub[/VBA]

  6. #6
    Hi Kenneth


    Have we got our question mixed up? My understanding is that .copyfromrecordset copies recordset content to an Excel worksheet. This is not something that I want. I want the contents of the recordset to be dumped to a text file (.txt file).

    Also, I had a look at the ConnectionStrings.com. I take it that I may need to use the Standard Microsoft Text ODBC drive like so :-

    Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=c:\txtFilesFolder\;Extensions=asc,csv,tab,txt;

    Here, is the Dbq the path to the output .txt file? Where do I specify the field/row delimiters?




    Deepak

  7. #7
    The field row delimiters are by your options. I would have to look into how those can be changed via VBA.

    The CopyFromRecordSet is for a Range object. I showed how to use the GetRows.

    Here is the saveas method.

    [VBA]
    Sub SaveAsCSV()
    Dim csv As String, wb1Name As String, wb1 As Workbook, wba As Workbook

    Set wba = ThisWorkbook
    csv = wba.Path & "\csv.csv"
    wb1Name = wba.Path & "\" & Environ("username") & ".xlsm"

    If Dir(csv) <> "" Then Kill csv
    If Dir(wb1Name) <> "" Then Kill wb1Name

    wba.Save
    wba.SaveCopyAs wb1Name
    Set wb1 = Workbooks.Open(wb1Name)
    wb1.Worksheets("Sheet1").SaveAs csv, xlCSVWindows
    ActiveWorkbook.Close True
    Kill wb1Name
    End Sub[/VBA]

  8. #8
    @Kenneth

    alternative ?

    [vba]Sub SaveAsCSV()
    ThisWorkbook.sheets("sheet1").copy
    With Activeworkbook
    .SaveAs thisworkbook.path & "\" & Environ("username") & ".csv", xlCSVWindows
    .close false
    end with
    End Sub
    [/vba]

    This forum is messing up the formatting of the code !!
    Last edited by Aussiebear; 04-17-2012 at 04:01 PM. Reason: Corrected the spacing issue

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •