Microsoft Excel Webinar

Results 1 to 8 of 8

Thread: Urgent :How to FAST write large recordset in to a text file?

  1. #1

    Urgent :How to FAST write large recordset in to a text file?

    Hi Everybody


    I have a large record set (about 3.5 - 4 million records with 4 fields in each record). I want to write this to a text file. The resulting text file is approximately just under 200 mB file.

    I tried to write record by record by string concatenating the fields and then using the Print # statement to write to a text file. Whilst it doesn't take awfully long time, it does take a little while to do so. I was wondering if there is a faster way to do the same thing.

    The other option that I am looking at (it works but haven't time tested for the same data set yet) is to use the GetString() function with the appropriate arguments for comma delimited columns and vbCrLf delimited rows and then store the whole record set in to a single string and then use the Print# statement to write this string to a text file. This works but I have to time test it - I am looking at doing that tomorrow at work.

    I am wondering if "TextStream.Write (string)" would be faster where 'string' would be the result from the GetString() function. Or is this much of a muchness in that I wouldn't be gaining a lot ether way?
    Any suggestions on this would be highly valued.

    Thanks in advance.



    Best regards



    Deepak

  2. #2

  3. #3
    VB:
    Sub snb 
         ' reference to Microsoft ActiveX DataObject 2.0 Library
         
        With New Recordset 
            .Open "SELECT * FROM `Sheet1$`;", "Provider=Microsoft.Jet.OLEDB.4.0;Data Source='" & ThisWorkbook.FullName & "';Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1"";" 
            CreateObject("scripting.filesystemobject").CreateTextFile("G:\OF\snb.txt").Write .GetString(, , "|", vbCrlf) 
        End With 
    End Sub 
    
    
    Formatting tags added by mark007

  4. #4
    Hi Kenneth
    Sorry but the link that you suggested only works with data in an Excel spreadsheet. My data that I want to write to a text file is in a recordset and not in an Excel worksheet - it anyway won't fit in a worksheet, it contains around 4 million records


    Hi Snb
    That was exactly what I have tried and it works. I was wondering if 'Textstream.Write String' would be any faster.


    I am going to time test the 'TextStream.Write String' now and see if its any faster.

    The other option is to get is to load the recordset in an array using GetRows() function but I don't know of any function that will dump the contents of the entire array in to a text file without having to go field by field and record by record.


    Any further suggestions?



    Deepak

  5. #5
    SNB is very good at coming up with short code examples.

    Have you tried just saving it as a CSV file? You can copy the sheet to one workbook and save that one.

    ADO is far more advanced that just reading the data from Excel. It can read and write data from various ADO sources. See ConnectionStrings.com. You just have to figure out the details. I probably won't have time to dig much into it until this weekend.

    I would imagine that you would want to read the Excel data and then update a CSV file. To do this, you need one step to read the data and another to add it to another data set which in this case is the CSV file. It is not that much different than one MDB table to another table. The concepts are similar.


    Here is an example going from a CSV to Excel. Of course reading the Excel data can be an easy step.

    VB:
    Option Explicit 
     
    Sub pull_data() 
        Dim s1 As Worksheet, ML_Dir As String 
        ML_Dir = ThisWorkbook.Path 
         'Open_Sort_CSV ML_Dir, "ImportCSV.csv", ActiveSheet.Name, True, "Data Used"
        Open_Sort_CSV ML_Dir, "ImportCSV.csv", ActiveSheet.Name, False, "F1" 
    End Sub 
     
     'Add Tools > References... > Microsoft ActiveX Data Objects 2.8 Library
    Sub Open_Sort_CSV(CSV_Dir, CSV_name, Data_Sheet, Optional Header As Boolean = True, _ 
        Optional SortField As String = "", Optional SortASC As Boolean = True) 
         
        Dim connectionString As String 
         'Late binding:
         'Dim objConnection As Object, objRecordset As Object
         ' Early Binding:
        Dim objConnection As Connection, objrecordset As Recordset 
        Dim A As Integer 
        Dim Location As Range, Rw As Long, col As Integer, c As Integer, MyField As Variant 
         
         'set record set variables
        Const adOpenStatic = 3 
        Const adLockOptimistic = 3 
        Const adCmdText = 1 '&H1
         
         'set connection and recordset
        Set objConnection = CreateObject("ADODB.Connection") 
        Set objrecordset = CreateObject("ADODB.Recordset") 
         
         'oopen connection (headers,Delimited style,mixed data taken as text(not sure imex works))
        connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" & _ 
        "Data Source=" & CSV_Dir & ";" & _ 
        "Extended Properties=""text;HDR=" & Header & ";FMT=Delimited;IMEX=3""" 
        objConnection.Open connectionString 
         
         'get data from csv
        Select Case True 
        Case SortField = "" 
            objrecordset.Open "SELECT * FROM " & CSV_name, _ 
            objConnection, adOpenStatic, adLockOptimistic, adCmdText 
        Case SortField <> "" And SortASC 
            objrecordset.Open "SELECT * FROM " & CSV_name & " Order By `" & SortField & "` ASC", _ 
            objConnection, adOpenStatic, adLockOptimistic, adCmdText 
        Case SortField <> "" And SortASC = False 
            objrecordset.Open "SELECT * FROM " & CSV_name & " Order By `" & SortField & "` DESC", _ 
            objConnection, adOpenStatic, adLockOptimistic, adCmdText 
        End Select 
         
         'Loop across the fields
        If Header Then 
            With objrecordset 
                For A = 0 To .Fields.Count - 1 
                     ' Add field names to data sheet
                    If Right(.Fields(A).Name, 7) <> ".NoName" Then _ 
                    ThisWorkbook.Worksheets(Data_Sheet).Cells(1, 1).Offset(0, A).Value = .Fields(A).Name 
                Next A 
            End With 
             'copy data into worksheet under headers
            ThisWorkbook.Worksheets(Data_Sheet).Cells(2, 1).CopyFromRecordset objrecordset 
             'Write RecordSet to results area
            Set Location = ThisWorkbook.Worksheets(Data_Sheet).Range("A2") 
            Rw = Location.Row 
            col = Location.Column 
            c = col 
            With objrecordset 
                Do Until .EOF 
                    For Each MyField In .Fields 
                        Cells(Rw, c) = MyField 
                        c = c + 1 
                    Next MyField 
                    .MoveNext 
                    Rw = Rw + 1 
                    c = col 
                Loop 
            End With 
             
             
        Else 
             'copy data into worksheet
            ThisWorkbook.Worksheets(Data_Sheet).Cells(1, 1).CopyFromRecordset objrecordset 
        End If 
         
         'end connection and recordset
        Set objConnection = Nothing 
        Set objrecordset = Nothing 
         
    End Sub 
     
     
     'Pulls Data from CSV to Data sheet
    Sub Open_Sort_CSV_o(CSV_Dir, CSV_name, Data_Sheet, Optional Header As String = "No") 
         
        Dim connectionString As String, objConnection As Object, objrecordset As Object 
        Dim A As Integer 
        Dim Location As Range, Rw As Long, col As Integer, c As Integer, MyField As Variant 
         
         'set record set variables
        Const adOpenStatic = 3 
        Const adLockOptimistic = 3 
        Const adCmdText = &H1 
         
         'ser connection and recordset
        Set objConnection = CreateObject("ADODB.Connection") 
        Set objrecordset = CreateObject("ADODB.Recordset") 
         
         'oopen connection (headers,Delimited style,mixed data taken as text(not sure imex works))
         'connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
        "Data Source=" & CSV_Dir & ";" & _ 
        "Extended Properties=""text;HDR=" & Header & ";FMT=Delimited;IMEX=1""" 
        connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" & _ 
        "Data Source=" & CSV_Dir & ";" & _ 
        "Extended Properties=""text;HDR=" & Header & ";FMT=Delimited(,);""" 
        objConnection.Open connectionString 
         
         'get data from csv
        objrecordset.Open "SELECT * FROM " & CSV_name, _ 
        objConnection, adOpenStatic, adLockOptimistic, adCmdText 
         
         'Loop across the fields
        If Header = "Yes" Then 
            With objrecordset 
                For A = 0 To .Fields.Count - 1 
                     
                     ' Add field names to data sheet
                    ThisWorkbook.Worksheets(Data_Sheet).Cells(1, 1).Offset(0, A).Value = .Fields(A).Name 
                Next A 
            End With 
             
             'this errors for no good reason so stop any errors
             'On Error Resume Next
             
             'copy data into worksheet under headers
            ThisWorkbook.Worksheets(Data_Sheet).Cells(2, 1).CopyFromRecordset objrecordset 
             'Write RecordSet to results area
             
             
        Else 
             'copy data into worksheet
            ThisWorkbook.Worksheets(Data_Sheet).Cells(1, 1).CopyFromRecordset objrecordset 
        End If 
         
         'end connection and recordset
        Set objConnection = Nothing 
        Set objrecordset = Nothing 
         
    End Sub 
    
    
    Formatting tags added by mark007
    Here is an example for getrows.

    VB:
     'romperstomper, [url]http://www.excelforum.com/excel-programming/665066-send-results-of-sql-query-to-an-array.html[/url]
    Sub GetData() 
         ' Sample demonstrating how to return a recordset from a workbook
        Dim cn As ADODB.Connection, strQuery As String, rst As ADODB.RecordSet, strConn As String 
        Dim varData As Variant 
        Set cn = New ADODB.Connection 
         '   strConn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & ActiveWorkbook.FullName & ";" & _
         '         "Extended Properties=""Excel 8.0;HDR=Yes;"""
        With cn 
            .Provider = "Microsoft.Jet.OLEDB.4.0" 
            .ConnectionString = "Data Source=" & ActiveWorkbook.FullName & ";" & _ 
            "Extended Properties=""Excel 8.0;HDR=Yes;""" 
            .Open 
        End With 
        strQuery = "SELECT * FROM [Sheet1$];" 
        Set rst = New ADODB.RecordSet 
        rst.Open strQuery, strConn, adOpenStatic, adLockReadOnly, adCmdText 
         ' dump array of data into variable
        varData = rst.GetRows 
        rst.Close 
        Set rst = Nothing 
         '   cn.Close
        Set cn = Nothing 
    End Sub 
    
    
    Formatting tags added by mark007

  6. #6
    Hi Kenneth


    Have we got our question mixed up? My understanding is that .copyfromrecordset copies recordset content to an Excel worksheet. This is not something that I want. I want the contents of the recordset to be dumped to a text file (.txt file).

    Also, I had a look at the ConnectionStrings.com. I take it that I may need to use the Standard Microsoft Text ODBC drive like so :-

    Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=c:\txtFilesFolder\;Extensions=asc,csv,tab,txt;

    Here, is the Dbq the path to the output .txt file? Where do I specify the field/row delimiters?




    Deepak

  7. #7
    The field row delimiters are by your options. I would have to look into how those can be changed via VBA.

    The CopyFromRecordSet is for a Range object. I showed how to use the GetRows.

    Here is the saveas method.

    VB:
    Sub SaveAsCSV() 
        Dim csv As String, wb1Name As String, wb1 As Workbook, wba As Workbook 
         
        Set wba = ThisWorkbook 
        csv = wba.Path & "\csv.csv" 
        wb1Name = wba.Path & "\" & Environ("username") & ".xlsm" 
         
        If Dir(csv) <> "" Then Kill csv 
        If Dir(wb1Name) <> "" Then Kill wb1Name 
         
        wba.Save 
        wba.SaveCopyAs wb1Name 
        Set wb1 = Workbooks.Open(wb1Name) 
        wb1.Worksheets("Sheet1").SaveAs csv, xlCSVWindows 
        ActiveWorkbook.Close True 
        Kill wb1Name 
    End Sub 
    
    
    Formatting tags added by mark007

  8. #8
    @Kenneth

    alternative ?

    VB:
    Sub SaveAsCSV() 
        [LEFT]  ThisWorkbook.sheets("sheet1").copy 
        With Activeworkbook 
            .SaveAs thisworkbook.path & "\" & Environ("username") & ".csv", xlCSVWindows 
            .close False 
        End With 
    End Sub[/LEFT] 
    
    
    Formatting tags added by mark007

    This forum is messing up the formatting of the code !!
    Last edited by Aussiebear; 04-17-2012 at 04:01 PM. Reason: Corrected the spacing issue

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •