Consulting

Page 1 of 6 1 2 3 ... LastLast
Results 1 to 20 of 119

Thread: Process All CSV Files In SubFolders

  1. #1
    VBAX Regular
    Joined
    May 2015
    Posts
    33
    Location

    Process All CSV Files In SubFolders

    Hi,

    I have a dataset consisting of more than 50k .csv files, all files are in the same format now and I need to perform some procedures on all of these files (text to columns, basic calculation - summary of certain columns, etc.). I googled some codes, but none worked for me, could you pls help me? I have Office 2013.

    Thanks in advance.

    M

  2. #2
    Moderator VBAX Sage SamT's Avatar
    Joined
    Oct 2006
    Location
    Near Columbia
    Posts
    7,814
    Location
    I have a dataset consisting of more than 50k .csv files, all files are in the same format now and I need to perform some procedures on all of these files (text to columns, basic calculation - summary of certain columns, etc.). I googled some codes, but none worked for me, could you pls help me?
    Maxxino,

    Let me be the first to welcome you to VBA Express, IMO, the best site for help on all Microsoft Office Questions.

    Unfortunately, you did not provide enough information for us to get started. Can youGive a 5 line sample of real data from one of the CSV files?
    Then add another sheet to the (xlsm) workbook to show what you need as a summary.

    Be sure to use the actual data in the five rows on the CSV sheet. Make the example complete, that is use all the formulas needed, although we might be using VBA code to accomplish the same things.

    With only 5 rows used, there will be plenty of room for notes so we understand exactly what is needed.
    I expect the student to do their homework and find all the errrors I leeve in.


    Please take the time to read the Forum FAQ

  3. #3
    VBAX Regular
    Joined
    May 2015
    Posts
    33
    Location
    VBA.xlsx

    Hi SamT,

    thank you for the answer, I attach the excel file with the 5 line sample, another sheet and an explanation of the calculation in the sheet "adjusted",
    I will be very grateful for any kind of help you can provide.

    Kind regards,

    M

  4. #4
    Moderator VBAX Sage SamT's Avatar
    Joined
    Oct 2006
    Location
    Near Columbia
    Posts
    7,814
    Location
    Maxino,

    What do we do with the Adjusted file after we put the (N2) number in the new file (B1, B2, B3 Etc)

    Can you use A VBA that only calculated N2, (according to all the formulas in K, L, and M,) and put the results in column B?

    SamT
    I expect the student to do their homework and find all the errrors I leeve in.


    Please take the time to read the Forum FAQ

  5. #5
    VBAX Regular
    Joined
    May 2015
    Posts
    33
    Location
    Hi,

    I will run another calculation (slightly different) on the basis of the code provided by you (hopefuly) and the results will go into the new file C1, C2, C3, etc., then I will
    just make some basic calculations between columns B and C directly in the new file.

    I am not sure whether I understand correctly the second question, do you want to know whether the calculation from the original file from N2 may go into the same original file in column B?
    It will not work for me, because there is over 50k files in the whole folder (divided into 500 subfolders) and I need to extract the number from cell N2 from every single original file and put it
    into the new (one) file, thus I would be able to make some calculations inside just this one new excel, but I would have information from the whole folder.

    I hope I clarified it a bit, if you need to know more, do not hesitate to ask. And once again I really appreciate your help, thank you very much!

    Kind regards,

    M

  6. #6
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,639
    This is all you need:

    Sub M_snb()
       sn = Filter(Split(CreateObject("wscript.shell").exec("cmd /c Dir G:\OF\*.csv /b /s").stdout.readall, vbCrLf), ".")
      
       With CreateObject("scripting.filesystemobject")
            For j = 0 To UBound(sn)
              y = 0
              sp = Split(.opentextfile(sn(j)).readall, vbCrLf)
            
              For jj = 0 To UBound(sp) - 1
                sp(jj) = Log(Split(sp(jj), ";")(5)) / Log(10)
                If jj > 0 Then y = y + 100 * (sp(jj) - sp(jj - 1)) ^ 2
              Next
              c00 = c00 & vbLf & sn(j) & "|" & y
            Next
        End With
        
        MsgBox c00
    End Sub
    NB. Replace "G\OF\" by the foldername you want to be searched.

  7. #7
    Moderator VBAX Sage SamT's Avatar
    Joined
    Oct 2006
    Location
    Near Columbia
    Posts
    7,814
    Location
    I will run another calculation (slightly different) on the basis of the code provided by you (hopefuly) and the results will go into the new file C2, C3, etc., then I will
    just make some basic calculations between columns B and C directly in the new file.
    What will you do with the old file with all the formulas up to N2 inclusive?

    As you see from snb's post c00 is the same value as N2, but he never put the CSV file on a spreadsheet.


    I am trying to be more clear. With VBA, we can use the CSV file to compute the value you want in B2, B3, etc. We (may be,) can do the same with the value you want in C2, C3, etc. D2, D3, etc, E2, E3. Etc, etc etc. This is much faster

    Do you must also have 50k Excel files with all the numbers from all the 50K CSV files? Saving and keeping the old excel file is slower. If you must keep the numbers, is it OK to keep them all on one worksheet?
    I expect the student to do their homework and find all the errrors I leeve in.


    Please take the time to read the Forum FAQ

  8. #8
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,639
    @Sam

    y is identical to [N2]

    c00= filename1 | [N2]
    filename2 | [N2]
    filename3 | [N2], etc.

  9. #9
    Moderator VBAX Sage SamT's Avatar
    Joined
    Oct 2006
    Location
    Near Columbia
    Posts
    7,814
    Location
    >>
    I expect the student to do their homework and find all the errrors I leeve in.


    Please take the time to read the Forum FAQ

  10. #10
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,639
    & &

  11. #11
    VBAX Regular
    Joined
    May 2015
    Posts
    33
    Location
    Well it doesn't matter so much to me, in the old files all the formulas can be deleted or can remain (would be slightly better probably).

    Ye actually this is what I asked for, I want to keep the numbers (results from all 50k old files from cell N2) on one worksheet. And as I said, it would be very helpful, if the name of the particular old file could match to the particular result from N2 (the name and the number would be in the same row).

  12. #12
    VBAX Regular
    Joined
    May 2015
    Posts
    33
    Location
    Hi snb,

    well this code does not work for me, I run the code, but the message box shows up with no value. And more importantly I don't need to show the number in the message box,
    I need to get the results from all the original 50k excel files from the cell N2 (for explanation see above) into the new excel file, thus in the new excel file, there will be 50k rows with 50k values(ideally with the name of the original file in the next column)

  13. #13
    Knowledge Base Approver VBAX Wizard
    Joined
    Apr 2012
    Posts
    5,639
    I could not post a 'solution' since you provided too little information to do so.
    So the suggestion I posted is a suggestion you will have to adapt to your situation yourself (the name of the folder for instance).

    If you are not able to put an outcome that is presented in my suggetion in a combobox into an Excel file, I'd advise not to use any VBA at all.
    If you are not open to any suggestion at all, why asking a question in the first place ?

  14. #14
    VBAX Regular
    Joined
    May 2015
    Posts
    33
    Location
    Don't get offended pls, I did not mean wrong. I just need to process data for my diploma thesis, I am not a VBA user, I just read some basics online, thus ofc I am open to any suggestion and appreciate very much any kind of help you can provide.

    I changed the name of the folder, but it did not work.

    If you need more information to get a solution, I will be more than happy to give them to you.

  15. #15
    VBAX Regular
    Joined
    May 2015
    Posts
    33
    Location
    hi, pls can you reply? I tried that, but it shows the message box with no value.
    And as I said, with all respect, I don't need to show the number in the message box, I need to get the results from all the original 50k excel files from the cell N2 (for explanation see above) into the new excel file, thus in the new excel file, there will be 50k rows with 50k values(ideally with the name of the original file in the next column).
    Thanks in advance for any kind of help.
    DB

  16. #16
    Moderator VBAX Sage SamT's Avatar
    Joined
    Oct 2006
    Location
    Near Columbia
    Posts
    7,814
    Location
    It is always fun and instructive to figure out one of snb's codes.

    I tried to make this verbose and simple enough that eve a Czech, (home of the Tatra ,) Statistical Analyst can figure it out

    If it errors, please note the error statement and number for us.

    Place this code in a standard Module, Edit the FolderPath Constant to fit. After you see the Message "IT Works! Stopping Sub" Then code your next formula and run it again. The Message shows after only10 files have been successfully processed. When the code is complete and runs all the way to the message, remove the "For Testing Only" section at the bottom.

    To run the Sub, make sure an empty sheet is visible in Excel, then in the VBA, put the mouse cursor anywhere in the sub and press F5. BTW, Press F8 to step (run) through the code one line at a time. If you do that, you can hover the mouse over a variable to see its Value at that time.

    Option Explicit
    
    
    
    Sub SamT_snb()
    
    Dim Filename As String
    Dim NameLength As Long
    
    Dim FileNameArray As Variant                'sn
    Dim FileLinesArray As Variant                     'sp
    
    Dim Formula1LinesArray As Variant
    Dim Formula1Result As Double                       'y
      'Repeat these two lies for each Formula. Edit the numbers to suit
    Dim Formula2LinesArray As Variant
    Dim Formula2Result As Double
    
    Dim Fn As Long 'Fn = Index number for FileNameArray                     'jj
    Dim Fl As Long 'Fl = FileLinesArray Index number                                    'j
    
    Const FolderPath As String = "The Path to your folder must go here\" 'include ending \
    
    
    '''' Put all the file names in the path in Array
      FileNameArray = Filter(Split(CreateObject("wscript.shell").exec(FolderPath & "*.csv /b /s").stdout.readall, vbCrLf), ".")
      
      '''' Open one file at a time
      With CreateObject("scripting.filesystemobject")
        For Fn = 0 To UBound(FileNameArray)
          
    
    ''''Put all lines from one file in Arrays
          FileLinesArray = Split(.opentextfile(FileNameArray(Fn)).readall, vbCrLf)
          Formula1LinesArray = FileLinesArray
          Formula2LinesArray = FileLinesArray
          'Initializ
          Formula1Result = 0
          Formula2Result = 0
    ''''Calcuate first result for one file
         For Fl = 0 To UBound(FileLinesArray)
         
    '''' Calculate first formula
            'Replace file line with Log of 6th value. Split(BlahBlah)(5)
            Formula1LinesArray(Fl) = Log(Split(Formula1LinesArray(Fl), ",")(5))
              'After the first line
            If Fl > 0 Then Formula1Result = Formula1Result + Formula1Result + _
              (Formula1LinesArray(Fl) - Formula1LinesArray(Fl - 1) ^ 2) * 100
            
    '''' Calculate second Formula here
                  'Replace file line with first part of formula. Think carefully
             'Formula2LinesArray (Fl) =  Your formula here
            
          Next Fl
           
    '''' Put results in sheet
    
        'Get FileName
          NameLength = Len(FileNameArray(Fn)) - InStrRev(FileNameArray, "\")
          Filename = Right(FileNameArray(Fn), NameLength)
          
           'Place result
          With ActiveSheet.Rows(Fn + 1)
            .Columns(2) = Formula1Result 'Column B
            .Columns(3) = Formula2Result
            .Columns(4) = Filename
            End With
            
            'Initialize Arrays
            FileLinesArray = ""
            Formula1LinesArray = FileLinesArray
            Formula2LinesArray = FileLinesArray
          
    ''''' For testing only
    If Fn >= 100 Then
      MsgBox "IT Works! Stopping Sub"
      Exit Sub
    End If
    ''''End Test
    
        Next Fn 'Work on next File
      End With
    End Sub
    I expect the student to do their homework and find all the errrors I leeve in.


    Please take the time to read the Forum FAQ

  17. #17
    VBAX Guru Kenneth Hobs's Avatar
    Joined
    Nov 2005
    Location
    Tecumseh, OK
    Posts
    4,956
    Location
    I see that Sam and I were thinking along the same lines.

    M, when we see a new user's posts, we don't know their level of experience. We have to assume that they know some VBA basics. Sometimes we assume too much and some times, not enough. A MsgBox() or Debug.Print (output to Immediate Window in Visual Basic Editor, VBE), are tools we often use to debug code to show intermediate or final results. We leave it to the user to finish it up. Pressing F8 in a routine will execute one line of code at a time so you can see what is happening. Hover a cursor over a variable to see what it resolved to.

    When users ask for complex solutions, it is often hard for them to understand when they often know very little VBA. As with any learning experience, we learn by small steps. Ergo, the MsgBox() and such. I do these same small steps when I am trying to solve a problem. The key to solving problems is baby steps. SNB has solved most of what you need.

    Ok, now let's nail down a few snags that can happen in your problem and snb's solution. First off, always test code on backup copies of folders or files. In this case, I recommend that you copy say 3 CSV files to one folder. Change your folder path (between the double quotes) in my modified snb code in a Module and then run it.
    Sub M_snb2()    
      sn = Filter(Split(CreateObject("wscript.shell").exec _
          ("cmd /c Dir ""X:\snb\maxxino\*.csv"" /b /s").stdout.readall, vbCrLf), ".")
          
        With CreateObject("scripting.filesystemobject")
            For j = 0 To UBound(sn)
                y = 0
                sp = Split(.opentextfile(sn(j)).readall, vbCrLf)
                 
                For jj = 0 To UBound(sp) - 1
                    sp(jj) = Log(Split(sp(jj), ",")(5)) / Log(10)
                    If jj > 0 Then y = y + 100 * (sp(jj) - sp(jj - 1)) ^ 2
                Next
                c00 = c00 & vbLf & sn(j) & vbTab & y
            Next
        End With
         
        MsgBox c00
    End Sub
    Hopefully, it showed the expected results. You may have noticed that in the last split, that I change the semicolon to a comma. This has to do with the delimited character used. It is a case where had you attached a csv file, it would have helped. I do think that your example XLSX file was good though. Sometimes, that level of extra help to better help us help you, well, helps.

    If this works, we can then go on to fully solve your problem by poking the MsbBox data into cells. The problem I fear though is the time it will take. I have few speed tricks that can help. 500,000 files and therefore 1,000,000 cells to fill is going to take a very long time to fill. Likely as not, we would want to use another approach rather than string concatenation for the final part. We can easily fill each of the two values from each file once you understand that the first part was solved.
    Last edited by Kenneth Hobs; 05-30-2015 at 08:01 PM.

  18. #18
    Moderator VBAX Sage SamT's Avatar
    Joined
    Oct 2006
    Location
    Near Columbia
    Posts
    7,814
    Location
    Kenneth,

    I believe this is a one time use file so he can analyze a bunchaton of numbers for his thesis. That is why I wasn't worried about time. It will give him something to watch while he is cracking the books

    He doesn't say how long each file is, but I got the impression that thwey were pretty large. I don't think, well, hardly ever, that the final Poke of each file's computations will add that large a % to the total time. .
    I expect the student to do their homework and find all the errrors I leeve in.


    Please take the time to read the Forum FAQ

  19. #19
    VBAX Guru Kenneth Hobs's Avatar
    Joined
    Nov 2005
    Location
    Tecumseh, OK
    Posts
    4,956
    Location
    I had to tweak 2 lines of code in Sam's code. The filter line needed the "cmd /c Dir" and a loop needed a -1.

    I must be getting sleepy as I can't see what math needs to be fixed where you commented Sam.

    Anywho, here is my slight modification. M, you will notice that Sam took it to the last level to poke some values into the active sheet. The math needs to be resolved properly though. Maybe I can see more clearly tomorrow. Allergies and late night last night and being an old fogy does not help.

    Sub KenH_SamT_snb2()     
        Dim Filename As String
        Dim NameLength As Long
         
        Dim FileNameArray As Variant 'sn
        Dim FileLinesArray As Variant 'sp
         
        Dim Formula1LinesArray As Variant
        Dim Formula1Result As Double 'y
         'Repeat these two lies for each Formula. Edit the numbers to suit
        Dim Formula2LinesArray As Variant
        Dim Formula2Result As Double
         
        Dim Fn As Long 'Fn = Index number for FileNameArray                     'jj
        Dim Fl As Long 'Fl = FileLinesArray Index number                                    'j
         
        Const FolderPath As String = "X:\snb\maxxino\" 'include ending \
         
         
         '''' Put all the file names in the path in Array
        FileNameArray = Filter(Split(CreateObject("wscript.shell").exec("cmd /c Dir " & _
          FolderPath & "*.csv /b /s").stdout.readall, vbCrLf), ".")
    
    
         '''' Open one file at a time
        With CreateObject("scripting.filesystemobject")
            For Fn = 0 To UBound(FileNameArray)
                 
                 
                 ''''Put all lines from one file in Arrays
                FileLinesArray = Split(.opentextfile(FileNameArray(Fn)).readall, vbCrLf)
                Formula1LinesArray = FileLinesArray
                Formula2LinesArray = FileLinesArray
                 'Initializ
                Formula1Result = 0
                Formula2Result = 0
                 ''''Calcuate first result for one file
                 
                For Fl = 0 To UBound(FileLinesArray) - 1
                     
                     '''' Calculate first formula
                     'Replace file line with Log of 6th value. Split(BlahBlah)(5)
                    Formula1LinesArray(Fl) = Log(Split(Formula1LinesArray(Fl), ",")(5))
                     'After the first line
                    If Fl > 0 Then Formula1Result = Formula1Result + Formula1Result + _
                    (Formula1LinesArray(Fl) - Formula1LinesArray(Fl - 1) ^ 2) * 100
                     
                     '''' Calculate second Formula here
                     'Replace file line with first part of formula. Think carefully
                     'Formula2LinesArray (Fl) =  Your formula here
                     
                Next Fl
                 
                 '''' Put results in sheet
                 
                 'Get FileName
                NameLength = Len(FileNameArray(Fn)) - InStrRev(FileNameArray(Fn), "\")
                Filename = Right(FileNameArray(Fn), NameLength)
                 
                 'Place result
                With ActiveSheet.Rows(Fn + 1)
                    .Columns(2) = Formula1Result 'Column B
                    .Columns(3) = Formula2Result
                    .Columns(4) = Filename
                End With
                 
                 'Initialize Arrays
                FileLinesArray = ""
                Formula1LinesArray = FileLinesArray
                Formula2LinesArray = FileLinesArray
                 
                 ''''' For testing only
                If Fn >= 10 Then
                    MsgBox "IT Works! Stopping Sub"
                    Exit Sub
                End If
                 ''''End Test
                 
            Next Fn 'Work on next File
        End With
    End Sub
    Last edited by SamT; 05-30-2015 at 08:29 PM.

  20. #20
    Moderator VBAX Sage SamT's Avatar
    Joined
    Oct 2006
    Location
    Near Columbia
    Posts
    7,814
    Location
    Thanks, Kenneth,

    I spent hours analyzing snb's code. That's my excuse and I'm sticking to it.
    I expect the student to do their homework and find all the errrors I leeve in.


    Please take the time to read the Forum FAQ

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •