PDA

View Full Version : Solved: Converting all files in a directory using VBA



RV6555
08-16-2007, 03:20 AM
I?m new to VBA, and appreciate any suggestions. I was wondering if anyone know how to convert (open, then save as) all files in a certain directory (doc to txt) via an Excel 2003 Macro. I program in Excel, because afterwards the txt files need to be imported to excel and I want to have just one macro running the whole process. I?ve come this far and it works when I provide one filename. When I try to replace the filename with a variable so that it opens all files, VBA gives me runtime error 424.

Thanks,

Roy

Sub ShowFileList()

Drive = "C:\"
Folder = "Documents and Settings\"
SubFolder1 = "RV\"
SubFolder2 = "My Documents\"
SubFolder3 = "Monthly Performance\"
SubFolder4 = "2007-07\"

Dim fs, f, f1, fc, s
Set fs = CreateObject("Scripting.FileSystemObject")
Set f = fs.GetFolder(Drive & Folder & SubFolder1 & SubFolder2 & SubFolder3 & SubFolder4)
Set fc = f.Files
For Each f1 In fc
s = s & f1.Name
s = s & vbCrLf
Next

'Should open files, does not work
Documents.Open Filename:=s, ConfirmConversions:=False _
, ReadOnly:=False, AddToRecentFiles:=False, PasswordDocument:="", _
PasswordTemplate:="", Revert:=False, WritePasswordDocument:="", _
WritePasswordTemplate:="", Format:=wdOpenFormatAuto

'Save document as
ActiveDocument.SaveAs Filename:=s, FileFormat:= _
wdFormatText, LockComments:=False, Password:="", AddToRecentFiles:=True, _
WritePassword:="", ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _
SaveNativePictureFormat:=False, SaveFormsData:=False, SaveAsAOCELetter:= _
False, Encoding:=437, InsertLineBreaks:=False, AllowSubstitutions:=False, _
LineEnding:=wdCRLF

ActiveDocument.Close

End sub

rory
08-16-2007, 04:55 AM
How about something like this:

Sub ShowFileList()
Dim fs, f, f1, fc, s
Dim astrPaths(5) As String, strPath As String
astrPaths(0) = "C:"
astrPaths(1) = "Documents and Settings"
astrPaths(2) = "RV"
astrPaths(3) = "My Documents"
astrPaths(4) = "Monthly Performance"
astrPaths(5) = "2007-07"
strPath = Join(astrPaths, "\")
Debug.Print strPath
Set fs = CreateObject("Scripting.FileSystemObject")
Set f = fs.GetFolder(strPath)
Set fc = f.Files
For Each f1 In fc

Documents.Open FileName:=strPath & "\" & f1.Name, ConfirmConversions:=False _
, ReadOnly:=False, AddToRecentFiles:=False, PasswordDocument:="", _
PasswordTemplate:="", Revert:=False, WritePasswordDocument:="", _
WritePasswordTemplate:="", Format:=wdOpenFormatAuto

'Save document as
ActiveDocument.SaveAs FileName:=strPath & "\" & f1.Name, FileFormat:= _
wdFormatText, LockComments:=False, Password:="", AddToRecentFiles:=True, _
WritePassword:="", ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _
SaveNativePictureFormat:=False, SaveFormsData:=False, SaveAsAOCELetter:= _
False, Encoding:=437, InsertLineBreaks:=False, AllowSubstitutions:=False, _
LineEnding:=wdCRLF

ActiveDocument.Close
Next

End Sub

RV6555
08-16-2007, 06:07 AM
Hi Rory, thanks for your reply. The code gives a runtime error 76, path not found in line Set f = fs.GetFolder(strPath)

I will try to make a Word macro and then let it be called by an Excel macro this would solve the problem more easily and I would continue to have such one sequence.

rory
08-16-2007, 06:51 AM
Does the path exist? (The code will print the path it's using to the Immediate Window in the VBEditor, so you can check what it is looking for)
It's not hard to convert that to run from Excel - just set a reference to Microsoft Word Object Library and run this:
Sub ShowFileList()
Dim fs, f, f1, fc, s
Dim appWord As Word.Application
Dim docWord As Word.Document
Dim astrPaths(5) As String, strPath As String
astrPaths(0) = "C:"
astrPaths(1) = "Documents and Settings"
astrPaths(2) = "RV"
astrPaths(3) = "My Documents"
astrPaths(4) = "Monthly Performance"
astrPaths(5) = "2007-07"
strPath = Join(astrPaths, "\")
Debug.Print strPath
Set fs = CreateObject("Scripting.FileSystemObject")
Set f = fs.GetFolder(strPath)
Set fc = f.Files
Set appWord = New Word.Application
For Each f1 In fc

Set docWord = appWord.Documents.Open(Filename:=strPath & "\" & f1.Name, ConfirmConversions:=False _
, ReadOnly:=False, AddToRecentFiles:=False, PasswordDocument:="", _
PasswordTemplate:="", Revert:=False, WritePasswordDocument:="", _
WritePasswordTemplate:="", Format:=wdOpenFormatAuto)

'Save document as
docWord.SaveAs Filename:=strPath & "\" & f1.Name, FileFormat:= _
wdFormatText, LockComments:=False, Password:="", AddToRecentFiles:=True, _
WritePassword:="", ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _
SaveNativePictureFormat:=False, SaveFormsData:=False, SaveAsAOCELetter:= _
False, Encoding:=437, InsertLineBreaks:=False, AllowSubstitutions:=False, _
LineEnding:=wdCRLF

docWord.Close
Next
appWord.Quit
Set appWord = Nothing

End Sub

YellowLabPro
08-16-2007, 07:19 AM
RV,
There is another method, and I use this method,
Open a new workbook, then go to the Menu Toolbar>> Data>> Import External Data>> then select the location where your file(s) reside, (and I recommend you select a permanent location- the import feature will remember where this is until you tell it differently), choose each file and set the particulars for the import.
Then when you want to update the data in Excel from the txt files, there are two options; you can choose to update for all the the .txt files you have initially or subsequently add, by selecting the Excel Icon w/ the red exclamation point on the Import Tool bar, or you can select to just update one sheet by selecting the red exclamation point icon, on the same toolbar.
This gives you very nice control over what is or is not updated, or additional files you would like to add to the process.

RV6555
08-16-2007, 08:05 AM
@rory, I have no clue what this code does, but nothing really happens (the document does not open and I do not find a saved file).

@ylp, I'm not sure whether I understand what you mean. Thing is that I need to run this macro once for about 1100 files, then monthly for 50 files. That's why I would prefer a macro that does everything.

rory
08-16-2007, 08:11 AM
The macro opens the files, saves them as text then closes them. You shouldn't see it happen other than maybe a flicker or two. It saves them as the same name since that's what your original code did.

YellowLabPro
08-16-2007, 08:37 AM
RV-
Will your data get its own sheet or will it append all the data to one sheet?
What I do, and will suggest to you is test it on one, then two or three sheets to test the process. Then implement it in chunks, w/ that many files you are most likely going to experience some other hiccups. This way you can confirm whatever path you choose, Rory's code or my internal process, that the process is working.
My method is basically doing the same thing Rory's is in concept, it just does it through built in Excel feature.
I like this method, while I only process two sheets on a daily basis, not 1100 or 50, I may not see other processing overhead issues that you might. But is worth exploring. My suggested method is automated once you set it up the way you want.

Rory,
Interesting question while the topic is on the table, how many sheets can a workbook hold? I would think that if RV is handling 1100 individual sheets, that data management of this is going to be a NIGHTMARE!

rory
08-16-2007, 09:15 AM
Number of sheets in a workbook is limited by available memory. Personally, I would say this sounds like a candidate for a proper database, but I don't really know at this stage what exactly is happening once the docs are converted into plain text files!

RichardSchollar
08-16-2007, 09:37 AM
In a thread on another board, the poster admitted to having a workbook with 2,000 sheets!!!! It really does just boggle my mind to think of so many sheets!

http://www.mrexcel.com/board2/viewtopic.php?p=1359557#1359557

Richard

YellowLabPro
08-16-2007, 09:46 AM
Richard,
I just viewed your link above- my thought is I would like to see the poster's file. It is quite amazing to see all the different uses for Excel and how versatile it really is.
Thanks for sharing that post.

RV6555
08-17-2007, 02:09 AM
Thank you all for your suggestions. I will shortly tell my intention by converting to txt. Monthly I receive 50 files with sales, there are a few variables I need to extract from these files. Each file is for a specific office. Per file I need to extract:
- name of office (can be one or several in one file)
- period (month + year and is mentioned in file and followed by the word “date:”)
- office’s rank in market share (underlined in top 20 market participants)
- percentage of market share (underlined in top 20 market participants, behind rank)
- estimate of total market (mentioned after start of new paragraph)

To describe the process:

receive MS word files
Add unique numbers: before I convert to the text file, I need to find the underlined ranks (as the .txt file will clear formatting) and market share percentages in order to put a unique number in between so that I will be able to extract them from the sheet
convert to .txt
import to excel (For me it does not really matter whether all the monthly reports come in one sheet or in several, but I need to be able to extract the right information from the sheet. From an orderly perspective )
extract the data and paste values in a new workbook
some calculations regarding market share and graphs need to be made
all word & text files can be deleted
Currently I have 1100 files (22 months), but there is data going back for another 4 years. In case the macro would work well, I would do the other 2400 files as well. I am testing with two files at the time, and will implement it by month (so 50 files at the time).

In the meanwhile I have managed to get a macro in word doing step 1 to 3, as excel seemed very difficult. The macro would still be ran from Excel by calling the word macro with the following code:

Sub Word_Convert()
Documents("Doc1.doc").Activate
Call ShowFileList
End Sub

The problem I’m having with this code is that it performs the macro twice. :S I would still prefer a complete Excel solution in order to keep the code as easy as possible.

@ylp I have tried your suggestion, but I find that every time I need to select the location (I could save it in my favourites though). Additionally, I am not able to import more than one file at the time.

Thanks again for your suggestions!

YellowLabPro
08-17-2007, 03:07 AM
RV,
In reading through this I have a different approach to suggest. If my interpretation is incorrect, then feel free to disregard. This also is only applicable if Excel can read data directly from Word, something I prior experience with, however I just set up a test condition and was able to do a very simple macro to search a Word doc and get the data from it.



Additional point on my previous method: Yes you need to save the file in the same location each time, either initially or after a different storage location is chosen of the said file, you need to tell Excel where the file is. This is just like in your macro, that if you change the location, Excel is still looking through the path that you tell it. This can be done inside the Import features properties. Secondly: on the import toolbar as I pointed out to you in my first post, there are two buttons, one w/ an Excel Icon and a Red exclamation character- this does all files. But the files you want to add need to be included initially or added to the list if more are added later.


If all your wanting to do is extract certain pieces of data from the report on a monthly basis and Excel can read/extract the data from Word then converting the files to a text file format seems unneccessary.
I would suggest you use the Find function and have the macro loop through all the files and copy that data to one worksheet. Now you have all the data in one place making it very concise and accessible. You can then perform any necessary tasks on the data here.

Let me know if this suggestion is appropriate or way out in left field....:think:

RV6555
08-17-2007, 04:21 AM
Yes, you are actually very correct... Somewhere in the process I should have thought this would not be possible. I still do not see how I can determine exactly what needs to be copied though (underlined I can figure), because the numbers differ for every month and office, and there might be more offices in a file. The file is about 5 pages with a lot of blabla, and I just need the 5 variables. My initial point of view was that when I would be able to import it into excel and can use the vlookup and sumif function to extract the right data from the sheets. Any suggestion on how to determine the words/numbers in a Word search? Thanks!

YellowLabPro
08-17-2007, 04:35 AM
This will take some work on your part, but my gut feeling is it will not be that difficult. There are several members here who will help you solve. The thing to realize is to treat each part as an indiviual issue and solve for that part. Don't worry about if you don't know how it can be done for now, the power of Excel, VBA and the people here can handle this task.

If you don't mind me giving you a few ideas here, I would suggest setting up an outline of what you want to accomplish, what items you need from each file, how you want it to appear in your worksheet and any other pertinent information.

Example:
Location Name
Location Region
Sales Dollar Amount
Site Manager Name
Date

Then find keywords that are in the Word Doc. and assign these key words as an index value. That could be one thing we can search for in the Word Doc., from here out refered to as .Doc or Doc.

Also if you can post one or two of the .Docs here on the board so we can see what you are dealing w/. If there is private info on the .Doc, then change that data.

Ps. I am still very very new to this, the other members here may possibly have other/better solutions for a particular section of the process.

RV6555
08-17-2007, 06:20 AM
Thanks for your reply. As you said, I will try on my own for now. Thanks, Roy

RV6555
08-20-2007, 12:24 AM
All, I have worked this over and decided that the easiest method for me would be having a word macro and an excel macro. I will run them separately (one after the other) for the moment, later I will get excel call the macro in word. I have tried this already, but for some reason it performs the word macro twice. Thanks for your help see below for my resolution.

Thanks,

Roy

WORD MACRO

Sub ShowFileList()

Drive = "C:\"
Folder = "Documents and Settings\"
SubFolder1 = "RV\"
SubFolder2 = "My Documents\"
SubFolder3 = "Monthly Performance\"

Dim fs, f, f1, fc, s
Set fs = CreateObject("Scripting.FileSystemObject")
Set f = fs.GetFolder(Drive & Folder & SubFolder1 & SubFolder2 & SubFolder3)
Set fc = f.Files
For Each f1 In fc
COUNTER = COUNTER + 1
Next
For Each f1 In fc
s = s & f1.Name
ChDir Drive & Folder & SubFolder1 & SubFolder2 & SubFolder3 & SubFolder4
Documents.Open FileName:="C:\Documents and Settings\RV\My Documents\Monthly Performance\" & s

'The following section searching for all underlined ranks (ending on 1st, 2nd, 3rd, 4th and puts the following
'number 123 & 456 in front ofthem. By these numbers they will be recognisable in the Excel file so that we can
'automatically detect what is our rank.
Selection.Find.ClearFormatting
Selection.Find.Font.Underline = True
With Selection.Find
.Text = "st"
.Replacement.Text = "st" + " 123 " + "456"
.Forward = True
.Wrap = wdFindContinue
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll

Selection.Find.ClearFormatting
Selection.Find.Font.Underline = True
With Selection.Find
.Text = "nd"
.Replacement.Text = "nd" + " 123 " + "456"
.Forward = True
.Wrap = wdFindContinue
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll

Selection.Find.ClearFormatting
Selection.Find.Font.Underline = True
With Selection.Find
.Text = "rd"
.Replacement.Text = "rd" + " 123 " + "456"
.Forward = True
.Wrap = wdFindContinue
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll

Selection.Find.ClearFormatting
Selection.Find.Font.Underline = True
With Selection.Find
.Text = "th"
.Replacement.Text = "th" + " 123 " + "456"
.Forward = True
.Wrap = wdFindContinue
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll

'Save document as
ActiveDocument.SaveAs FileName:="C:\Documents and Settings\RV\My Documents\Monthly Performance\" & s & ".TXT", FileFormat:= _
wdFormatText, LockComments:=False, Password:="", AddToRecentFiles:=True, _
WritePassword:="", ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _
SaveNativePictureFormat:=False, SaveFormsData:=False, SaveAsAOCELetter:= _
False, Encoding:=437, InsertLineBreaks:=False, AllowSubstitutions:=False, _
LineEnding:=wdCRLF

ActiveDocument.Close
s = ""
C = C + 1
If C = COUNTER Then
Exit For
End If
Next


End Sub


EXCEL MACRO

Sub Macro2()
'
' This macro imports all .txt files into different worksheets in Excel
'

'

Drive = "C:\"
Folder = "Documents and Settings\"
SubFolder1 = "RV\"
SubFolder2 = "My Documents\"
SubFolder3 = "Monthly Performance\"

Dim fs, f, f1, fc, s
Set fs = CreateObject("Scripting.FileSystemObject")
Set f = fs.GetFolder(Drive & Folder & SubFolder1 & SubFolder2 & SubFolder3)
Set fc = f.Files
For Each f1 In fc
COUNTER = COUNTER + 1
Next
Workbooks.Add
For Each f1 In fc
s = s & f1.Name
ChDir Drive & Folder & SubFolder1 & SubFolder2 & SubFolder3

Sheets("Sheet1").Select
Sheets.Add
With ActiveSheet.QueryTables.Add(Connection:= _
"TEXT;C:\Documents and Settings\RV\My Documents\Monthly Performance\" & s, Destination:=Range("A1"))
.Name = s
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.TextFilePromptOnRefresh = False
.TextFilePlatform = 437
.TextFileStartRow = 1
.TextFileParseType = xlDelimited
.TextFileTextQualifier = xlTextQualifierDoubleQuote
.TextFileConsecutiveDelimiter = True
.TextFileTabDelimiter = True
.TextFileSemicolonDelimiter = True
.TextFileCommaDelimiter = False
.TextFileSpaceDelimiter = True
.TextFileOtherDelimiter = ""
.TextFileColumnDataTypes = Array(1)
.TextFileTrailingMinusNumbers = True
.Refresh BackgroundQuery:=False
End With
s = ""
Next
End Sub

rory
08-20-2007, 01:00 AM
Hi,
You don't actually need to loop through all the files twice - you can replace this:
For Each f1 In fc
COUNTER = COUNTER + 1
Next


with this:
COUNTER = fc.Count

RV6555
08-20-2007, 01:33 AM
Perfect! Thanks!