PDA

View Full Version : data from all the html files in an excel



machunt
07-11-2012, 02:55 AM
Hi friends,
I am again stuck with a problem and need some help.
I have a folder with around 20-25 html files.What I am doing is that opening each html files making Ctrl+A and pasting it in an excel files in sheet 1 and closing the html file, then opeing the second html file ctrl+A and again pasting in sheet 2 of the excel file and so on for all the html files.Is there a way to do the same work using a VBA code.Can somebody please help me on this.
Thanks,
mac

CodeNinja
07-11-2012, 06:39 AM
Machunt,
The following code requires the Microsoft HTML Object library reference, so be sure to go to your VB editor and click tools/references and select it.

This code will also only capture text I believe. You may need to play with the line: hd2.DocumentElement.outerText to get it to handle objects other than text.

Dim str As String
Dim hd2 As HTMLDocument
Dim sFolder As String
Dim i As Integer

sFolder = "C:\\Junk\" 'change this to your directory

i = 1
sURL = Dir(sFolder & "*.htm*")
While sURL <> ""
Set hd = New HTMLDocument
Set hd2 = hd.createDocumentFromUrl(sFolder & sURL, "")
Do Until hd2.readyState = "complete"
DoEvents
Loop
str = hd2.DocumentElement.outerText
If ThisWorkbook.Sheets.Count < i Then ThisWorkbook.Sheets.Add
Sheets(i).Cells(1, 1) = str
i = i + 1
sURL = Dir()
Wend

End Sub

snb
07-11-2012, 07:47 AM
Sub snb()
c00 = "G:\OF\"
c01 = Dir(c00 & "*.html")

With CreateObject("scripting.filesystemobject")
Do Until c01 = ""
c02 = c02 & "||" & .opentextfile(c00 & c01).readall
c01 = Dir
Loop
End With

sn = Split(c02, "||")
For j = 1 To UBound(sn)
Cells(j, 1) = sn(j - 1)
Next
End Sub

machunt
07-11-2012, 10:26 PM
Hi CodeNinja
Thanks for helping me, but when I am running the code it is giving me a "Run time error 438" - " Object doesn't support the property or method"

when I am clicking debug , this line is getting highlighted
- Set hd2 = hd.createDocumentFromUrl(sFolder & sURL, "")

I have already added the "Microsoft HTML Object library" reference

Hi SNB,
thanks for helping me, but when i run the code it is pulling the HTML code and not the data present in the html file. Also the code are generated in the cells below one another.

Can you guys please help me out.
Thanks in advance.
regards,
mac

snb
07-11-2012, 11:58 PM
In that case you better make a webquery in every sheet to every html-file.

CodeNinja
07-12-2012, 06:46 AM
Hi CodeNinja
Thanks for helping me, but when I am running the code it is giving me a "Run time error 438" - " Object doesn't support the property or method"

when I am clicking debug , this line is getting highlighted
- Set hd2 = hd.createDocumentFromUrl(sFolder & sURL, "")

I have already added the "Microsoft HTML Object library" reference


Machunt, please provide a sample of the excel file and html file that this is failing on so I may debug it on my computer.

Thanks.