PDA

View Full Version : How to parse HTML file from VBA code into Excel ?



VinoBob
04-29-2010, 09:00 AM
Hi!

I would like to do the folowing:

- Open a HTML file from VBA code ( with a from )
- Access <td> tags that have a specific attribute ( namely : class="font" width="220" )
- copy the value of these table cells into Excel cells

Is it possible ? If yes, then how ?

Thanks in advance !

austenr
04-29-2010, 09:40 AM
You want to import an HTML file into Excel using an Access form is that it? Why use Access at all? You can filter it with Excel.

Also you need to be more clear on your explaination.

VinoBob
04-29-2010, 10:09 AM
Nope. I wanna access the table that is i want to handle it from the VBA code. That has nothing to do with the Access DB SW.

So i would like to open the HTML file from code ( like you open a worksheet with Application.Dialogs(xlDialogOpen).Show ), and get the values of the previously mentioned table cells, and copy them into the table in the worksheet. Got it ?

Its enough for me if i can open the file as a textfile, and then i can search in it, cause i know what tokens i need to search, but if i could open the file as HTML file, and search for the td tags with specified attributed its even better.

Crocus Crow
04-29-2010, 01:50 PM
Something like this should get you started. First you need to create the example HTML file used by my code, so go to the VBA Express Excel Help Forum (I'm not allowed to post the link!) in IE and save the page as C:\VBAExpress_Excel_Forum.html using File - Save As - Webpage, HTML only.


'Requires references to Microsoft Internet Controls and Microsoft HTML Object Library

Sub Extract_TD_text()

Dim URL As String
Dim IE As InternetExplorer
Dim HTMLdoc As HTMLDocument
Dim TDelements As IHTMLElementCollection
Dim TDelement As HTMLTableCell
Dim r As Long

'Saved from www vbaexpress com/forum/forumdisplay.php?f=17
URL = "file://C:\VBAExpress_Excel_Forum.html"

Set IE = New InternetExplorer

With IE
.navigate URL
.Visible = True

'Wait for page to load
While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend

Set HTMLdoc = .document
End With

Set TDelements = HTMLdoc.getElementsByTagName("TD")

Sheet1.Cells.ClearContents

r = 0
For Each TDelement In TDelements
'Look for required TD elements - this check is specific to VBA Express forum - modify as required
If TDelement.className = "alt2" And TDelement.Align = "center" Then
Sheet1.Range("A1").Offset(r, 0).Value = TDelement.innerText
r = r + 1
End If
Next

End Sub
As noted in the code you need to modify it for your particular TD tags, i.e. TDelement.width = "220"

tauterouff
04-18-2013, 01:49 PM
The example you provided works fine but when I try it on an on a company html page (same steps) and edit the search criteria it doesn't list any results.

Here is a part of the html.
<td class="graytext" width="50%">DBA Address</td>


<td width="50%" class="text">7 Dwarfs Ln</td>



<tr>




If I search for "graytext" or "test" and remove your second variable it won't find anything. Could there be another issue?

abhay_547
01-18-2018, 08:44 PM
The example you provided works fine but when I try it on an on a company html page (same steps) and edit the search criteria it doesn't list any results.

Here is a part of the html.
<td class="graytext" width="50%">DBA Address</td>


<td width="50%" class="text">7 Dwarfs Ln</td>



<tr>


If I search for "graytext" or "test" and remove your second variable it won't find anything. Could there be another issue?





I have the lines on the html page which are supposed to be in a table format but they are not and there is no separator between each column data except some undefined spaces as shown in below image, so how can i import it as a proper table .i.e. into separate columns in the excel file, also i have multiple such html files so i can import multiple files into one worksheet.