PDA

View Full Version : [SOLVED] hrefs from an ordered list <ol> on a webpage in Internet Explorer



tgibbs
01-02-2014, 03:40 PM
Microsoft Excel 2010, Internet Explorer 9, on a Windows 7 computer:

I am trying to pull the hyperlink address and jump text from a very simple website that consists of an ordered list <ol> contaning company name(this is a hyperlink that links to their website) and if they are hiring(hyperlink to companies website). Then it would output the links into an empty worksheet.

Here is the web page and the source code:
NOTE: number 4 on the list, 20x200 has 2 links in the source code "20x200" is a hyperlink and "(hiring)" is also a hyperlink
11030

I'm creating a macro that will:
1. Access the webpage from excel (Solved)
2. Import the ordered list into an array (Biggest Problem)
3. Parse the href element and it's target info(1 href if they aren't hiring, 2 if they are hiring)
4. output the links into a generic sheet,
A. Column A would have all the companies names irrespective if they are hiring or not.
B. Column B would have only the companies that are hiring.


Here is the Macro so far:
Sub findCompanyLinks()


Dim ie As Object
Set ie = CreateObject("InternetExplorer.Application")
With ie
ie.Visible = False
ie.Navigate URL
' Don't show window
ie.Visible = False


'Wait until IE is done loading page
Do While ie.Busy
Application.StatusBar = "Downloading information, lease wait..."
DoEvents
Loop


' Make a string from IE content
Set mdoc = ie.Document
hrefdata = mdoc.getElementsByTagName("li")
'hrefData = mdoc.links()

End With


End Sub


Thanks in advance for your help. Really the main question I have is how to access the href and target info for each list item and pull it into an array of some sort. Then I am confident I can export the array into excel.

westconn1
01-03-2014, 12:03 AM
try like

rw = 1 'first row for data
for each ele in ie.document.getelementsbytagname("a")
col = 1
if ele.innertext = "(hiring)" then col = 2
sheets("mysheet").cells(rw,col).value = ele.href
rw = rw + 1
next i have not tested this on a website and it may pick up any additional links at the bottom of page, you could test for the elements parent to prevent extra links getting added
change sheetname to suit, or use a worksheet object

tgibbs
01-03-2014, 06:09 PM
Awesome! This was exactly what I needed.

Thanks westconn1!

harshil042
01-30-2014, 07:36 AM
Hi all,

I am using VBA to navigate through the website - this is the coding

Sub newportcarreg()

Dim lot As String

Set objie = CreateObject("internetExplorer.application")

With objie
.Visible = False
.navigate (not able to post the link but the code to the button is at the bottom - also you could type in newport auction in gogle and you will get the link)

Do While .busy Or _
.readystate <> 4
DoEvents
Loop

Do Until ActiveCell.Value = ""
lot = ActiveCell.Value
Sheet1.Range("a1").End(xlDown).Offset(1, 0).Value = .document.getElementById(lot).innertext
ActiveCell.Offset(1, 0).Select
Loop

End With
'
End Sub


With this code I am able to go to the website and pull the information from the 1st page. But I am unable to navigate to page 2 and pull the data from page 2 because the button on the website is not exactly a button that I can use a click command in VBA. Its a href tag with java scritp - <a href="javascript:__doPostBack('GridView1','Page$2')">2</a>

Can any1 help me and let me how I can navigate to page 2 through VBA?


Thank you
Harshil