cfernand74
07-05-2014, 01:23 PM
i am trying to eliminate some steps i have to do everyday, which is extract data from different city records websites.
i know how to extract tables to excel and manipulate the data once there, but i figured there must be a way to extract the specific data without extracting all the data. I don't mind if it is held in memory. I just don't want to extract all table data to a sheet.
Simply i want to search for the words "Finished Attic" that is located on a website table and extract the info in a cell that is three cells to the right of it which is "293".
There are many tables on this website but the one we will be working with is table 8
Code
Description
Gross
Area
Living
Area
BAS
First Floor
975
975
FUS
Finished Upper Story
975
975
FAT
Finished Attic
975
293
BAY
Bay Window
54
0
STP
Stoop
65
0
UBM
Basement
1002
0
4046
2243
I know how to extract the specific cell info "293" and i would do this, but the issue is that "Finished Attic" position moves up or down depending on the property criteria. In this case it is in the 5th row. The word its self "Finished Attic" is always the same(constant) and obviously the numbers i need to extract changes(two cells to the right of "Finished Attic"). And this goes for lots of other terms i need to extract, but i just need a sample code for one and i can take it from there.
The difference between were the info lies is ("td")(1) for "Finished Attic and ("td")(3) for "293"
i can get to "Finished Attic" with this code
Range("A26").Value = IE.document.forms("form1").getElementsByTagName("table")(8).getElementsByTagName("tr")(3).getElementsByTagName("td")(1).innerText
i can get to "293" with this code
Range("A22").Value = IE.document.forms("form1").getElementsByTagName("table")(8).getElementsByTagName("tr")(3).getElementsByTagName("td")(3).innerText
Know what i would like to do is search the table on the website for the key word "Finished Attic" wherever it is located on the table and move two cells to the right and extract that data. Which in this case would be "293".
This is the code i have setup to get me to the website and try grab info. Please edit as necessary.
Sub TaxAssessors()
Dim oElement As Object
'to refer to the running copy of Internet Explorer
Dim IE As InternetExplorer
'to refer to the HTML document returned
Dim html As HTMLDocument
'open Internet Explorer in memory, and go to website
Set IE = New InternetExplorer
IE.Visible = False
IE.navigate "http://gis.vgsi.com/providenceri/Parcel.aspx?Pid=13212"
'Wait until IE is done loading page
Do While IE.readyState <> READYSTATE_COMPLETE
Application.StatusBar = "Trying to go to Tax Assessors ..."
DoEvents
Loop
'--------------------------------Code Starts here
'------Test to extract word "Finished Attic" just to see if in right location
Range("A1").Value = IE.document.forms("form1").getElementsByTagName("table")(8).getElementsByTagName("tr")(3).getElementsByTagName("td")(1).innerText
'------Extract Finished Attic sq/ft (data i am after)
Range("A2").Value = IE.document.forms("form1").getElementsByTagName("table")(8).getElementsByTagName("tr")(3).getElementsByTagName("td")(3).innerText
'--------------------------------Code ends here
IE.Quit
End Sub
What i was thinking is to search all the tables in the website page for the word "Finished Attic" once found then select that table and row. In prior code I think of ("tr")(3) as the row and ("td")(3) as the column. Obviously these position change, so maybe ("tr")(x) is identified then add ("td")(3) to it to get value i am after.
What a puzzle?
i know how to extract tables to excel and manipulate the data once there, but i figured there must be a way to extract the specific data without extracting all the data. I don't mind if it is held in memory. I just don't want to extract all table data to a sheet.
Simply i want to search for the words "Finished Attic" that is located on a website table and extract the info in a cell that is three cells to the right of it which is "293".
There are many tables on this website but the one we will be working with is table 8
Code
Description
Gross
Area
Living
Area
BAS
First Floor
975
975
FUS
Finished Upper Story
975
975
FAT
Finished Attic
975
293
BAY
Bay Window
54
0
STP
Stoop
65
0
UBM
Basement
1002
0
4046
2243
I know how to extract the specific cell info "293" and i would do this, but the issue is that "Finished Attic" position moves up or down depending on the property criteria. In this case it is in the 5th row. The word its self "Finished Attic" is always the same(constant) and obviously the numbers i need to extract changes(two cells to the right of "Finished Attic"). And this goes for lots of other terms i need to extract, but i just need a sample code for one and i can take it from there.
The difference between were the info lies is ("td")(1) for "Finished Attic and ("td")(3) for "293"
i can get to "Finished Attic" with this code
Range("A26").Value = IE.document.forms("form1").getElementsByTagName("table")(8).getElementsByTagName("tr")(3).getElementsByTagName("td")(1).innerText
i can get to "293" with this code
Range("A22").Value = IE.document.forms("form1").getElementsByTagName("table")(8).getElementsByTagName("tr")(3).getElementsByTagName("td")(3).innerText
Know what i would like to do is search the table on the website for the key word "Finished Attic" wherever it is located on the table and move two cells to the right and extract that data. Which in this case would be "293".
This is the code i have setup to get me to the website and try grab info. Please edit as necessary.
Sub TaxAssessors()
Dim oElement As Object
'to refer to the running copy of Internet Explorer
Dim IE As InternetExplorer
'to refer to the HTML document returned
Dim html As HTMLDocument
'open Internet Explorer in memory, and go to website
Set IE = New InternetExplorer
IE.Visible = False
IE.navigate "http://gis.vgsi.com/providenceri/Parcel.aspx?Pid=13212"
'Wait until IE is done loading page
Do While IE.readyState <> READYSTATE_COMPLETE
Application.StatusBar = "Trying to go to Tax Assessors ..."
DoEvents
Loop
'--------------------------------Code Starts here
'------Test to extract word "Finished Attic" just to see if in right location
Range("A1").Value = IE.document.forms("form1").getElementsByTagName("table")(8).getElementsByTagName("tr")(3).getElementsByTagName("td")(1).innerText
'------Extract Finished Attic sq/ft (data i am after)
Range("A2").Value = IE.document.forms("form1").getElementsByTagName("table")(8).getElementsByTagName("tr")(3).getElementsByTagName("td")(3).innerText
'--------------------------------Code ends here
IE.Quit
End Sub
What i was thinking is to search all the tables in the website page for the word "Finished Attic" once found then select that table and row. In prior code I think of ("tr")(3) as the row and ("td")(3) as the column. Obviously these position change, so maybe ("tr")(x) is identified then add ("td")(3) to it to get value i am after.
What a puzzle?