PDA

View Full Version : Solved: VBA to post data to a servelet?



storfar
09-21-2010, 08:36 PM
I'm trying to get census data from the US census website a little more efficently so I need to pass an addresses to it and scrape the results. I've got the scrape piece done but I can't figure out how to send a new address to the site.

I've read a few other forums about posting to an asp but I can't see any asp info in the source data. It appears they are using a servelet so I'm not sure how to get around that with VBA.

Any help would be appreciated. Thanks

Bob Phillips
09-22-2010, 01:42 AM
Post your code and indicate how and where you would want to change the new URL and an example.

storfar
09-22-2010, 01:03 PM
I guess the problem is that the URL doesn't change. Normally I just change the inputs to the URL, strip out the data I want, proceed to the next URL and so on.

On this census site (I don't have a high enough rating on this site to post urls so hopefully it will come through below) adding an address does not change the url so I can't figure out how to submit my 2nd, 3rd, etc addresses.

Thanks

strWebsite = "factfinder.census.gov/servlet/AGSGeoAddressServlet?_lang=en&_programYear=50&_treeId=420"
Set msXML = CreateObject("microsoft.XMLHTTP")
msXML.Open "GET", strWebsite, False
msXML.SetRequestHeader "Content-type", "text/xml"
msXML.Send
strPageContent = msXML.responseText

Bob Phillips
09-22-2010, 03:30 PM
I get an error with that code, but I'll take a punt that you want to run it for various year/treeId combinations, something like this



Public Sub Test()

MsgBox GetWebData(50, 420)
MsgBox GetWebData(51, 480)
End Sub

Public Function GetWebData(year As Long, treeId As Long) As String

strWebsite = "factfinder.census.gov/servlet/AGSGeoAddressServlet?_lang=en&_programYear=" & year & "&_treeId=" & treeId
Set msXML = CreateObject("microsoft.XMLHTTP")
msXML.Open "GET", strWebsite, False
msXML.SetRequestHeader "Content-type", "text/xml"
msXML.Send
GetWebData = msXML.responseText
End Function

storfar
09-22-2010, 04:27 PM
What you have done is what I normally do to get data from a website where the URL changes. However, in this instance, the website URL is generic and the data on the underlying page changes.

The piece that I can't figure out is how to automatically send addresses to the site given that I can't do it via the URL. Does that make sense? If you add a "http" to the address below you will get to the site in question.

factfinder.census.gov/servlet/AGSGeoAddressServlet?_lang=en&_programYear=50&_treeId=420

If you put a US mailing address in the boxes, then hit go, it will return a bunch of data options with the new URL being below (again, add an "http" to it, I can't because of restrictins on this site). Both these URLs are generic and will be displayed regardless of what address you submit.

factfinder.census.gov/servlet/AGSGeoAddressServlet

Bob Phillips
09-22-2010, 04:56 PM
I surprise myself at times

http://factfinder.census.gov/servlet/AGSGeoAddressServlet?_lang=en&_programYear=50&_treeId=420&Street=123%20Main%20Streeet&City=Savannah&State=Georgia

storfar
09-22-2010, 05:22 PM
That link just takes me to the search page without giving any actual results for the Savannah Georgia address that is within the URL. I'm assuming you plugged in an address and then tried to manipulate the URL. The site appears to cache your most recent search and bring that data up on the initial load screen so clear your browser cache and then try the link. I tried it in Chrome and IE and got an empty search screen each time.

Shred Dude
09-23-2010, 11:09 AM
What about the full URL that is returned?

Like this one, for The White House at 1600 Pennsylvania Avenue, Washington, DC...



http://factfinder.census.gov/servlet/AGSGeoAddressServlet?_MapEvent=&_category=&_subcategory=&_stateSelectedFromDropDown=District+of+Columbia&context=ags&programYear=50:420&street=1600+Pennsylvania+Avenue&city=Washington&states=District+of+Columbia&zip=20502&geo_id=null&_programYear=50&_treeId=420&_lang=en&tree_id=420&bucket_id=

storfar
09-23-2010, 11:19 AM
That seems to work. Thanks!