PDA

View Full Version : Using Excel VBA to Scrape Web Table Data



gurs
09-19-2016, 10:42 AM
I have a question about scraping information from a web table using Excel VBA. The webpage with the table doesn’t seem to be accessible directly using a URL and instead relies on cookies to display the relevant information. For some reason vbaexpress won't let me submit my post with the URL included, so here is an obfuscated URL (just replace each *** with a period):

refscheduler***net/Soccer/Welcome2***asp

If I am accessing the table using a browser, I use the link above, select “Area 6”, then click List Areas, then select “Area D” and click List Regions, then select “418…” and click Login. On the resulting page I click “View Schedule As Guest”, which takes me to a final page displaying the schedule table (from which I want to extract data).

The site uses the POST method. Based on my responses to the various dropdowns, it is setting the following parameter values:
DBSection = 6
DBArea = 'D'
RegionID = 418

It then requests the results table (somehow using the variables) using the following 3 lines:


<FORM ACTION="rptGameListings.asp" METHOD="POST">
<INPUT TYPE="Submit" VALUE="View Schedule as Guest">
</FORM>

On the results page, all of the data for the table is contained in the HTML. The table begins on row 21 and starts with this code for the header row:


<TABLE BORDER="1" CELLSPACING="2" CELLPADDING="2"> <TR> <TH>Game ID</TH> <TH>Start Date</TH> <TH>Start Time</TH> <TH>Field</TH> <TH>Division</TH> <TH>Home</TH> <TH>Visitor</TH> <TH>Referee</TH> <TH>AR 1</TH> <TH>AR 2</TH> </TR>

The row continues on with a tag and the first row of data, etc. This is the second table in the code for the page, but it doesn’t seem to be defined in any way that would allow me to call it by name. The first table just contains a login link.

So the question is, how can I use VBA to access the table and copy its contents into an array? I already have functioning VBA code that does this for pages that are directly accessible via URL, but I’m not sure how to instruct VBA to access this page.

Thanks for the help!