How to do web scraping from java based website



JackkG
01-08-2015, 04:31 PM
Hello all,


I need to pull a report from a Java-based website into Excel. The website asks for a start date and an end date and generates the report based on the dates supplied. The problem is that the page source doesn't show any details I could use to pull the report through VBA. Is there any way to do this?

Any help on this will be great.



Sorry, I'm not able to attach the source file here...


Thanks!


JayeshG

jpo645
01-08-2015, 07:52 PM
Fill in all the information, but before hitting Submit, press F12 to bring up the browser's developer tools (the names and icons differ from browser to browser) and look for the Network tab. Then submit the form. If the site is sending a request to a server, you should be able to capture it here, and you should also be able to see the response. You need to figure out whether it's using REST or SOAP. Once you know that, you can attempt to make the same server request and capture the response in VBA.
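
One rough way to tell the two apart from what the Network tab shows is to look at the request's Content-Type header and body rather than the HTTP method, since both REST and SOAP calls are commonly sent as POST. Below is a minimal Python sketch of that check. The function name `guess_api_style` and the sample values are made up for illustration, and it is only a heuristic; the same logic could be written in VBA.

```python
def guess_api_style(content_type, body):
    """Guess what kind of request was captured in the browser's network tab."""
    ct = content_type.lower()
    text = body.lower()
    # SOAP 1.2 uses application/soap+xml; SOAP 1.1 uses text/xml.
    # In both cases the body is an XML document whose root is an Envelope.
    if "soap+xml" in ct or ("xml" in ct and "envelope" in text):
        return "soap"
    if "xml" in ct:
        return "xml"   # plain XML over HTTP, not necessarily SOAP
    if "json" in ct:
        return "json"  # typical for REST-style endpoints
    if "x-www-form-urlencoded" in ct:
        return "form"  # an ordinary HTML form post
    return "unknown"


# Hypothetical values, as they might be copied from the network tab.
print(guess_api_style("application/x-www-form-urlencoded",
                      "startDate=2015-01-01&endDate=2015-01-08"))
```

If the result is "form" or "json", the report is most likely not coming from a SOAP service, and the request can be reproduced by sending the same fields to the same URL.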

JackkG
01-09-2015, 07:55 AM
Hi jpo645,

Thanks for your reply. Let me try this out. Will get back on this.

JackkG
01-09-2015, 08:11 AM
Hi Jpo,

It uses SOAP, as it's using the POST method. So how do I go about this now? What info will I need to get the report?

JackkG
01-09-2015, 08:24 AM
Hi,

Could I get some sample code for this? Or does the code work the same way as for normal site scraping?

Thanks!

jpo645
01-09-2015, 12:05 PM
Well, this part is more art than science. Hopefully, in the POST request, you can see the breakdown of the request to know what you should send in. Usually, the values are in the URL or in the payload. Alternatively, if you are familiar with the system you are requesting information from, there may be API documentation that explains exactly what's required in the request. If it's SOAP, that means the response is going to come back as XML. This is good news because you can scrape the XML response as you would HTML (indeed, arguably you can scrape it more easily).

The problem is, SOAP requests are a little old. Here are a few conversations about it that I've found:

http://community.spiceworks.com/topic/478128-soap-request-via-vba-in-excel
http://scn.sap.com/community/epm/blog/2012/08/10/how-to-invoke-a-soap-web-service-from-custom-vba-code
http://www.soapuser.com/client4.html

But outside of that, I couldn't give you any sample code.
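
As a rough illustration of the flow described above (put the two dates in the payload, POST it, then read the XML that comes back), here is a minimal Python sketch. It is not VBA and it is not this site's actual API: the endpoint URL, namespace, SOAPAction, and the element names `GetReport`, `StartDate`, `EndDate`, and `Row` are all hypothetical placeholders that would have to be replaced with whatever the captured request and response really contain.

```python
import urllib.request
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# All three values are hypothetical; copy the real ones from the network tab.
ENDPOINT = "https://example.com/reports/service"
NAMESPACE = "http://example.com/reports"
SOAP_ACTION = "http://example.com/reports/GetReport"


def build_request(start_date, end_date):
    """Build (but do not send) a SOAP 1.1 POST carrying the two dates."""
    envelope = (
        '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
        "<soap:Body>"
        f'<GetReport xmlns="{NAMESPACE}">'
        f"<StartDate>{escape(start_date)}</StartDate>"
        f"<EndDate>{escape(end_date)}</EndDate>"
        "</GetReport>"
        "</soap:Body>"
        "</soap:Envelope>"
    )
    return urllib.request.Request(
        ENDPOINT,
        data=envelope.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8",
                 "SOAPAction": SOAP_ACTION},
        method="POST",
    )


def parse_rows(xml_text):
    """Turn each (hypothetical) <Row> element into a dict of tag -> text."""
    root = ET.fromstring(xml_text)
    rows = []
    for row in root.iter(f"{{{NAMESPACE}}}Row"):
        # ElementTree prefixes tags with "{namespace}"; strip that part.
        rows.append({child.tag.split("}", 1)[-1]: (child.text or "")
                     for child in row})
    return rows


# A made-up response, shaped the way a SOAP reply might look.
sample_response = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    '<soap:Body><GetReportResponse xmlns="http://example.com/reports">'
    "<Row><Date>2015-01-08</Date><Total>42</Total></Row>"
    "</GetReportResponse></soap:Body></soap:Envelope>"
)
rows = parse_rows(sample_response)
```

To actually send the request, the object returned by `build_request` can be passed to `urllib.request.urlopen` and the response body handed to `parse_rows`. The VBA equivalent follows the same three steps: build the XML string, POST it with the same headers, and load the response text into an XML parser.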

JackkG
01-09-2015, 02:05 PM
Okay kool. Thanks Jpo. Will check it out!