PDA

View Full Version : WebQuery difficulties



JimmyTheHand
11-21-2007, 01:08 AM
Hi

I'm trying to pull data from the web using webquery. Sometimes it works, sometimes it doesn't.

Background:
Testing laboratory needs to work according to standards. It is required to follow any changes, and use the newest standards available. The standards I need to check periodically are from different sources. There are ASTM, ISO, ISO EN, MSZ (this latter is Hungarian national std.), etc. I need to determine the date (year) of the latest issue.

Problem description:
I experimented with ASTM and ISO websites, and got their seach engines working by updating the connection string with standard numbers as parameters. It works fine, no problem. But when I try to do the same with MSZ site, it just fails. The error message is:

Run-time error '1004':
https://...(URL)
cannot be opened...(etc.)

And here's the code I'm using. (For start, I used constant parameters 'MSZ' and '11721*', in the connection string.)
Sub MSZ_qry()
Dim Conn As String
Dim LDR As Worksheet
Set LDR = ThisWorkbook.Sheets(1)
Conn = "URL;https://www.mszt.hu/mszt/portal/user/anon/page/default.psml/js_panename/msztObjectBrowser?jelzet=MSZ+11721*&tipus=S"
With LDR.QueryTables.Add(Connection:=Conn, Destination:=LDR.Range("A1"))
.BackgroundQuery = True
.TablesOnlyFromHTML = True
.Refresh BackgroundQuery:=False
.SaveData = True
End With
End Sub

The same URL can be opened with internet browser. So what's wrong with the code? How can I overcome this problem?

Thanks in advance for any tips.

Jimmy

rory
11-21-2007, 06:16 AM
Do the other two sites use HTTP or HTTPS? Also, do you have to login or accept a cookie to use the MSZT site?

JimmyTheHand
11-21-2007, 09:27 PM
1. The MSZT site uses HTTPS, both the others use HTTP.
2. I don't need any password or such, to use MSZT site. There is a login option, but you can reach the search engine without using it. I think the anon part of the URL indicates anonym user.
3. I don't know about cookies. How do you check the answer to that question?

I will provide the other two connection string in 2 hours, and also there will be another question with CEN's site.
Thanks for bearing with me. :yes

Jimmy


</EDIT
So, here is the connection string for www.astm.org: (http://www.astm.org:)


StdNo = c.Offset(, 1)
Conn = "URL;http://www.astm.org/verity_hist.cgi?UserQuery=" & StdNo & _
"&SEARCH_COLLECTION=ACTIVE_REDLINE&SEARCH_STATUS=ACTIVE_REDLINE&NS-search-type=Free+text&NS-max-records=1" & _
"&NS-query=" & StdNo & "+AND+NOT+%28STATUS+%3CNEAR%2F1%3E+REPLACED%29&NS-collection=ASTM_ACTIVE_REDLINE" & _
"&NS-collection2=XXX&NS-collection3=YYY&NS-collection4=ZZZ&NS-search-type=boolean"

and the other for www.iso.org: (http://www.iso.org:)

StdNo = c.Offset(, 1)
Conn = "URL;http://www.iso.org/iso/search/extendedsearchstandards.htm?from_js=true" & _
"&type=adv&displayResult=true&published=on&formKeyword=&title=true&formISO_number=" & StdNo & _
"&formPart_number=&formDocument_type=ALL&formSupp_type=&formICS=&formStage_code=&" & _
"current_stage=true&repost=1&formDate_stage=0&formOther_date=&formCommittee=&" & _
"formSub_committee=&formSubmit=Search"

And finally the question related to CEN's site. I want to search for EN standards, too. This is the link to their search engine:

http://www.cen.eu/esearch/extendedsearch.aspx

I do the searching by Standard reference, so that's the field I need. But this site works differently than the others. The others edit/update the URL string, and send it to the server. This I can do, too, after analyzing the updated URL and determining rules of parameter usage. But CEN's search engine, after I submit a form, comes back with exactly the same URL as seen above. How can you utilize this kind of form from Excel?
/>

rory
11-22-2007, 03:00 AM
If the URL doesn't change then I suspect you won't be able to achieve this with WebQuery - you will probably need some way of filling in the form (maybe automating IE) and submitting that then parsing the returned HTML, but that's not something I have any experience of.
Have you tried getting in touch with the site's owners to see if they provide this as a service?

JimmyTheHand
11-22-2007, 07:16 AM
If the URL doesn't change then I suspect you won't be able to achieve this with WebQuery :(


you will probably need some way of filling in the form (maybe automating IE) and submitting that then parsing the returned HTML, but that's not something I have any experience of.
:dunno


Have you tried getting in touch with the site's owners to see if they provide this as a service? Of course they provide such service. For a price. I was asked by my superior to try to solve it for free. So unless I want to spend hours regularly with internet browsing and standard checking, I need to find an automatic solution. I prefer spending hours on writing a good program to spending hours on browsing, anyway. ;)

Jimmy

rory
11-22-2007, 07:23 AM
Can you give us an example of the data you would enter into their search engine (and where, as I can't view the site at work for some reason) and what you need to get back, so we can experiment?

JimmyTheHand
11-23-2007, 12:02 AM
I can't view the site at work for some reason...
Why would that be, I wonder? Nevermind, I'll try to navigate you there.
[...]
I was going to create a guide with screenshot images, but since I can upload only one attachment, I did it in Word doc. See the attachment.

In the meantime I have also learned that the MSZT side, indeed, uses cookies. Once the search engine started to behave strangely, and I had to delete the cookie for it to return to normal.

I hope you can experiment now.

Thanks

Jimmy