PDA

View Full Version : Parsing websites



Yamadori
03-18-2017, 11:28 PM
Hello,

I have problems with parsing a specific website . I used the excel function for importing data of a website (Data tab --> Get External Data group --> From Web) which I inserted into a macro.
And here is the problem: It seems that the information that I need is not available the day after I created the macro. I'm looking for soccer odds of a specific country (also only a specific bet offer) but obviously the bet website changes the information of the web address. Means when I run the makro after a while excel does deliver other information that I'm looking for.
Is there anything I can do? Obviously I need another programming language which is able to run through the website, click on the right buttoms and set the right filters e.g., right? I'm looking for a dynamic solution which always delivers the information I need but does find its way to that point of the website each time "new".

Thanks in advance,

Michael

offthelip
03-20-2017, 12:00 PM
without knowing what the web site looks like it is very difficult to help you. In general terms what I have found often works with these sort of sites is to look very carefully at the URL that gives you the correct page, it often is made up of two parts:
www.somename.com/othername?sometext=111&p=rrr&d=19032017
the part of the url which is after the question mark is a set of parameters which determine what is sent back. if you look at the parameters you can often work out how to set these to get the answers you are looking. e.g. todays date, team code or whatever
Best of luck

Yamadori
03-20-2017, 11:35 PM
Hello,

thanks you for your reply. Since I'm new here and need at least 5 posts to post links, I can't show you an example. If you follow the path below you can reach the point where 'm struggling:

--> Go to Tipico(.com) --> press "Football" --> Press "Spain" --> Press "La Liga". Until this path there is a link. That's fine so far. But I am interested in a specific bet Option --> Press "Show bet Options" --> Press "Over/Under Points in match".

18701

18702

Unfortunatley, when chosing this bet options, the link doesn't change, so I still have no direct connection to the bet options I'd like to have.

Thanks in advance,

Michael

offthelip
03-21-2017, 10:24 AM
Hopefully this will get you started, it gets most of the information on the page,. if you want more you will need to change the logic a bit. also i haven't tidied up the output because I wasn't sure quite what you wanted.

Sub test()
Worksheets("sheet1").Select
Range(Cells(1, 1), Cells(12000, 10)) = ""
meta = "<meta itemprop"
result = "result_pk"
Dim str As String
indi = 2
Dim inarr As Variant
Dim outarr As Variant
Set FinHTTP = CreateObject("Microsoft.xmlHTTP")
Dim FinRows, FinCols
urlstring = "https://www.tipico.com/en/online-sports-betting/football/spain/la-liga/g36301/"
FinHTTP.Open "GET", urlstring, False
FinHTTP.send

textmess = FinHTTP.responseText
'MsgBox textmess
FinRows = Split(FinHTTP.responseText, Chr(10))
For j = 0 To UBound(FinRows) - 1

' Cells(j + 1, 1) = FinRows(j)
str = FinRows(j)

If InStr(str, meta) > 0 Or InStr(str, result) > 0 Then
Cells(indi, 1) = FinRows(j)
indi = indi + 1
End If
Next j


End Sub

Yamadori
03-21-2017, 06:59 PM
Hello,

thanks a lot for this code! It helps me to get the Information in a more sorted way than just copying everything (that's how Excel does it when you use the standard function) into a sheet.

I'm still not at the point where the Information I need is. With your code I get the odds for the three possible results of each game (win home team, draw, win away team). I'm looking for the Odds of the bet option "Over/Under points in match". But therefore you first need to click "Show bet options" to select it (that's at the very top of the homepage). Can you see where I'm struggling?

Thanks!

Michael

offthelip
03-22-2017, 03:30 AM
Hi , I don't think my method is going to solve the problem for you, I think the only way I know to do this is to do it by instantiating Internet Explorer within VBA, then running through the elements or classes to find the right name and then sending a click event. You can use my methd to get a complete printout of the web page to allow yo to see what the classes are by removing the if statement. I am sorry but I don't have time to do it for you.

Yamadori
03-23-2017, 12:30 AM
Hey guys,

what I found and adjusted to my needs is:


Sub Click()
Dim ele As Object
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = False
IE.navigate "Tipico_Link"
Set ele = IE.Document.getElementById("Show bet options")
ele.Click
End Sub

Unfortunately I receive the following message: "the method document of iwebbrowser2 failed"
Others were able to solve this Problem by changing Set IE = CreateObject("InternetExplorer.Application") into Set IE = GetObject("new:{D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E}"). For me it doesn't make any difference, it just shows the same error message.

By the way, the HTML-code for my button is:

<span class="">Show bet options</span>

Thanks!

Michael

offthelip
03-23-2017, 08:41 AM
I have done some work on this and I have got half way there, I have now managed to click the "show bet options" Hopefully this will show you that way to get the other half.
Note I added a sheet called data to help in development along with some of the code which is stil in this sub.
Sub Laliga()
Worksheets("data").Select
Range("A1:z2000") = ""




Dim Items As IHTMLElementCollection
Dim Item As IHTMLElement
Dim RowNumber As Long
Dim ItemId As String
Dim QFields As IHTMLElementCollection
Dim QField As IHTMLElement
Exitf = False


Dim tsext As String

Dim ie As Object
Dim HTMLDoc As HTMLDocument

Set ie = CreateObject("InternetExplorer.Application")

ie.navigate "https://www.tipico.com/en/online-sports-betting/football/spain/la-liga/g36301/"

ie.Visible = True

Do While ie.Busy And Not ie.readyState = 4
DoEvents
Loop

DoEvents
Set HTMLDoc = ie.Document
i = 1
Set Items = HTMLDoc.all

For Each Item In Items
Set QFields = Item.all

For Each QField In QFields
i = 1
tsext = QField.innerText
flagt = InStr(tsext, "Show bet options")
flagc = InStr(QField.outerHTML, "more_types onclick")
If flagt = 1 And flagc = 14 Then
Cells(i, 1) = tsext
Cells(i, 2) = flagc
Cells(i, 3) = QField.outerHTML
' MsgBox (flagc)
QField.Click

Exitf = True
Exit For
End If

' MsgBox (tsext)
Next QField
If Exitf Then
Exit For
End If
Next Item




Set Document = Nothing
Set Browser = Nothing

End Sub

offthelip
03-23-2017, 12:10 PM
I am getting there this now opens the Over/Under point in match bits:

Sub Laliga()
Worksheets("data").Select
Range("A1:z2000") = ""




Dim Items As IHTMLElementCollection
Dim Item As IHTMLElement
Dim RowNumber As Long
Dim ItemId As String
Dim QFields As IHTMLElementCollection
Dim QField As IHTMLElement
Exitf = False


Dim tsext As String

Dim ie As Object
Dim HTMLDoc As HTMLDocument

Set ie = CreateObject("InternetExplorer.Application")

ie.navigate "https://www.tipico.com/en/online-sports-betting/football/spain/la-liga/g36301/"

ie.Visible = True

Do While ie.Busy And Not ie.readyState = 4
DoEvents
Loop

DoEvents
Set HTMLDoc = ie.Document
i = 1
Set Items = HTMLDoc.all

For Each Item In Items
Set QFields = Item.all

For Each QField In QFields
i = 1
tsext = QField.innerText
flagt = InStr(tsext, "Show bet options")
flagc = InStr(QField.outerHTML, "more_types onclick")
If flagt = 1 And flagc = 14 Then
' Cells(i, 1) = tsext
' Cells(i, 2) = flagc
' Cells(i, 3) = QField.outerHTML
' MsgBox (flagc)
QField.Click

Exitf = True
Exit For
End If

' MsgBox (tsext)
Next QField
If Exitf Then
Exit For
End If
Next Item


' Find over/under
i = 1


For Each Item In Items
Set QFields = Item.all
Exitf = False
For Each QField In QFields
tsext = QField.innerText
flagt = InStr(tsext, "Over/Under points in Match")

flagc = InStr(QField.outerHTML, "more_types onclick")
If flagt = 1 Then
Cells(i, 1) = tsext
Cells(i, 2) = flagt
Cells(i, 3) = QField.outerHTML
' MsgBox (flagc)
QField.Click
i = i + 1

Exitf = True
Exit For
End If

' MsgBox (tsext)
Next QField
If Exitf Then
Exit For
End If
Next Item






Set Document = Nothing
Set Browser = Nothing
End Sub

Yamadori
03-25-2017, 02:43 AM
Hello,

impressing, thanks! It took me a while to understand but I finally made it.

I tried your code and have even been able to check if I'm on the right page by running through the items and searching for a specific team (which was fortunately found).

Now I have the next problem: I don't know how to get the right information on the page. Of course I could put all football teams into an array and check if the string of the current item is equal to the strings in the array. But that wouldn't help me to find the bets and odds since they change of course. I tried to find the height of the found team (example: "Real Madrid", height: 144) in order to get the bets which are right to the teams. It didn't work out for me.
Another approach was to use you first suggestion (using the Microsoft.xmlHTTP Object) but from my point of view I have the same problem like at the beginning since creating this object requires a URL which is not correct in my case because it doesn't deliver the right information before clicking two buttons. Is there any way to transfer the items I already have into the FinHTTP-object?

I don't know how to deal with that, I need the information (games, bets, odds) listed like in your first approach.

Here's the code (only the last few rows are "new"):


Sub Laliga()

Worksheets("Odds").Select
range("A1:Z2000") = ""

Dim Items As IHTMLElementCollection
Dim Item As IHTMLElement
Dim QFields As IHTMLElementCollection
Dim QField As IHTMLElement
Dim HTMLDoc As HTMLDocument

meta = "<meta itemdrop"
Result = "result_pk"
Dim str As String
indi = 2
Dim inarr As Variant
Dim outarr As Variant
Set FinHTTP = CreateObject("Microsoft.xmlHTTP")
Dim FinRows, FinCols

Dim RowNumber As Long
Dim ItemId As String

Exitf = False
Dim tsext As String

Dim ie As Object 'ie --> InternetExplorer als Objekt einrichten und Seite öffnen
Set ie = CreateObject("InternetExplorer.Application")
ie.navigate "LINK" (still not able to post Links here)
ie.Visible = True

Do While ie.Busy And Not ie.readyState = 4
DoEvents
Loop

DoEvents
Set HTMLDoc = ie.document 'Internet Explorer als HTML-Dokument darstellen
i = 1
Set Items = HTMLDoc.all 'IHTML-Elemente mit HTML-Elementen füllen

For Each Item In Items 'Find the button "Show bet options"
Set QFields = Item.all

For Each QField In QFields
i = 1
tsext = QField.innerText
flagt = InStr(tsext, "Show bet options")
flagc = InStr(QField.outerHTML, "more_types onclick")
If flagt = 1 And flagc = 14 Then

QField.Click

Exitf = True
Exit For
End If

Next QField
If Exitf Then
Exit For
End If
Next Item


i = 1
For Each Item In Items 'Find the button "Over/Under points in Match"
Set QFields = Item.all 'Alle Items der Seite "Over/Under points in Match" QFields zuordnen
Exitf = False

For Each QField In QFields
tsext = QField.innerText
flagt = InStr(tsext, "Over/Under points in Match")
flagc = InStr(QField.outerHTML, "more_types onclick")

If flagt = 1 Then '--> button found

QField.Click
i = i + 1

Exitf = True
Exit For
End If

Next QField
If Exitf Then
Exit For
End If
Next Item

i = 1
'Added code
FinHTTP = Items 'Copy the information into Excel
FinRows = Split(FinHTTP.responseText, Chr(10))

For j = 0 To UBound(FinRows) - 1

str = FinRows(j)

If InStr(str, meta) > 0 Or InStr(str, Result) > 0 Then
Cells(indi, 1) = FinRows(j)
indi = indi + 1
End If

Next j

Set document = Nothing
Set Browser = Nothing

End Sub

Thanks for helping me so much!

snb
03-25-2017, 04:21 AM
Please use code tags !

Yamadori
03-26-2017, 06:49 AM
Hey there,

I made it by using the classNames. It's still some work to do in the Excel-file but that's not a big deal.




For Each QField In QFields
tsext = QField.innerText
flagt = InStr(tsext, "Over/Under points in Match")
flagc = InStr(QField.outerHTML, "more_types onclick")

If QField.className = "multi_row" And InStr(tsext, "(1.") = 1 Then 'This gives me the bet options including bets
Cells(j, 1) = tsext
j = j + 1
If j = 40 Then
Exitf = True
Exit Sub
End If
End If

If QField.className = "e_active t_row jq-event-row-cont" Then 'This gives me the date of the match and the teams
Cells(i, 2) = tsext
i = i + 1
If i = 40 Then
Exitf = True
Exit For
End If
End If

Next QField
If Exitf Then
Exit For
End If
Next Item



Thank you!