Importing Data From Web Page



edneco
12-13-2013, 11:44 AM
Hello again. I need to extract 3 tables from a web page, but I can't do it. I'm using VBA code to import some tables from the web, created by recording macros. That is the only way I can "write" code.

This is the page: http://www.stat-football.com/en/t/eng10.php?c=25&ht=h1

As you can see, the table is "1st half total goals 1.5", Home and Away. But I need the "1st half total goals 0.5" table, Home and Away.

The problem is that the 3 tables I want to import have to be selected by choosing them in a list and clicking a submit button, and with a recorded macro these steps don't appear in the code.

So I asked for help, and the person who helped me wrote some code for me, but it doesn't work. Can you guys help me with this?



Public HTMLdoc As Object
Public PageSrc As String

Function GetText(ByRef objHTML As Object, ByRef TextOut As String) As String

    Dim flag As Boolean
    Dim i As Long

    ' Returns the text from HTML elements using recursion.
    If objHTML.HasChildNodes = True Then

        For i = 0 To objHTML.ChildNodes.Length - 1
            With objHTML.ChildNodes(i)
                ' Remember a <br> so the next text node starts on a new line.
                If UCase(.nodeName) = "BR" Then flag = True

                Select Case .NodeType
                    Case 1 'Element
                        DoEvents

                        ' Check if element contains any nodes with text.
                        If .HasChildNodes Then
                            Call GetText(objHTML.ChildNodes(i), TextOut)
                        End If

                    Case 3 'Text
                        If .NodeValue <> "" Then
                            If flag = True Then TextOut = TextOut & vbLf
                            TextOut = TextOut & .NodeValue
                            ' Reset only after the line break has been used. Resetting on
                            ' every pass cleared the flag before any text node could see it.
                            flag = False
                        End If
                End Select
            End With
        Next i

    End If

    ' Return the elements' text, with a line feed wherever a <br> preceded the text.
    GetText = TextOut

End Function

Sub OpenWebPage(ByVal URL As String)

    PageSrc = ""

    With CreateObject("MSXML2.XMLHTTP")
        ' Asynchronous request; the loop below waits until it completes.
        .Open "GET", URL, True
        .Send

        ' readyState 4 = request finished.
        While .readyState <> 4: DoEvents: Wend

        ' Check for any connection errors (200 = HTTP OK).
        If .Status <> 200 Then
            MsgBox "ERROR: " & .Status & " - " & .statusText, vbExclamation
            Exit Sub
        End If

        PageSrc = .ResponseText
    End With

    ' Create an empty HTML Document and load it with the Page Source.
    Set HTMLdoc = CreateObject("htmlfile")
    HTMLdoc.body.innerHTML = PageSrc

End Sub

Sub ListGameData()

    Dim oButton As Object
    Dim oDiv As Object
    Dim oHeader As Object
    Dim oItem As Object
    Dim oRow As Object
    Dim oSelect As Object
    Dim oTable As Object
    Dim Rng As Range
    Dim Text As String
    Dim Wks As Worksheet
    Dim n As Long, r As Long, c As Long

    ' Folha2 is the worksheet's code name in this workbook.
    Set Wks = Folha2
    Set Rng = Wks.Range("E5:Q25, S5:AE25, AG5:AS25")

    OpenWebPage "http://www.stat-football.com/en/t/eng10.php?c=25&ht=h1"

    ' Find the drop-down list named "tot".
    For Each oItem In HTMLdoc.getElementsByTagName("select")
        If oItem.Name = "tot" Then
            Set oSelect = oItem
            Exit For
        End If
    Next oItem

    ' Find the submit button labelled " >> ".
    For Each oItem In HTMLdoc.getElementsByTagName("input")
        If oItem.Type = "submit" And oItem.Value = " >> " Then
            Set oButton = oItem
            Exit For
        End If
    Next oItem

    Set oDiv = HTMLdoc.getElementById("y0")
    Set oHeader = oDiv.ChildNodes(0)
    Set oTable = HTMLdoc.getElementById("tb01")

    For n = 1 To 3
        oSelect.selectedIndex = n - 1
        oButton.Click ' <--- This causes an error

        ' Copy every cell of the table into the n-th output area.
        For r = 0 To oTable.Rows.Length - 1
            For c = 0 To oTable.Rows(r).Cells.Length - 1
                Text = ""
                Rng.Areas(n).Item(r + 1, c + 1) = GetText(oTable.Rows(r).Cells(c), Text)
            Next c
        Next r

        Rng.Areas(n).Columns.AutoFit
    Next n

End Sub



Thank you in advance for any help with this.

Best regards, Eduardo

SamT
12-13-2013, 01:58 PM
That is a pretty generic Thread Title.

I think I will change it.

edneco
12-14-2013, 04:53 AM
Thank you, SamT. In fact, that title is used many times.

Below, in your post, you said "Can you explain what are you trying to do, and not how you think you want to do it?"

Following that sentence: what I want is to extract the data from 3 tables at that link. By default, when we open the link, the page shows the "1st half total goals 1.5" tables, Home and Away.

But the 3 tables that I really need are "1st half total goals 0.5", Home and Away.

I need help to resolve the problem in the code above, or a new version that works.

Can anyone help with this issue? I really need these tables because I am doing some research for my graduation work.

Again, thank you very much for any help you can give.

SamT
12-14-2013, 10:17 AM
Link = http://www.stat-football.com/en/t/eng10.php?c=25&ht=h1

Source code analysis:

Tables (JS link name first, then the active table caption in quotes):
OverAll: "1st half total goals 1.5"
<table class="tar" id="tb01" style="border:1px solid #88c3c3;" cellpadding="1">
Home: "home 1HT total goals 1.5"
<table class="tar" id="tb02" style="border:1px solid #88c3c3;" cellpadding="1">
Away: "away 1HT total goals 1.5"
<table class="tar" id="tb03" style="border:1px solid #88c3c3;" cellpadding="1">

All three tables are available in the source code at the same time. Unfortunately, they are complex layers of un-ID'ed tables.
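
Because the three tables sit in the source together, they can in principle be read by id with no button click. The following is only a minimal sketch under assumptions: it reuses OpenWebPage and HTMLdoc from the first post, writes to the active sheet starting at E5 with 14 columns per table (an illustrative layout), and reads whatever tables the page serves by default, so it does not select the "0.5" option. With nested tables, each outer cell's innerText will include the text of any tables inside it.

```vba
' Sketch only: copies the tables with ids tb01, tb02 and tb03 to the active sheet.
' Assumes OpenWebPage and the public HTMLdoc variable from the first post.
Sub DumpTablesById()

    Dim ids As Variant
    Dim k As Long, r As Long, c As Long
    Dim oTable As Object
    Dim Wks As Worksheet

    Set Wks = ActiveSheet   ' assumption: output goes to the active sheet
    ids = Array("tb01", "tb02", "tb03")

    OpenWebPage "http://www.stat-football.com/en/t/eng10.php?c=25&ht=h1"
    If HTMLdoc Is Nothing Then Exit Sub

    For k = LBound(ids) To UBound(ids)
        Set oTable = HTMLdoc.getElementById(CStr(ids(k)))

        If Not oTable Is Nothing Then
            For r = 0 To oTable.Rows.Length - 1
                For c = 0 To oTable.Rows(r).Cells.Length - 1
                    ' Row 5, column E (5) is the start; each table is offset by 14 columns.
                    Wks.Cells(5 + r, 5 + 14 * (k - LBound(ids)) + c).Value = _
                        oTable.Rows(r).Cells(c).innerText
                Next c
            Next r
        End If
    Next k

End Sub
```

If the output looks jumbled, that is most likely the nesting showing through, and the table structure needs to be mapped first.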

I looked at another data table page from that site and they are not all in the same format.

I am attaching a text file with the HTML code of those three tables (3.4 KB; unzips to 66 KB). I am only scanning the source code of that page, not doing any in-depth reading, so I may have more or less than all the code involved.

I suggest that you create a visual of one of them showing the "physical" layout and the class and id of each table. Be sure to somehow differentiate td's from enclosed tables.
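
One way to get such a visual without drawing it by hand is to print the nesting to the Immediate window. This is a rough sketch, not a finished tool: PrintTableOutline and ShowOutline are hypothetical names, and it assumes OpenWebPage and HTMLdoc from the first post have loaded the page. Each TABLE or TD is printed with its id and class, indented two spaces per nesting level.

```vba
' Sketch only: prints every TABLE and TD below a node, indented by nesting depth.
Sub PrintTableOutline(ByVal node As Object, ByVal depth As Long)

    Dim i As Long
    Dim tagName As String

    ' Only element nodes (NodeType 1) carry a tag name, id and class.
    If node.NodeType = 1 Then
        tagName = UCase$(node.nodeName)
        If tagName = "TABLE" Or tagName = "TD" Then
            Debug.Print String$(depth * 2, " ") & tagName & _
                        " id=""" & node.ID & """ class=""" & node.className & """"
            ' depth is passed ByVal, so this only affects this node's children.
            depth = depth + 1
        End If
    End If

    If node.HasChildNodes Then
        For i = 0 To node.ChildNodes.Length - 1
            PrintTableOutline node.ChildNodes(i), depth
        Next i
    End If

End Sub

Sub ShowOutline()

    Dim oTable As Object

    OpenWebPage "http://www.stat-football.com/en/t/eng10.php?c=25&ht=h1"
    If HTMLdoc Is Nothing Then Exit Sub

    Set oTable = HTMLdoc.getElementById("tb01")
    If Not oTable Is Nothing Then PrintTableOutline oTable, 0

End Sub
```

Run ShowOutline and open the Immediate window (Ctrl+G in the VBA editor) to see the outline for tb01; change the id to inspect tb02 or tb03.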

Without knowledge of Data Structure, programming is impossible.

edneco
12-16-2013, 12:50 PM
SamT, thank you very much for taking the time to look at this problem.

But unfortunately I can't help with what you are asking for. Like I said, the only code that I have "written" is from recording macros in Excel.

I'm too much of a noob with VBA. Sorry.

If you or anyone else can do this for me, I would be very grateful.