PDA

View Full Version : VBA IE Scrape data to excel



totemos
02-06-2021, 03:30 AM
Hello,

Im running a project where VBA is reading values from a column A (Row by row) and with a web page request i want to return some data to excel.



A




WG550227465GR







WG550227***GR







This is automated with the below code


Sub Query()

Dim x As Integer
Dim i As Integer
NumRows = Range("A1", Range("A1").End(xlDown)).Rows.Count
Range("A1").Select
' Establish "For" loop to loop "numrows" number of times.
For x = 2 To NumRows + 1
'If Len(Range("a" & x)) <> 13 Then
'MsgBox "wrong tracking code"
'Exit Sub
'End If
Dim Xrange As String
Dim t As Integer
Xrange = Trim(Range("A" & x).Value)

Dim ie As New InternetExplorer
t = 2


'web scrape + cell value
2 Application.wait (Now + TimeValue("0:00:002"))
ie.navigate "https://www.elta-courier.gr/search?br=" & Xrange
If Xrange = Mid(Xrange, 1, 2) = "pd" Then t = t - 2
Do While ie.Busy Or ie.readyState <> 4: DoEvents: Loop 'Until ie.readyState = READYSTATE_COMPLETE
Dim Doc As HTMLDocument
Set Doc = ie.document


Dim sDD As String
Dim sDD1 As String
Dim sDD2 As String

'check if page has loaded with data, counting Elements number.
i = Doc.getElementsByTagName("td").Length
If i = 0 Then GoTo 2

sDD = Trim(Doc.getElementsByTagName("TD")(t + 2).innerText)
sDD1 = Trim(Doc.getElementsByTagName("TD")(t).innerText)
sDD2 = Trim(Doc.getElementsByTagName("TD")(t + 1).innerText)

' write values to excel
ie.Quit
Dim aDD As Variant
aDD = Split(sDD, ",")
Range("C" & x).Value = aDD
Range("D" & x).Value = sDD1
Range("E" & x).Value = sDD2

1 Next
Application.ScreenUpdating = True


ie.Quit
End Sub






Data requested to be returned is the second row from the attached picture.

27881


PROBLEM
1) I get an error "The object invoked has disconected from its clients
2) It takes to long to read 500 records and parse and receive the data back to excel.

Can someone check what im doing wrong?

YasserKhalil
02-06-2021, 05:41 AM
Try this code


Sub GetData_Test()
GetData "WG550227465GR", "Sheet1", 1, 2
End Sub


Sub GetData(ByVal sID As String, ByVal wsName As String, ByVal lRow As Long, ByVal lCol As Long)
Const sURL As String = "https://www.elta-courier.gr/track.php"
Dim a, http As MSXML2.XMLHTTP60, col As Collection, json As Object, postData As String, sResp As String, i As Long
Set http = New MSXML2.XMLHTTP60
postData = "number=" & sID
With http
.Open "POST", sURL, False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
.send postData
sResp = .responseText
End With
Set json = JSONConverter.ParseJson(sResp)
Set col = json("result")(sID)("result")
ReDim a(1 To col.Count, 1 To 1)
For i = 1 To col.Count
a(i, 1) = col.item(i)("status")
Next i
For i = LBound(a) To UBound(a)
ThisWorkbook.Worksheets(wsName).Cells(i + lRow - 1, lCol).Value = a(i, 1)
Next i
End Sub


The parameters used is the item you are dealing with and the sheet name where you would like to put the results and the row as the start row and finally the column as the start column.. In the example the results would be put in row 1 and column 2

YasserKhalil
02-06-2021, 05:53 AM
By the way, download the JSONConverter.bas from this link
https://github.com/VBA-tools/VBA-JSON/blob/master/JsonConverter.bas

YasserKhalil
02-07-2021, 06:39 AM
Did the code help you?

totemos
02-07-2021, 10:30 AM
Code is working perfect. One question though. Im not familiar with XML library and i modify your code to the following and im getting error : run-time error 10001 error parsing json



Sub GetData_Test() GetData "sid", "Sheet1", 1, 2

End Sub




Sub GetData(ByVal sID As String, ByVal wsName As String, ByVal lRow As Long, ByVal lCol As Long)
Const sURL As String = "https://www.elta-courier.gr/track.php"
Dim x As Integer

NumRows = Range("A1", Range("A1").End(xlDown)).Rows.Count
Range("A1").Select
' Establish "For" loop to loop "numrows" number of times.
For x = 1 To NumRows 'If Len(Range("a" & x)) <> 13 Then
'MsgBox "wrong tracking code"
'Exit Sub
'End If
k = 1
Dim Xrange As String
Dim t As Integer
sID = Trim(Range("A" & x).Value)




Dim a, http As MSXML2.XMLHTTP60, col As Collection, json As Object, postData As String, sResp As String, i As Long
Set http = New MSXML2.XMLHTTP60
postData = "number=" & sID
With http
.Open "POST", sURL, False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
.send postData
sResp = .responseText
End With
Set json = JsonConverter.ParseJson(sResp)
Set col = json("result")(sID)("result")
ReDim a(1 To col.Count, 1 To 1)
For i = 1 To col.Count
a(i, 1) = col.Item(i)("status")
Next i



ThisWorkbook.Worksheets(wsName).Cells(k + lRow - 1, lCol).Value = a(2, 1)
lRow = lRow + 1

Next x
End Sub

totemos
02-07-2021, 10:34 AM
To explain further, i made a loop to read every row in column A (row by row) and in the array (a) retrieving only a(2) second record which is what i need.

Again thatnk you for you time and efford, your code is workind perfect.

YasserKhalil
02-07-2021, 10:35 AM
Can you give snapshot of the worksheet ..?
Try to loop in the GetData_Test procedure not in the GetData Public Procedure.

totemos
02-07-2021, 10:42 AM
In Column A there are almost 2000 records.

Will try now

27885

YasserKhalil
02-07-2021, 10:45 AM
Can you give two or three IDs so as to be able to test..? So you need just the second value of the results?!!

totemos
02-07-2021, 10:49 AM
...And at this part im getting the json error



27886

totemos
02-07-2021, 10:51 AM
Can you give two or three IDs so as to be able to test..? So you need just the second value of the results?!!

Yes, just the second value only.



WG550227465GR


WG5502277**GR

YasserKhalil
02-07-2021, 10:58 AM
Try this version


Sub GetData_Demo()
Dim ws As Worksheet, r As Long
Set ws = ThisWorkbook.Worksheets("Sheet1")
For r = 1 To ws.Cells(Rows.Count, 1).End(xlUp).Row
GetData ws.Cells(r, 1).Value, ws.Name, r, 2
Next r
End Sub


Sub GetData(ByVal sID As String, ByVal wsName As String, ByVal lRow As Long, ByVal lCol As Long)
Const sURL As String = "https://www.elta-courier.gr/track.php"
Dim a, http As MSXML2.ServerXMLHTTP60, col As Collection, json As Object, postData As String, sResp As String, i As Long, cnt As Long
Set http = New MSXML2.ServerXMLHTTP60
postData = "number=" & sID

With http
sPoint:
.Open "POST", sURL, False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
.send postData
.waitForResponse
sResp = .responseText
If InStr(sResp, "Fatal error") Then cnt = cnt + 1: Debug.Print "Wait .. " & cnt: GoTo sPoint
End With
Set json = JSONConverter.ParseJson(sResp)
Set col = json("result")(sID)("result")
ReDim a(1 To col.Count, 1 To 1)
For i = 1 To col.Count
a(i, 1) = col.item(i)("status")
Next i
ThisWorkbook.Worksheets(wsName).Cells(lRow, lCol).Value = a(2, 1)
End Sub




* I have updated the code as I noticed the site is down sometimes.

totemos
02-07-2021, 11:03 AM
Unfortunately same error.


27887

totemos
02-07-2021, 11:12 AM
Yes. it worked like a charm, perfect. It worked perfectly !!

YasserKhalil
02-07-2021, 11:17 AM
You're welcome. Glad I can offer some help.
Regards