PDA

View Full Version : Solved: Get data from web page and paste to excel



swaggerbox
03-15-2010, 03:26 AM
I need to extract specific data from a web page and paste those in an excel sheet. The data is the value of title="" in a web page. For example: title="915-526-6305", I need to extract 915-526-6305. Is there a way to do this or help me get started?

hardlife
03-16-2010, 11:18 AM
It is probably not easy to help You, without page with this information.

it is up to You.

SamT
03-16-2010, 12:02 PM
It is probably not easy to help You, without page with this information.

it is up to You.

Web Page Code :


<html>
<head>
<title>915-526-6305</title>
</head.
etc.

hardlife
03-16-2010, 12:39 PM
me is not skilled in html source code, please if You want me - to help, post web page where is this information, me will try my best to help You.

with best regards, Pavel

mdmackillop
03-16-2010, 05:52 PM
Based on Vladimir's code here (http://www.vbaexpress.com/forum/showpost.php?p=207788&postcount=4)

Option Explicit
Option Compare Text

Sub test()
MsgBox GetTitle("http://www.vbaexpress.com/forum/forumdisplay.php?f=17", "Title")
End Sub


Function GetTitle(WebPage As String, Tag As String)

Dim t, Tag2 As String, EndTag As String
Dim oHttp As Object, txt$, i&, j&

'Adjust this to suit
Tag2 = Tag & "=" & Chr(34)
EndTag = Chr(34)
'*************************

On Error Resume Next
Set oHttp = CreateObject("MSXML2.XMLHTTP")
If Err <> 0 Then Set oHttp = CreateObject("MSXML.XMLHTTPRequest")
If oHttp Is Nothing Then MsgBox "MSXML2.XMLHTTP not found", 16, "Error": Exit Function
On Error GoTo 0

With oHttp
.Open "GET", WebPage, False
.Send
txt = .responseText
i = InStr(1, txt, Tag2, 1)
If i = 0 Then
GetTitle = Tag & " not found"
Else
t = Split(txt, Tag2)(1)
GetTitle = Split(t, EndTag)(0)
End If
End With
Set oHttp = Nothing

End Function

swaggerbox
03-17-2010, 02:06 AM
Great! Thank u guys for your help!

mlomenzo
10-19-2011, 10:15 AM
I used the code above to get the data that I need. It works beautifully. If I wanted to take this step further I will need help. I have a spreadsheet with a few thousand URL's. I would like the final script to read the URL in column B then automatically parse the webpage "title", "description", and "Keywords" in the adjacent columns on the same line as each respective url. Does this make sense? Any help would be greatly appreciated.
Thanks!
Mike

mdmackillop
10-19-2011, 11:15 AM
Welcome to VBAX
Can you post a workbook with two or three web pages from your selection.
Use Manage attachments in the Go Advanced reply section.

mlomenzo
10-19-2011, 12:46 PM
Is it ok if I give you another sheet that has the same layout but different data? Here is an example of the layout, the other contains data from a different site but in the same layout and is obviously a lot smaller than the workbook that I am working with. If not, let me know and perhaps I can e mail you the workbook. I would rather not post it on the interwebz. Thanks in advance for your help!!
:bow:

mlomenzo
10-19-2011, 12:48 PM
There is another module in that file which allows me to select the group of URL's and when i run the script, it places the link address in the adjacent column.

mdmackillop
10-19-2011, 01:33 PM
Give this a try
Select the URLs first then run the code


Option Explicit
Option Compare Text

Sub test()
Dim t, Tag As String, Tag2 As String, EndTag As String, GetTitle As String
Dim oHttp As Object, txt$, i&, j&, x&
Dim arr
Dim cel As Range
arr = Array("title", "description", "keywords")

On Error Resume Next
Set oHttp = CreateObject("MSXML2.XMLHTTP")
If Err <> 0 Then Set oHttp = CreateObject("MSXML.XMLHTTPRequest")
If oHttp Is Nothing Then MsgBox "MSXML2.XMLHTTP not found", 16, "Error": Exit Sub
On Error GoTo 0

For Each cel In Selection
With oHttp
.Open "GET", Cells(cel.Row, 2), False
.Send
txt = .responseText
For x = 0 To 2
Tag = arr(x)
'Adjust this to suit
Tag2 = Tag & Chr(34) & " content="
EndTag = Chr(34)
'*************************
i = InStr(1, txt, Tag2, 1)
If i = 0 Then
GetTitle = "not found"
Else
t = Split(txt, Tag2)(1)
GetTitle = Split(t, EndTag)(1)
End If
Cells(cel.Row, 3).Offset(, x) = GetTitle
Next
End With
Next
Set oHttp = Nothing

End Sub

mlomenzo
10-20-2011, 05:53 AM
You are darn brilliant!!! Works for description and keywords however, it is returning "not found" for title. Other than that, worked brilliantly. Fast, and lean!

mlomenzo
10-20-2011, 06:40 AM
OK, I just ran that code on the spreadsheet that I gave you and it worked fine. Like I said, I would prefer to not post the actual workbook that I am working on. Is there another way for me to get that workbook to you?

mdmackillop
10-21-2011, 03:22 PM
Option Explicit
Option Compare Text

Sub test()
Dim t, Tag As String, tag2 As String, EndTag As String
Dim oHttp As Object, txt$, i&, j&, x&
Dim arr
Dim cel As Range
arr = Array("head", "description", "keywords")

On Error Resume Next
Set oHttp = CreateObject("MSXML2.XMLHTTP")
If Err <> 0 Then Set oHttp = CreateObject("MSXML.XMLHTTPRequest")
If oHttp Is Nothing Then MsgBox "MSXML2.XMLHTTP not found", 16, "Error": Exit Sub
On Error GoTo 0


For Each cel In Selection
If cel <> "" Then
With oHttp
.Open "GET", Cells(cel.Row, 2), False
.Send
txt = .responseText
For x = 0 To 2
Tag = arr(x)
'Adjust this to suit
If Tag = "head" Then
tag2 = "<Title>"
Cells(cel.Row, 3).Offset(, x) = GetTitle(txt, tag2, "<", 0)
Else
tag2 = Tag & Chr(34) & " content="
Cells(cel.Row, 3).Offset(, x) = GetTitle(txt, tag2, Chr(34), 1)
End If
Next
End With
End If
Next
Set oHttp = Nothing
End Sub



Function GetTitle(txt, tag2, EndTag, Bit)
Dim i As Long, t
i = InStr(1, txt, tag2, 1)
If i = 0 Then
GetTitle = "not found"
Else
t = Split(txt, tag2)(1)
GetTitle = Trim(Split(t, EndTag)(Bit))
End If
End Function

mlomenzo
10-24-2011, 05:31 AM
Thank you very much kind sir. Works perfectly. You just saved me having to manually retrieve and enter that data on not only this, but 6-7 more workbooks just like this one.

anasttin
03-20-2012, 01:04 PM
Hi all,

This code looks like exactly what I need as well but when I run it, I get the Invalid procedure call or argument error.

I've researched the error and it suggests that maybe I haven't selected the input range but I do so I'm confused.

What I'm doing exactly is selecting a range of URL's that I've entered and then going into the vba code and hitting the 'run' button.

That's when I get the error.

What am I doing wrong?

Any help would be much appreciated.

Thanks!

anasttin
03-20-2012, 09:13 PM
Invalid procedure call or agument error when I run this?
Hi all (not sure if I replied to this post properly the first time so am trying again),

This code looks like exactly what I need as well but when I run it, I get the Invalid procedure call or argument error.

I've researched the error and it suggests that maybe I haven't selected the input range but I do so I'm confused.

What I'm doing exactly is selecting a range of URL's that I've entered and then going into the vba code and hitting the 'run' button.

That's when I get the error.

What am I doing wrong?

Any help would be much appreciated.

Thanks!

defcon_3
03-20-2012, 11:30 PM
Im curious about this. So this code can actually fetch specific data on a webpage? not only in the web source but also in webpage body?

mancubus
03-21-2012, 12:08 AM
worked for me...

- downloaded the file form post#9,
- copied the codes from post#14 to a standard module,
- select the url's in column B,
- run the code.

defcon_3
03-21-2012, 12:42 AM
Yes I also confirmed this works. But my question is, is this can also grab table content in the webpage. I know web query but its a hassle re entering the URL each time.

mancubus
03-21-2012, 02:42 AM
Yes I also confirmed this works. But my question is, is this can also grab table content in the webpage. I know web query but its a hassle re entering the URL each time.


sorry defcon_3.

it was for anasttin.
i should have quoted anasttin's msg before posting my reply. :dunno

can you post a sample file.
perhaps one of our masters may help

mlomenzo
01-22-2013, 11:59 AM
I am also having the problem with "Run Time error '5'. invalid procedure...
I had this problem briefly last time, but I thought I remember it being something simple such as the placement of the URL's on the sheet. I can't seem to wiggle it out. Here is the sheet that I am working on. Any help would be greatly appreciated.
Thanks!
Mike