Consulting

Results 1 to 6 of 6

Thread: Solved: Help with VBA & html spidering/downloading!

  1. #1

    Solved: Help with VBA & html spidering/downloading!

    Hello i would like your help with something that i have been searching for lots of time! I think its quite hard but you know better cause i am a complete newbie on vba!

    Ok I have a list of nearly 5.000 ISBN that i want to match each line with a specific "text" found in a website. But lets take it from the beggining:

    Given i got in A1

    the url of the website that have this syntax:

    http://bookshop.blackwell.co.uk/jsp/id/ISBN
    http://bookshop.blackwell.co.uk/jsp/id/ISBN
    http://bookshop.blackwell.co.uk/jsp/id/ISBN
    http://bookshop.blackwell.co.uk/jsp/id/ISBN
    http://bookshop.blackwell.co.uk/jsp/id/ISBN
    http://bookshop.blackwell.co.uk/jsp/id/ISBN
    ETC.

    etc. and in B1 i want to take this information for each line:

    for example for an ISBN: 9780385504225

    I GOT IN A1 THE URL (NOTE: URL ISNT STORED AS HYPERLINK)

    http://bookshop.blackwell.co.uk/jsp/id/9780385504225

    AND IN CELL B1 I WANT THIS TEXT:
    US title not currently available

    Is there a way to spider this information through VBA and if itsnt possible
    could i maybe use vba to download all these html files and then strip them down with another program?
    thanks in advance

    to show you exactly what i want to do I have attached an img

  2. #2
    You might be helped using a parameter web-query:
    www.jkp-ads.com/articles/webquery.asp

    See attached...
    Regards,

    Jan Karel Pieterse
    Excel MVP jkp-ads.com

  3. #3
    bump! anyone can help a little bit more please?!

  4. #4
    VBAX Contributor
    Joined
    Dec 2009
    Location
    Sevastopol
    Posts
    150
    Location
    Hi,
    Try this code:
    
    Sub GetISBN()
      
      Const PubStatusTag = "pubstatus="
      Dim Rng As Range, x As Range, oHttp As Object, txt$, i&, j&
      Set Rng = Range("A2", Cells(Rows.Count, 1).End(xlUp))
      
      On Error Resume Next
      Set oHttp = CreateObject("MSXML2.XMLHTTP")
      If Err <> 0 Then Set oHttp = CreateObject("MSXML.XMLHTTPRequest")
      If oHttp Is Nothing Then MsgBox "MSXML2.XMLHTTP not found", 16, "Error": Exit Sub
      On Error GoTo 0
      
      With oHttp
        For Each x In Rng
          .Open "GET", x.Value, False
          .Send
          txt = .responseText
          i = InStr(1, txt, PubStatusTag, 1)
          If i = 0 Then
            x.Offset(, 1).Value = "PubStatusTag not found"
          Else
            i = i + Len(PubStatusTag)
            j = InStr(i, txt, "&", 0)
            x.Offset(, 1) = Replace(Mid(txt, i, j - i), "+", " ")
          End If
        Next
      End With
      Set oHttp = Nothing
      
    End Sub
    Regards,
    Vladimir
    Last edited by ZVI; 03-13-2010 at 08:06 AM.

  5. #5
    omg thank you so much! this works great

  6. #6
    Administrator
    VP-Knowledge Base
    VBAX Grand Master mdmackillop's Avatar
    Joined
    May 2004
    Location
    Scotland
    Posts
    14,476
    Location
    Hi Vladimir,
    Can you create a KB Item for this?
    Regards
    MD
    MVP (Excel 2008-2010)

    Post a workbook with sample data and layout if you want a quicker solution.


    To help indent your macros try Smart Indent

    Please remember to mark threads 'Solved'

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •