Consulting

Results 1 to 2 of 2

Thread: Check If URL Returns 404

  1. #1
    VBAX Contributor D_Marcel's Avatar
    Joined
    Feb 2012
    Location
    Tokyo
    Posts
    117
    Location

    Check If URL Returns 404

    Hi Gurus,
    Trying to find some algorythm that tests if a URL really exists, I've found this precious piece of code provided by Leith Ross:

    [VBA]
    'Written: March 15, 2011' Updated: April 29, 2012
    ' Author: Leith Ross
    ' Summary: Returns the status for a URL along with the Page Source HTML text.


    Public PageSource As String
    Public httpRequest As Object


    Function GetURLStatus(ByVal URL As String, Optional AllowRedirects As Boolean)


    Const WinHttpRequestOption_EnableRedirects = 6


    If httpRequest Is Nothing Then
    On Error Resume Next
    Set httpRequest = CreateObject("WinHttp.WinHttpRequest.5.1")
    If httpRequest Is Nothing Then
    Set httpRequest = CreateObject("WinHttp.WinHttpRequest.5")
    End If
    Err.Clear
    On Error GoTo 0
    End If


    ' Control if the URL being queried is allowed to redirect.
    httpRequest.Option(WinHttpRequestOption_EnableRedirects) = AllowRedirects


    ' Clear any pervious web page source information
    PageSource = ""

    ' Add protocol if missing
    If InStr(1, URL, "://") = 0 Then
    URL = "http://" & URL
    End If


    ' Launch the HTTP httpRequest synchronously
    On Error Resume Next
    httpRequest.Open "HEAD", URL, False
    If Err.Number <> 0 Then
    ' Handle connection errors
    GetURLStatus = Err.Description
    Err.Clear
    Exit Function
    End If
    On Error GoTo 0

    ' Send the http httpRequest for server status
    On Error Resume Next
    httpRequest.Send
    httpRequest.WaitForResponse
    If Err.Number <> 0 Then
    ' Handle server errors
    PageSource = "Error"
    GetURLStatus = Err.Description
    Err.Clear
    Else
    ' Show HTTP response info
    GetURLStatus = httpRequest.Status & " - " & httpRequest.StatusText
    ' Save the web page text
    PageSource = httpRequest.ResponseText
    End If
    On Error GoTo 0

    End Function


    Sub ValidateURLs()


    Dim Cell As Range
    Dim Rng As Range
    Dim RngEnd As Range
    Dim Status As String
    Dim Wks As Worksheet

    Set Wks = ActiveSheet
    Set Rng = Wks.Range("A1")

    Set RngEnd = Wks.Cells(Rows.Count, Rng.Column).End(xlUp)
    If RngEnd.Row < Rng.Row Then Exit Sub Else Set Rng = Wks.Range(Rng, RngEnd)

    For Each Cell In Rng
    Status = GetURLStatus(Cell)
    Cell.Offset(0, 1) = Status
    Next Cell

    End Sub[/VBA]

    However, if the URL gets redirected to another site in case of no exist, this function returns 302 - Found instead of 404 or some error code/message.

    Of this two URL's:

    https://www.dotz.com.br/Produto?pid=7953596
    https://www.dotz.com.br/Produto?pid=7953

    The first one exists and the second, which is as test copied from the first, doesn't. Clicking on them from Excel will redirect to a main site. Probably because of this the function is returning 302. Copying and pasting them directly in the browser, will return 404 to the second.

    Is there a way to get the 404 to the second, even with this redirection?

    Regards,

    Douglas Marcel
    "The only good is knowledge and the only evil is ignorance". Socrates

  2. #2
    Moderator VBAX Sage SamT's Avatar
    Joined
    Oct 2006
    Location
    Near Columbia
    Posts
    7,814
    Location
    bump
    I expect the student to do their homework and find all the errrors I leeve in.


    Please take the time to read the Forum FAQ

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •