Consulting

Results 1 to 3 of 3

Thread: Copy data from website source code to spreadsheet

  1. #1

    Smile Copy data from website source code to spreadsheet

    Hi everybody

    I'm trying to find a way to copy data from a websites source code and paste it into excel.
    The data looks like this:
    <p>
    <span class="highlight">Mitarbeiter:</span> 120 in Deutschland<br/>
    <span style="display: "><span class="highlight">Umsatzklasse:</span> 10 - 50 Mio. Euro<br/></span>
    <span style="display: none"><span class="highlight">Filialen:</span> <br/></span>
    <span style="display: inline"><span class="highlight">Gegründet:</span> 1925</span>
    </p
    The value I'm looking for here would be "10 - 50 Mio. Euro".

    This is the code I've written so far:


    Sub Sample()    Dim IE As Object
        Set IE = CreateObject("InternetExplorer.Application")
        With IE
            .Visible = False
            .Navigate "https://www.wer-zu-wem.de/firma/steinel-normalien.html"
                     Do While .Busy And .readyState <> 4: DoEvents: Loop
             Application.Wait Now + TimeValue("0:00:06")
           Sheets("Dummy").Range("A1").Value = .document.body.outerHTML
            .Quit
        End With
    End Sub
    My approach here was to first copy the entire HTML source code to a spreadsheet and then continue with the InStr function.
    However the solution above just gives me an [COLOR=rgba(0, 0, 0, 0.87)]excerpt from the full source code, which does not contain the data I'm looking for.

    Does anyone know what exactly I'm doing wrong here?
    Is there maybe a better approach, maybe to get the data I'm looking for directly from the website without the step inbetween?

    Thank you guys in advance for any interest in my problem
    [/COLOR][COLOR=rgba(0, 0, 0, 0.87)]

    [/COLOR]

  2. #2
    VBAX Expert Leith Ross's Avatar
    Joined
    Oct 2012
    Location
    San Francisco, California
    Posts
    552
    Location
    Hello vrCommander,

    Welcome!

    Are you wanting to extract only certain text from the web page and copy that text to a worksheet? I ask because your title suggests you want to copy some HTML code to the clipboard and the paste the rendered HTML to the worksheet.
    Sincerely,
    Leith Ross

    "1N73LL1G3NC3 15 7H3 4B1L17Y 70 4D4P7 70 CH4NG3 - 573PH3N H4WK1NG"

  3. #3
    VBAX Newbie
    Joined
    Jul 2020
    Posts
    1
    Location
    Hi everyone, I didn't want to create a post with the same and common questions. I think "get data from web source" is quite a big area for questions on this forum

    My question is how to get a text from an HTML element? I found several solutions on other sites, but none of them worked for me VBA shows different errors: 13, 4xx, etc. - depending on my attempts to change some elements.

    I found the solution that might work vrCommander,
    as it's for scraping table data from a website.

    
    
    Sub WikiTable()
        Dim ieObj As InternetExplorer
        Dim htmlEle As IHTMLElement
        Dim i As Integer
    
    
        i = 1
    
    
        Set ieObj = New InternetExplorer
        ieObj.Visible = True
    
    'please change "wikipedia domain URL org" in next line to the wiki domain url, because this forum thinks I pasted too many URLs in my post
        ieObj.navigate "wikipedia domain URL org/wiki/List_of_countries_and_dependencies_by_population"
    
    
        Application.Wait Now + TimeValue("00:00:10")
    
    
        For Each htmlEle In ieObj.document.getElementsByClassName("wikitable")(0).getElementsByTagName("tr")
            With ActiveSheet
                .Range("A" & i).Value = htmlEle.Children(0).textContent
                .Range("B" & i).Value = htmlEle.Children(1).textContent
                .Range("C" & i).Value = htmlEle.Children(2).textContent
                .Range("D" & i).Value = htmlEle.Children(3).textContent
                .Range("E" & i).Value = htmlEle.Children(4).textContent
            End With
    
    
            i = i + 1
    
    
        Next htmlEle
    
    
    End Sub


    The next solution I found for a non-table data, but it doesn't work too

    
    
    Public Sub LinkedinName()
        Dim IE       As Object: Set IE = CreateObject("InternetExplorer.Application")
        Dim Elements As Object
        Dim Element  As Object
    
    
        With IE
            'Show and navigate to page
            .Visible = True
    'please change "LINKEDIN domain URL com" in next line to the wiki domain url, because this forum thinks I pasted too many URLs in my post
            .navigate ("LINKEDIN domain URL com/in/jeffwilsonbizzdesign/")
    
    
            'Wait until page loads
            Do While .Busy And .readyState <> 10
                DoEvents
            Loop
    
    
            'Create a collection of Option Tags, which are part of pa_packages
            Set Elements = .document.getElementById("top-card")(0).getElementsByTagName("h1")
    
    
            'Show the element's properties
            For Each Element In Elements
                Debug.Print "The InnerText is: " & Element.innerText
                Debug.Print "The Value is: " & Element.Value
            Next
        End With
    End Sub


    I played and changed elements of the code, was trying to solve issues, but stuck
    I want to add a name "Jeff Wilson" from the
    <h1 class="top-card-layout__title">Jeff Wilson</h1>
    line. And then add "Web Developer and Social Media Implementation Engineer at BizzDesign" from
    <h2 class="top-card-layout__headline">Web Developer and Social Media Implementation Engineer at BizzDesign</h2>
    and then add "Brisbane, Australia" from the
    <span class="top-card__subline-item">Brisbane, Australia</span>
    line. All these texts should be pasted in one cell.

    Can you please help me figure it out?

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •