vlina
02-04-2011, 12:10 PM
Hello,
My code goes to a website, does a search, and returns a title, abstract, and reference of each of the 5 results on the page.
Is there a way to search tags (with particular classNames) ONLY within each of the 5 "result" div tags?
The html goes something like:
<div class="all_results">
    <div class="result">
        <p class="reference">reference goes here</p>
        <h1 class="title">title goes here</h1>
        <p class="auth_list">authors go here</p>
        <div class="abstract">
            <p>abstract goes here</p>
        </div>
    </div>
    <div class="result"> (this would be the 2nd result)
    ....
In other words: for each "div" tag with className "result", I want to search only inside that tag for the "p" tag with className "reference", the "h1" tag with className "title", and the nested "div" tag with className "abstract".
I tried grabbing the innerHTML of each result, assigning it to a variable (called resultCode), and declaring that variable as an HTMLDocument, a String, etc. The idea was to use that variable in place of "ie.document" in "ie.document.all.tags("p")" so the search would only cover that result. But it didn't work (wrong variable type).
The problem with my current code (below) is that it searches for "p", "h1" and "div" (with className abstract) in the entire document, not the current "result" div tag. So, it returns the first title, abstract, and reference 5 times instead of each of the 5 once.
Dim varTagP As Variant, varTagsP As Variant
Dim varTagH As Variant, varTagsH As Variant
Dim varTagDIV As Variant, varTagsDIV As Variant
Dim numDIVtags As Integer, i As Integer, m As Integer
Dim theReference As String, theTitle As String, theAbstract As String
Dim resultCode As String 'also tried HTMLDocument, etc.

numDIVtags = ie.document.all.tags("DIV").Length
'the collection is zero-based, so the last valid index is Length - 1
For i = 0 To numDIVtags - 1
    If ie.document.all.tags("DIV")(i).className = "result" Then
        Debug.Print "There is a result on the " & i & "th div tag"
        'get reference from result (searches the whole document, which is the problem)
        Set varTagsP = ie.document.all.tags("P")
        For Each varTagP In varTagsP
            If varTagP.className = "reference" Then
                theReference = varTagP.innerText
                Debug.Print theReference
                Exit For
            End If
        Next
        'get title from result (same problem: whole document)
        Set varTagsH = ie.document.all.tags("H1")
        For Each varTagH In varTagsH
            If varTagH.className = "title" Then
                theTitle = varTagH.innerText
                Debug.Print theTitle
                Exit For
            End If
        Next
        'get abstract from result (same problem: whole document)
        Set varTagsDIV = ie.document.all.tags("DIV")
        For Each varTagDIV In varTagsDIV
            'class name as in the html above
            If varTagDIV.className = "abstract" Then
                theAbstract = varTagDIV.innerText
                Debug.Print theAbstract
                Exit For
            End If
        Next
    End If
Next
I've looked for posts similar to this for many hours without luck. Any advice would be very much appreciated.
Thank you for your time.