PDA

View Full Version : Macro to save in txt format web pages



Caroten
07-02-2014, 09:25 AM
Hi all,
despite follow the forum for so I just signed up for a problem that I can't find the solution.

I have a file in Excel (2010) this range A1: A150 with web addresses.

I have to fix my macro that allows me to:

- Open every different url
- Saved These pages in. Txt format
- The subject in ascending order (being all of the same site and having all of the same name)

I tried macro with the Following:


Dim IE As Object
Option Base 0

Sub Test1()
Dim rCell As Range, myRan As Range
'
If IE Is Nothing Then Set IE = CreateObject("InternetExplorer.APPLICATION")

Set myRan = ThisWorkbook.Sheets("Sheet1").Range("A1:A150") '<<< My sheet with urls
'
For Each rCell In myRan
If rCell <> "" Then
GetTabbb (rCell.Value)
'My Path ed file name (1) >>>
mykeys = Array("^s", "C:\Users\Luca\Desktop\prova\pippo", "{TAB}", "{TAB}", "{TAB}", "{TAB}", "{ENTER}")
DoEvents
For J = LBound(mykeys) To UBound(mykeys)
SendKeys mykeys(J), True
If J < 2 Then myWait (1) Else myWait (0.2)
Debug.Print mykeys(J)
Next J
myWait (0.5)
Debug.Print " "
Debug.Print "Next link>>>>>>>"
On Error Resume Next

Name "C:\Users\Luca\Desktop\prova\pippo.htm" As "C:\Users\Luca\Desktop\prova\pippo" & Format(Now(), "hh-mm-ss") & ".htm"
Name "C:\Users\Luca\Desktop\prova\pippo.txt" As "C:\Users\Luca\Desktop\prova\pippo" & Format(Now(), "yyyy-mm-dd") & ".txt"
On Error GoTo 0
End If
Next rCell
myWait (5)
Debug.Print "Kill IE"
IE.Quit
Set IE = Nothing
'
End Sub

Sub GetTabbb(myUrl)
'
With IE
ReNav:
On Error GoTo NoNav
Debug.Print ">"
myWait (0.2)
.navigate myUrl
.Visible = True
myWait (0.2)
Do While .Busy: DoEvents: Loop 'Attesa not busy
Do While .readyState <> 4: DoEvents: Loop 'Attesa documento
End With
'
'attesa addizionale
myWait (1)
Exit Sub
NoNav:
Debug.Print ("NoNav")
myWait (0.5)
On Error GoTo 0
Resume ReNav
'
End Sub
'
Sub myWait(myStab As Single)
Dim myStTiM As Single
'
myStTiM = Timer
Do 'wait myStab
DoEvents
If Timer > myStTiM + myStab Or Timer < myStTiM Then Exit Do
Loop
End Sub

I do not know what to do for errors:

1) I save the whole web page as complete (taking a long time and saving folder + page. Htm)
2) it asks me to overwrite the file foo.

This is because I can not say with increasing number of naming it: as you can see I decided to save it with "seconds, minutes, etc.." but with no results.

could someone help me fix it?

jonh
07-03-2014, 01:50 AM
not tested



For Each rCell In myRan
if not Fetch(rCell.Value, "C:\Users\Luca\Desktop\prova\pippo" & _
Format(Now, "yymmdd_hhmmss_") & rCell.Row & ".htm") then
msgbox "error getting page @ " & rCell.Address
end if
next



Function Fetch(URL, dest) as boolean
On Error Resume Next
Dim b
With CreateObject("Microsoft.XMLHTTP")
.Open "GET", URL, False
.send
b = .responseBody
If Err.Number <> 0 Or .Status <> 200 Then Exit Function
End With
With CreateObject("ADODB.Stream")
.Type = 1
.Open
.Write b
.SaveToFile dest, 2
End With
Fetch = Err.Number = 0
End Function