PDA

View Full Version : Solved: Compare date before event execution?



jmarkc
07-26-2008, 05:02 AM
Hi Guys. I hope someone can help me with this problem. i have been trying to figure it out for the last 2 days.

I have a macro that will pull a table from a website and paste it on a worksheet. This website updates the table weekly, dropping the earliest week's data with each update. I need to maintain the dropped data, so the macro finds the last column,moves back and places the data where I need it to go, then just deletes the first row of the table. This all works fine.

The problem I have is that if I run the macro several times before the new update (which will be done because other people use the file and the macro actually pulls several tables), I will continue to pile on data that I already have. I've been trying to do an If Then statement that will compare the date to a date cell on the spreadsheet and won't pull the data if the data is aleready there. I can't get that to work and wonder if there's a better way to do the comparison?

Here's the code I have that works. Can someone help solve the date comparison?
Sub PullWeb()
'Pull & post energy data from website
Dim sTS6 As String

sTS6 = Range("a1").Value
With ActiveSheet.QueryTables.Add(Connection:= _
"URL:HAD TO REMOVE DUE TO FORUM POSTING RULES" _
& sTS6, Destination:=Range("A8").End(xlToRight).Offset(-3, -2))
.Name = "Gasoline"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlOverwriteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = "6"
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With

'Delete 1st column of new data
Dim MyCell As Long
Dim i As Long
For i = Cells(1, Columns.Count).End(xlToLeft).Column To 1 Step -1
If Cells(1, i).Value = "Data" Then Cells(1, i).EntireColumn.Delete
Next i
For MyCell = ActiveSheet.UsedRange.Rows.Count To 2 Step -1
If Application.WorksheetFunction.CountIf(Columns(MyCell), "U.S.") >= 1 _
Then Columns(MyCell).EntireColumn.Delete
Next MyCell
End Sub


The URL above has been removed because I had too few posts. The sample workbook I have attached has the code so you can see what I'm talking about.

Thanks!

mdmackillop
07-26-2008, 06:40 AM
I don't know about checking before download, but why not set the paste destination to say Q5. You can then compare dates and insert the new data as appropriate.

jmarkc
07-26-2008, 08:08 AM
I thought about doing that too, but was hoping there would be a "cleaner" solution. The script I have now already takes a bit of time to run.

If your suggestion is the best way, then I'll play with that. Thanks!

TomSchreiner
07-27-2008, 06:17 AM
An alternative to using an Excel query. Is fast as well...
I changed the return order to: Change from week ago, Change from year ago, 7/21/2008, 7/14/2008, 7/7/2008, ect...
Intead of refreshing the entire table, it only inserts a single column of new data.
See example: gasoline.zip (http://home.fuse.net/tstom/gasoline.zip)

To test, delete the column containing the latest data for 7/21 and click update...


Option Explicit

Private Sub CommandButton1_Click()
Dim Data(), LastUpdate As Date

Application.ScreenUpdating = False

LastUpdate = GetData(Data)

If LastUpdate <= Range("D5").Value Then
MsgBox "No updates available..."
Exit Sub
End If

Columns(4).Insert

With Columns(4)
.Cells(5).Value = LastUpdate
.Cells(6).Offset(, -2).Resize(29, 3) = Data
End With

Application.ScreenUpdating = True
End Sub

Function GetData(Data()) As Date
Dim Doc As HTMLDocument, DocLoad As New HTMLDocument
Dim Table As HTMLTable, r As HTMLTableRow, c As HTMLTableCell
Dim RowCntr As Integer, CellText As String, x As Integer

ReDim Data(1 To 29, 1 To 3)

Set Doc = DocLoad.createDocumentFromUrl("http://www.eia.doe.gov/oil_gas/petroleum/data_publications/wrgp/mogas_home_page.html", vbNullString)
Do Until Doc.readyState = "complete": DoEvents: Loop
Set Table = Doc.getElementsByTagName("TABLE")(6)
GetData = CDate(Table.Rows(4).childNodes(2).innerText)

On Error Resume Next
For RowCntr = 6 To Table.Rows.Length
CellText = ""
CellText = Table.Rows(RowCntr).Cells(1).innerText
If CellText <> "" Then
x = x + 1
Data(x, 1) = CDbl(Table.Rows(RowCntr).Cells(4).innerText)
Data(x, 2) = CDbl(Table.Rows(RowCntr).Cells(5).innerText)
Data(x, 3) = CDbl(Table.Rows(RowCntr).Cells(3).innerText)
End If
Next

End Function

jmarkc
07-28-2008, 05:06 AM
Tom-
Thanks for the reply. This is great! I am new to vba coding, so I didn't know of another way to pull the data. I really like this. As my actual workbook has 5 sheets that contain data to be pulled from this website, using your code, would I just alter the URL and table number of the code you presented here? Obviously, I would also need to change the sheet formatting as well.

Once again, thanks for this alternative!

TomSchreiner
07-28-2008, 06:55 AM
The code, as in a web query, must be customized for each website. Are these pages public? If so, please post the URLs.

Thanks - Tom

jmarkc
07-28-2008, 07:13 AM
Yes, these pages are all public. I have them as seperate sheets of a workbook, with one sheet that basically summarizes certain parts of each sheet and another that uses this summary data to chart the changes in that data.

The URLs are as follows:

Monthly Energy Data (I have as table 8):
http://tonto.eia.doe.gov/cfapps/STEO_Query/steotables.cfm?tableNumber=8&periodType=Monthly&startYear=2007&startMonth=1&startMonthChanged=false&startQuarterChanged=false&endYear=2008&endMonth=12&endMonthChanged=false&endQuarterChanged=false&loadAction=Apply+Changes

Gasoline (table 6):
http://www.eia.doe.gov/oil_gas/petroleum/data_publications/wrgp/mogas_home_page.html

Natural Gas Futures (table 6):
http://tonto.eia.doe.gov/dnav/ng/hist/rngc1d.htm

Crude Oil (table 9):
http://tonto.eia.doe.gov/dnav/pet/hist/wtotusaw.htm

Diesel Fuel (table 6):
http://tonto.eia.doe.gov/oog/info/wohdp/diesel.asp

The only ones that are more dynamic in nature are the gasoline and diesel fuel tables, as these are updated with new weekly data that deletes the oldest week's data. The other pages are historical and keep the weekly data intact.

Thanks,
Jason

TomSchreiner
07-28-2008, 08:16 AM
Well... Maybe I have bit off more than I want to chew. :)

Short of me coding for each of your tables... Also, now knowing that you are using the various tables dependant upon one another, I would stick close with what you have now. The diesel table is similiar to the gasoline, so I could duplicate the same results as the example I posted earlier. As for checking for available updates on the other pages?

Natural gas
Petroleum
Monthly Energy Data (STEO Table Browser)

The above three provide in page information about the published date. If you think it will suffice us to simply check the last update (of your queries) against the publish date of these three pages to determine if your query should be refreshed, then that would be a breeze.

What do you think?

BTW. Which diesel do you import? "Average All Types" and/or "Low and Ultra-Low Sulfur"?

mdmackillop
07-28-2008, 02:42 PM
Hi Tom,
I look forward to the eventual solution. Maybe a good KB item could be created from this public information source.
Regards
MD

jmarkc
07-28-2008, 06:13 PM
Hi Tom-
Not sure if you got my e-mail or not, but I certainly appreciate your help with this.

I thought this might entail a bit more than you expected. :) I can supply a workbook if it makes it easier, but I will also gladly try my own hand at altering the code. Just let me know.

I don't necedssarily need to check for updates on the other 3 pages, just the diesel & gasoline pages. Because the other 3 pages are historical, I've been clearing the cells and just re-pulling/posting the data. Not necessarily the most efficient way to do things, but it works (though I must say that I like your idea of checking for updates as I imagine the macro will run much quicker).

To answer your question about the diesel fuel, I only need to import the "Average All Types".

MD - This could be a good tool for anyone whose business is affected by the changes in energy prices (nowadays who isn't affected??!! :) ).

Jason

TomSchreiner
07-29-2008, 07:40 PM
Hi Jason and MD.

I did not recieve an email.

"but I will also gladly try my own hand at altering the code"

I added some comments that will hopefully make it clear and help you to edit the procedures if neccesary.

The code for your diesel page was almost identical. The example workbook download link contains both gasoline and diesel worksheets.

gasoline-diesel.zip (http://home.fuse.net/tstom/gasoline-diesel.zip)

As with any web query, if the page is redesigned, your code will probably break. The code is completely dependant upon the current structure of the web page. You may be able to add a bit of logic to make it more flexible, but usually not.

If you have any question or future problems let me know. :)

Option Explicit

Private Sub CommandButton1_Click()
Dim Data(), LastUpdate As Date

Application.ScreenUpdating = False

LastUpdate = GetData(Data)

If LastUpdate <= Range("D5").Value Then
MsgBox "No updates available..."
Exit Sub
End If

Columns(4).Insert

With Columns(4)
.Cells(5).Value = LastUpdate
.Cells(6).Offset(, -2).Resize(29, 3) = Data
End With

Application.ScreenUpdating = True
End Sub

Function GetData(Data()) As Date
Dim Doc As HTMLDocument, DocLoad As New HTMLDocument
Dim Table As HTMLTable, r As HTMLTableRow, c As HTMLTableCell
Dim RowCntr As Integer, CellText As String, x As Integer

ReDim Data(1 To 29, 1 To 3)

Set Doc = DocLoad.createDocumentFromUrl("http://tonto.eia.doe.gov/oog/info/wohdp/diesel.asp", vbNullString)
Do Until Doc.readyState = "complete": DoEvents: Loop

'this is the 7th table. all arrays and collections are zero based in the MSHTML library
Set Table = Doc.getElementsByTagName("TABLE")(6)

'7th table, 2nd row, 4th child, innertext
GetData = CDate(Table.Rows(1).childNodes(3).innerText)

On Error Resume Next

'loop throught the 2nd to the last row in the table
For RowCntr = 2 To Table.Rows.Length
CellText = ""

'check for text - one or more rows/cells may not contain text
'the ones that do, contain our data
CellText = Table.Rows(RowCntr).Cells(1).innerText

If CellText <> "" Then
x = x + 1

'Change from week ago
Data(x, 1) = CDbl(Table.Rows(RowCntr).Cells(4).innerText)

'Change from year ago
Data(x, 2) = CDbl(Table.Rows(RowCntr).Cells(5).innerText)

'most current update
Data(x, 3) = CDbl(Table.Rows(RowCntr).Cells(3).innerText)
End If

Next

End Function

jmarkc
07-30-2008, 07:01 AM
Thanks Tom. :bow:

The comments you provided in the code will help me to understand the procedure. I'll play with it a bit and see what I can do, but I'm sure I'll be contacting you for help!

"As with any web query, if the page is redesigned, your code will probably break. The code is completely dependant upon the current structure of the web page. You may be able to add a bit of logic to make it more flexible, but usually not."

I agree completely and will jump that hurdle when (or if) I come to it!

I'm labeling this thread as solved.

I appreciate the help!

Jason