PDA

View Full Version : Speed up macro by doing calculations in memory



Nick_London
10-04-2011, 02:11 PM
Hi,

I have the code below that loops through columns of data in the sheet "variables", places it in a calculation sheet (sheet "calculate") containing formulas and then paste the results for each column in another sheet called "results"

However instead of doing lots of copying and pasting and calculations in the spreadsheet, I would like to do as much of it in memory as I have been told this may speed things up considerably (rightly or wrongly).

Essentially in the calculation sheet, I am using a formula that returns the R-Squared relating to a regression of each data column from the calculation sheet to the data that sits in column D. The parameters are shown below.

=RSQ($D$15:$D$268,$P$15:$P$268)

where $D$15:$D$268 is static data that sits in the calculation sheet which does not change and $P$15:$P$268 relates to each columns data that is pasted in from the variables sheet. The output from the formula is one cell.


Sub Process()

Dim cellout As Range
Dim lastcell, firstcell, lastcelly, lastcol As Integer
Dim intCols As Integer
Dim rngData As Range
With Application
.ScreenUpdating = False
.Calculation = False
End With
Sheets("Variables").Select
lastcol = Sheets("Variables").Range("c1").End(xlToRight).Column 'determines start and end for chart series
lastrow = Sheets("Variables").Range("A220").Row

For intCols = 3 To lastcol 'start from column 2 to last populated column (as determined in code above)
Set rngData = ThisWorkbook.Worksheets("Variables") _
.Range(Worksheets("Variables").Cells(1, intCols), _
Worksheets("Variables").Cells(395, intCols))

Worksheets("Calculate").Range("P1:P" & lastrow).Value = rngData.Value

Worksheets("Calculate").Select
Range("B4:C4").Copy 'c4 contains formula output, b4 show the variable name based on what's populated in P1.
Sheets("Results").Select
Range("a60000").End(xlUp).Offset(1, 0).PasteSpecial Paste:=xlPasteValues
Next
End Sub


I guess I need to do the following but welcome suggestions.

1) Create an input array of the raw data from the variables sheet and read into memory
2) Create an input array for column D from the calculation sheet and put this data into memory
3) Calculate R sqaured using the RSQ formula
4) Store results in an output array and paste the results into excel after all variables have been looped.

Can someone help?

Many thanks,

Nick

Rob342
10-04-2011, 03:01 PM
Nick
Without going into the deep end

1 remove .select use with & end with
2 use set = worksheets(****) & assign variables
3 Use Application.EnableEvents= False Start & set true at the end
4 Do Manual calc on the sheet instead of automatic

just a few ideas

Nick_London
10-05-2011, 09:51 AM
Thanks for the tips Rob!

With some research I managed to work out how to read the raw data from the variables sheet into memory/Array using the code below:


Sub test()
lastcol = Sheets("Variables").Cells(3, 3).End(xlToRight).Column
lastrow = Sheets("Variables").Cells(220, 1).Row
variables = Range("A1").Resize(lastrow, lastcol)
End Sub


If I place column D from the calculation sheet into the variable sheet instead, in column 2 then the above array would incorporate all data needed in each successive calculation.

So now I need to work out how to apply the RSQ function to run all variables/columns against column 2 in the array and then output the results, i.e in the formula below ref1 would always equal column 2, ref2 would start with column 3, then 4, 5, 6 etc to last column.

RSQ(ref1, ref2)

After calculating the formula result for each column, I want to write the results to an output array and finally place results into a worksheet.

Hope someone can offer some suggestions.

Thanks,

Nick

Blade Hunter
10-05-2011, 09:10 PM
As previously mentioned, get rid of the selects.

Also, why do you have this in there?


lastrow = Sheets("Variables").Range("A220").Row


why not just have:

lastrow = 220

Rob342
10-06-2011, 05:35 AM
Nick
Can you post a copy of the workbook? and i'll take a look for you.

Nick_London
10-06-2011, 03:11 PM
Hi Rob,

That would be great.

Here is an example with limited number of variables.

Thanks,

Nick

Rob342
10-07-2011, 03:14 AM
Nick
A few questions?

1 rngdata .cells(395,intcols) this cell ref refers to nothing is it suppose to?
2 Range(a60000) with sheet results, will the results be overridden or added to from the last calc on that sheet.
3 sheet variables row 220 will this be added to or is this the max row
4 why are there gaps between header & data or have you something there thats not on the example.

Have tried you example on my machine it only took 1 sec to process.

Rob

Nick_London
10-07-2011, 05:15 AM
Hi Rob,

Answers to your questions are below

1 rngdata .cells(395,intcols) this cell ref refers to nothing is it suppose to?

Nick: I believe I need to include this to set the full range - i.e from cell 1 to cell 395 to col. I suppose this can be set to 220

2 Range(a60000) with sheet results, will the results be overridden or added to from the last calc on that sheet.

Nick: Via: Range("a60000").End(xlUp).Offset(1, 0) - this does not overwrite existing results but adds the latest calculation result to the last blank row

3 sheet variables row 220 will this be added to or is this the max row

Nick: This is the max row/last data point for this dataset but will chage for other datasets

4 why are there gaps between header & data or have you something there thats not on the example.

Nick: This is just how the data was arranged but also partaly so that I can add in any other formulas in the future

5) Have tried you example on my machine it only took 1 sec to process.

Nick: Yep works fine for small data set, but I am running this in Excel 2007 for over 500 variables and it takes much longer. Also when I have used more comples formulas, particulary custom UDFs, this lengthens the data processing time substantially

shrivallabha
10-08-2011, 04:25 AM
I do not know if this will give you improved performace or not but it does calculate RSq in memory (as told to you).
Public Sub Process2()
Dim lLCol As Long, lLRow As Long
Dim rRef As Range, rCalc As Range
Set rRef = Sheets("Calculate").Range("D15:D220")
With Sheets("Variables")
lLCol = .Range("C1").End(xlToRight).Column
For i = 3 To lLCol
lLRow = .Cells(Rows.Count, i).End(xlUp).Row
Sheets("Results").Range("A" & Rows.Count).End(xlUp).Offset(1, 0).Value = _
.Cells(1, i).Value
Set rCalc = .Range(.Cells(15, i), .Cells(lLRow, i))
Sheets("Results").Range("B" & Rows.Count).End(xlUp).Offset(1, 0).Value = _
Application.WorksheetFunction.RSq(rRef, rCalc)
Next i
End With
End Sub
If this works better then the 2nd sheet "Calculate" can be removed for good as the reference column can be adjusted in the first sheet.

Rob342
10-09-2011, 02:21 AM
Nick
Sorry for the delay c/heating problem

Shivallabha has come up with a good routine that does away from flicking from sheet to sheet, its a much sleaker routine that you have got already.
Can you try this for speed.

Paul_Hossler
10-09-2011, 06:02 PM
Nick: Yep works fine for small data set, but I am running this in Excel 2007 for over 500 variables and it takes much longer. Also when I have used more comples formulas, particulary custom UDFs, this lengthens the data processing time substantially


If you're doing calculations on the data before you calculate the RSQ, I'd think you'd do better by including them in the VBA, and not have to copy/paste at all

Just copy the Y range into an array, then copy each X range into an array, process the X values, call RSQ internally, and then go get another X range

After all, it seems like you only want the RSQ for each of the 500 X arrays

Just my 2 cents

Paul

Jan Karel Pieterse
10-09-2011, 11:27 PM
I'm puzzled. Why go through the hassle of pulling the data to VBA, then running a worksheet function on the data in VBA and pushing the results to a sheet again? Why not simply (temporarily) place the RSQ function in an adjacent column, calc it and copy, paste special values the result?

Nick_London
10-10-2011, 03:58 AM
Thanks all for comments.

I'm testing out Shrivallabha's code and seeing how it flexible it would be to accomodate other sceanarios.

Bob Phillips
10-10-2011, 04:41 AM
I'm puzzled. Why go through the hassle of pulling the data to VBA, then running a worksheet function on the data in VBA and pushing the results to a sheet again? Why not simply (temporarily) place the RSQ function in an adjacent column, calc it and copy, paste special values the result?

To quote Charles Mingus, <<Making the simple complicated is commonplace; making the complicated simple, awesomely simple, that's creativity>>.

As the man says, its commonplace.

Paul_Hossler
10-10-2011, 05:50 AM
Bob, I have to get your book of quotes :thumb

That's 2 good ones in the last couple of days

Paul

Rob342
10-10-2011, 06:35 AM
Paul / Jan / Bob

Could this data be pulled into a dynamic array and just calc the rsq result.
If it could i would be interested to know how to do it, just a small example perhaps.
Rob

Paul_Hossler
10-10-2011, 08:03 AM
I'm still struggling with the need to pull it into an array at all.

RSQ is an intrinsic Excel function.

The only time I'd think you would need to pull it into an array is if you was planning to process it a lot with VBA and then put it back on to a worksheet

I did insert the Y values as Col C to make it easier for me to see


Option Explicit
Sub Process_PH()

Dim rData As Range, rY As Range, rX As Range
Dim iCol As Long

Set rData = Worksheets("Variables").Range("c15").CurrentRegion

Set rY = rData.Columns(3)

For iCol = 4 To rData.Columns.Count
Set rX = rData.Columns(iCol)

Worksheets("Variables").Cells(2, iCol).Value = Application.WorksheetFunction.RSq(rY, rX)

Next iCol
End Sub


Paul

Nick_London
10-19-2011, 10:19 AM
Gents, I have tested out Shivallabha code and it works fast when just calculating the RSQ. But when applying to a custom function I have, it takes over 10 mins to run. Don't know if using arrays instead would speed it up by much?

But I was thinking, the other advantage of using an array is that you don't need to return the results for all variables/columns. For example only return results to a spreadsheet for variables that have an RSQ value of greater than say 0.10.

Also if I want to perform mutiple calculations on the same data/columns, does loading into array mean I only have to read/process the data once?

I'll look at your code Paul and see if I can make sense of it!

Thanks,

Nick

mdmackillop
10-19-2011, 12:58 PM
Option Base 1
Sub Process2()

Dim arr(), x As Long, y As Long
Dim Ydata As Range, Xdata As Range, Head As Range

With Sheets("Variables")
Set Xdata = Range(.Cells(15, 3), .Cells(15, 3).End(xlToRight).End(xlDown))
x = Xdata.Columns.Count
y = Xdata.Rows.Count
Set Head = .Cells(1, 3).Resize(, x)
End With

Set Ydata = Sheets("Calculate").Cells(15, 4).Resize(y)

ReDim arr(x, 2)
With Sheets("Results")
For i = 1 To x
arr(i, 1) = Head(i)
arr(i, 2) = Application.WorksheetFunction.RSq(Ydata.Value, Xdata.Columns(i).Value)
Next
End With

Sheets("Results").Range("A2").Resize(x, 2) = arr()

End Sub

Nick_London
10-26-2011, 10:13 AM
Mdmackillop,

Thanks very much for the full code which works great. I've looked at it and tryied to understood as much as possible and hopefully have learnt a few things. I have changed it to accomdodate my custom function but it is generating a runtime-error 13: Type Mismatch.

The main difference in the code is changing of the following line:


arr(i, 2) = Application.WorksheetFunction.RSq(Ydata.Value, Xdata.Columns(i).Value)


To this one:


arr(i, 2) = OLSRegAdhoc(Xdata.Columns(i).Value, Ydata.Value, 1, 3)


Where OLSRegAdhoc is a non excel built in function.
Note, that in addition to the xdata and Ydata, this custom function takes on two other input/parameters (1 and the 3).

The function also returns 4 outputs accross successive columns instead of 1 like the RSQ function. Not sure if this has something to do with it. I've attempted to account for this in the code based on my current understanding of it but still get the same error. Also I'm pretty sure I've made a few mistakes in changing the code below around how to define the ouput array to extend to 3 more columns!


ReDim arr(x, 5) 'defines array output - no of x = rows and 2 for number of columns
With Sheets("Build-linear")
For i = 1 To x
arr(i, 1) = Head(i) 'first column show indicator name
arr(i, 5) = FormulaArray.OLSRegAdhoc(Xdata.Columns(i).Value, Ydata.Value, 1, 3)
Next
End With

Sheets("Build-Linear").Range("S2").Resize(x, 5) = arr()

Hope someone can shed some light.

Thanks,

Nick

Nick_London
10-30-2011, 12:17 PM
Hello again,

Further to my previous post, I have now done a bit more testing of the code. Instead of using my custom function, I have looked at using the excel build in function "linest" instead to see if I get similar errors.

I have taken Mdmackillop code and changed the line containing the RSQ function to the linest function as per below based in the four arguments the function takes.

arr(i, 2) = Application.WorksheetFunction.LinEst(Ydata.Value, Xdata.Columns(i).Value, True, True)

When I run the code in the attached workbook called "Process Linest", (full code below) it populates the first column with the labels but returns nothing in the second column, i.e the output from the formula is not being returned.

Incidently, when I changed the corresponding bit in Paul's code, the linest function does seem to return a value (the first cell of the ouput from the array function - the coefficient).

Any ideas how I can get the linest function to work in Mdmackillop's code? Also I only need to extract four outputs from the function.

For example, below is an example of the typical output for one variable only.


Col 1Col 2Row 16.63849.2Row 20.07603356.5Row 30.973334537.9Row 47446.2204.0Row 5215434843259021751

I am only interested in displaying in the excel output results sheets accross columns the following:

1) Col/Array 1, row 1 (coefficient for x variable)
2) Col/Array 1, row 3 (RSQ)
3) Col/Array 2, row 2 (standard error of coefficient)
3) Col/Array 2, row 3 (standard error of regression)

Hope someone can help as I having been going around in circles trying to solve this one!

Thanks,

Nick


Sub Process_linest()
Dim arr(), x As Long, y As Long
Dim Ydata As Range, Xdata As Range, Head As Range

With Sheets("Variables")
Set Xdata = Range(.Cells(15, 3), .Cells(15, 3).End(xlToRight).End(xlDown))
x = Xdata.Columns.Count
y = Xdata.Rows.Count
Set Head = .Cells(1, 3).Resize(, x)
End With

Set Ydata = Sheets("Calculate").Cells(15, 4).Resize(y)

ReDim arr(x, 2)
With Sheets("Results")
For i = 1 To x
arr(i, 1) = Head(i)
arr(i, 2) = Application.WorksheetFunction.LinEst(Ydata.Value, Xdata.Columns(i).Value, True, True)
Next
End With

Sheets("Results").Range("A2").Resize(x, 2) = arr()

End Sub

mdmackillop
11-06-2011, 01:06 PM
I don't know why Linest fails in the Sub, but it returns an error. You could write in the formulae as follows, but this doesn't meet your original request
Sub Process_linest()

Dim arr(), x As Long, y As Long
Dim Ydata As Range, Xdata As Range, Data As Range, Head As Range

With Sheets("Variables")
Set Xdata = Range(.Cells(15, 4), .Cells(15, 4).End(xlToRight).End(xlDown))
x = Xdata.Columns.Count
y = Xdata.Rows.Count
Set Head = .Cells(1, 3).Resize(, x)
End With

Set Ydata = Sheets("Calculate").Cells(15, 4).Resize(y)

ReDim arr(x, 2)
For i = 1 To x
a = Ydata.Cells.Count
arr(i, 1) = Head(i)
arr(i, 2) = "=LinEst(Calculate!" & Ydata.Address & ",Variables!" & Xdata.Columns(i).Address & ")"
Debug.Print arr(i, 2)
Next
Sheets("Results").Range("A2").Resize(x, 2).Formula = arr()
End Sub


In this variant, Z will be entered in the worksheet, but not into the array
z = Application.WorksheetFunction.LinEst(Ydata.Value, Xdata.Columns(i).Value)
Sheets(3).Cells(i + 1, 3) = z
arr(i, 2) = z

Paul_Hossler
11-06-2011, 06:04 PM
When I used it (as I recall) I just passed the Range

Try it without the .Value



arr(i, 2) = Application.WorksheetFunction.LinEst(Ydata, Xdata.Columns(i), True, .....


Paul

mdmackillop
11-06-2011, 06:16 PM
Hi Paul,
I tried both ways, but the array won't take the result.

Paul_Hossler
11-07-2011, 09:34 AM
I changed the X and Y addresses a bit, so make sure they still work for you

No error ar least

LINEST returns an array, and the R^2 is in row 3, col1



Sub Process_linest()
Dim i As Long
Dim arr(), x As Long, y As Long
Dim Ydata As Range, Xdata As Range, Head As Range
Dim v As Variant
Dim sFormula As String

With Sheets("Variables")
Set Xdata = Range(.Cells(15, 4), .Cells(15, 4).End(xlToRight).End(xlDown))
x = Xdata.Columns.Count
y = Xdata.Rows.Count
Set Head = .Cells(1, 3).Resize(, x)
End With

Set Ydata = Sheets("Variables").Cells(15, 3).Resize(y)

ReDim arr(x, 2)
With Sheets("Results")
For i = 1 To x
arr(i, 1) = Head(i)

sFormula = "=LINEST(" & _
Ydata.Parent.Name & "!" & Ydata.Address(True, True) & "," & _
Xdata.Parent.Name & "!" & Xdata.Columns(i).Address(True, True) & "," & _
"TRUE, TRUE)"

v = Application.Evaluate(sFormula)
'RSQ is row 3, col1
arr(i, 2) = v(3, 1)
Next
End With

Sheets("Results").Range("A3").Resize(x, 2).Value = arr()

End Sub


Paul

mdmackillop
11-07-2011, 11:43 AM
I never thought of an array!

This change returns the first value of that array
With Sheets("Results")
For i = 1 To x
arr(i, 1) = Head(i)
arr(i, 2) = Application.WorksheetFunction.LinEst(Ydata.Value, Xdata.Columns(i).Value, True, True)(1, 1)
Next
End With

Sheets("Results").Range("A2").Resize(x, 2) = arr()


Note that there is an error in this line of your code. The Column should be 4 as shown
Set Xdata = Range(.Cells(15, 4), .Cells(15, 4).End(xlToRight).End(xlDown))

Nick_London
11-10-2011, 06:02 AM
Paul/Mdmackillop,

Many thanks for your help and efforts here. I have now tested both versions of the code and they work fine. I think I have pretty much everything I need to move forward with my project but will keep the post open for a few more days in case anything else comes up.

Thanks again.

Nick

mdmackillop
11-10-2011, 06:13 AM
Happy to help. Let us know how speed compares with the worksheet solutions.

Nick_London
11-10-2011, 11:31 AM
As soon as I run this against "real" data I will let you know how long it takes to process. However I have come up against an unexpected further hurdle.

Basiclly I have a function that lets you run the linest function with data for more than two X variables located in non-contagious ranges.

In the example below, the extra variable I have defined as zdata. When using the code below it runs fine.


Sub AAA_test_Linear_works()
Dim Ydata As Range, Xdata As Range, Head As Range

Set Xdata = Sheets("Variables").Range("j15:j220")
Set Ydata = Sheets("Calculate").Range("d15:d220")
Set Zdata = Sheets("Calculate").Range("n15:n220")

Worksheets("Results").Cells(9, 2).Value = Application.WorksheetFunction.LinEst(Ydata, makecontig(Xdata, Zdata), True, True)(1, 1)

End Sub


However when I include the zdata variable along with the makecontig function in the main code, it generates a run time error of 424 - "Object Required" (code below).

Debugging takes me to the line: ReDim avOut(1 To av(0).Count, 0 To UBound(av)) of the function.

I thought I could just intergrate into the code and it would work fine but clearly not and I'm struggling to work out why.

Any ideas?

I've included the new code and function in the attachment.

Thanks,

Nick


Sub Process_new()

Dim arr(), x As Long, y As Long
Dim Ydata As Range, Xdata As Range, Zdata As Range, Head As Range

With Sheets("Variables")
Set Xdata = Range(.Cells(15, 3), .Cells(15, 3).End(xlToRight).End(xlDown))
x = Xdata.Columns.Count
y = Xdata.Rows.Count

Set Head = .Cells(1, 3).Resize(, x)
End With

Set Ydata = Sheets("Calculate").Cells(15, 4).Resize(y)
Set Zdata = Sheets("Calculate").Cells(15, 14).Resize(y)

ReDim arr(x, 2)

With Sheets("Results")
For i = 1 To x
arr(i, 1) = Head(i)
arr(i, 2) = Application.WorksheetFunction.LinEst(Ydata.Value, makecontig(Xdata.Columns(i).Value, Zdata.Value), True, True)(1, 1)
'arr(i, 2) = Application.WorksheetFunction.LinEst(Ydata.Value, Xdata.Columns(i).Value, True, True)(1, 1)
Next
End With
Sheets("Results").Range("A2").Resize(x, 2) = arr()

End Sub

mdmackillop
11-10-2011, 02:32 PM
I don't know what the output should be, but this creates a 2 column array to feed back into Linest

Sub Process_new()
Dim arr(), x As Long, y As Long
Dim Ydata As Range, Xdata As Range, Zdata As Range, Head As Range

With Sheets("Variables")
Set Xdata = Range(.Cells(15, 4), .Cells(15, 4).End(xlToRight).End(xlDown))
x = Xdata.Columns.Count
y = Xdata.Rows.Count

Set Head = .Cells(1, 3).Resize(, x)
End With

Set Ydata = Sheets("Calculate").Cells(15, 4).Resize(y)
Set Zdata = Sheets("Calculate").Cells(15, 14).Resize(y)

Sheets("Calculate").Select
Zdata.Select
ReDim arr(x, 2)

With Sheets("Results")
For i = 1 To x
arr(i, 1) = Head(i)
arr(i, 2) = Application.WorksheetFunction.LinEst(Ydata.Value, MakeContig(Xdata.Columns(1), Zdata), True, True)(1, 1)
Next
End With
Sheets("Results").Range("A2").Resize(x, 2) = arr()

End Sub

Public Function MakeContig(ParamArray av()) As Variant

Dim avOut() As Variant
Dim i As Long
Dim j As Long
Dim rw As Long

rw = av(0).Cells.Count
ReDim avOut(1 To rw, 0 To UBound(av))

For j = 0 To UBound(av)
For i = 1 To rw
avOut(i, j) = av(j).Cells(i).Value
Next i
Next j
MakeContig = avOut
End Function

Nick_London
11-16-2011, 07:56 AM
Thanks Mdmackillop - you've saved the day again!

The code seems to work now and I've verified that the calculated values correspond with manually calculated ones.

But need to change the 1 in the line Xdata.Columns(1) to "i" to make it work for all X columns.

Unfortunately the setup does not seem to work against the customised OLSRegAdhoc function which works in a simar way to Linest but can't share the code for this function as it is proprietary.

Cheers.

Nick