Consulting

Results 1 to 17 of 17

Thread: VBA to run a Linear Regression Automatically

  1. #1

    VBA to run a Linear Regression Automatically

    Hi,
    Hi, I feel like this is a simple question, but I'm not getting any results from Mr. Excel where this question is cross posted: http://www.mrexcel.com/forum/excel-q...matically.html so I figured I'd try here.

    I tried the macro recorder tool, but it won't capture my actions within the data analysis pack for some reason.

    I have the data analysis pack installed on my Excel 2010/2013. I'm trying to create VBA code that does the following:

    1. Opens up the Linear Regression Tool
    2. For the Y Range: C1:Last Row in column C
    3. For X Range: A1:Last Row in column B
    4. Check off the box for Labels
    5. Check off the Confidence Level and ensure its set to 95
    6. Select Output Range and set it to M1

    Any ideas?

    Thank you very much

  2. #2
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,876
    something like this:
    Sub Macro6()
    lrA = Cells(Rows.Count, "A").End(xlUp).Row
    lrC = Cells(Rows.Count, "C").End(xlUp).Row
    lr = Application.Max(lrA, lrC)  'or Min? I expect the x and y have to have the smae number of values?
    Application.Run "ATPVBAEN.XLAM!Regress", ActiveSheet.Range("C1:C" & lr), ActiveSheet.Range("A1:A" & lr), False, True, 95, ActiveSheet.Range("M1"), False, False, False, False, ActiveSheet.Range("X3"), False
    End Sub
    Arguments (not authoritative):
    inpyrng As Variant, Input Y Range
    Optional inpxrng As Variant, Input X Range
    Optional constant As Variant, Constant is Zero true/false
    Optional labels As Variant, Labels true/false
    Optional confid As Variant, Confidence Level blank/nothing or 95 for 95%
    Optional soutrng As Variant, Output Range "" for a new worksheet letting excel name the sheet, "a new sheet" quote the name of the new sheet, nothing for a new workbook
    Optional residuals As Variant, Residuals true/false
    Optional sresiduals As Variant, Standardized Residuals true/false
    Optional rplots As Variant, Residual Plots true/false
    Optional lplots As Variant, Line Fit Plots true/false
    Optional routrng As Variant, ?? nothing
    Optional nplots As Variant, Normal Probability Plots true/false
    Optional poutrng As Variant ?? nothing

    I tried using named arguments like inpyrng:= but it wasn't having it, so the arguments have to be in order

    If you set a reference to atpvbaen.xls you can use the likes of:
    Regress ActiveSheet.Range("$C$1:$C$14"), ActiveSheet.Range("$A$1:$A$14"), True, True, 95, ActiveSheet.Range("$M$1"), True, True, True, True, , True
    Last edited by p45cal; 02-22-2016 at 03:56 AM.
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

  3. #3
    VBAX Guru mancubus's Avatar
    Joined
    Dec 2010
    Location
    "Where I lay my head is home" :D
    Posts
    2,644
    p45cal, typo i think?

    lrC >>"C"
    PLS DO NOT PM; OPEN A THREAD INSTEAD!!!

    1) Posting Code
    [CODE]PasteYourCodeHere[/CODE]
    (or paste your code, select it, click # button)

    2) Uploading File(s)
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) (multiple files can be selected while holding Ctrl key) / Upload Files / Done
    Replace company specific / sensitive / confidential data. Include so many rows and sheets etc in the uploaded workbook to enable the helpers visualize the data and table structure. Helpers do not need the entire workbook.

    3) Testing the Codes
    always back up your files before testing the codes.

    4) Marking the Thread as Solved
    from Thread Tools (on the top right corner, above the first message)

  4. #4
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,876
    Quote Originally Posted by mancubus View Post
    p45cal, typo i think?

    lrC >>"C"
    Yes, corrected, thanks.
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

  5. #5
    p45cal, thanks for the tips here! I'm getting a run-time error when the code looks for ATPVBAEN.XLAM. It can't find this file.

    Here's what I have for the slightly modified code:

    Sub Macro6()
    Sheets("Jo - Price Predictor").Select
        lrA = Cells(Rows.Count, "A").End(xlUp).Row
        lrB = Cells(Rows.Count, "B").End(xlUp).Row
        lrC = Cells(Rows.Count, "C").End(xlUp).Row
        lrD = Cells(Rows.Count, "D").End(xlUp).Row
        lrE = Cells(Rows.Count, "E").End(xlUp).Row
        lrF = Cells(Rows.Count, "F").End(xlUp).Row
        lr = Application.Max(lrA, lrF) 'or Min? I expect the x and y have to have the same number of values?
        Application.Run "ATPVBAEN.XLAM!Regress", ActiveSheet.Range("F1:F" & lr), ActiveSheet.Range("A1:A" & lrE), False, True, 95, ActiveSheet.Range("P1"), False, False, False, False, ActiveSheet.Range("X3"), False
    End Sub
    Any ideas? Also what is the "X3" for at the end of the code?

  6. #6
    Nevermind I got it! Apparently there is a separate Analysis Toolpak - VBA that you have to enable which is separate from the regular Analysis Toolpak. Very cool code! Thanks p45cal! You rock!

  7. #7
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,876
    Two points:
    The penultimate argument should not be there; that was just me experimenting.
    I think you should be using lr throughout the arguments not lrE at all.
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

  8. #8
    Ok I'll take out the X3 part. Why not use the lrE where it says:
    Application.Run "ATPVBAEN.XLAM!Regress", ActiveSheet.Range("F1:F" & lr), ActiveSheet.Range("A1:A" & lrE),
    I want the range to run A1 over to column E and down the last row. So my X range would be A1:last row of E. I don't have that right there?

    Also - is there anyway to have the macro press the ok on a prompt that comes up after I run this? What's happening is I'm overwriting a the ANOVA summary table that comes up when the macro pastes the ANOVA table in cell P1, which is exactly what I want it to do, however Excel always asks me if this is ok in a dialogue box.

    Thanks for the continued advice! I know this isn't the first one you've helped me on!
    Last edited by nirvehex; 02-24-2016 at 07:17 PM. Reason: additional question

  9. #9
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,876
    "So my X range would be A1:last row of E"

    ??!! So where is the Y range?
    I don't know that the regression can handle an X (or Y) range on more than one column or row!

    Points on a chart each have 2 parts, an X coordinate and a Y coordinate. No points are missing a coordinate, no points have 3 coordinates (it's not a 3D chart is it?). So I'd hazard a guess that the number of X coordinates should be the same as the number of Y coordinates, which means equal size ranges of the 2 columns being used for the X and Y ranges. If they are different sizes; let's say the bottom of column A was row 10 so the range for say the X-coordinates is A1:A10, and the bottom of row C is row 5 so the range for the Y coordinates is C1:C5, that's great, we have pairs of coordinates for the first 5 points. But which value pairs up with A6, A7 etc.? Blank cells? I don't know how the regression tool handles blanks, perhaps it calls them zeroes. That's absolutely fine if you want blank cells to be counted as zeroes but be aware that it will affect the regression. I somehow doubt you want that.

    So as a general rule of thumb, for a 2D chart, you always have pairs of coordinates so it's best if those pairs have values you're sure to want included in the regression calculation (rather than having random blank values) and that there should be an equal number of X and Y coordinates.
    This is why in my code suggestion I had the line:
    lrA = Cells(Rows.Count, "A").End(xlUp).Row
    to find the bottom cell with anything in it in column A and the line:
    lrC = Cells(Rows.Count, "C").End(xlUp).Row
    to find the bottom cell with anything in it in column C.
    To my mind (to anyone's actually), if these values were different then you wouldn't have a pair of coordinates for each point. So I had to use a row number which was the same for both column A and C, which gave rise to the line:
    lr = Application.Max(lrA, lrC) 'or Min?
    which takes the larger of the two values and puts it into lr, with the intention of using it for both X and Y ranges (Columns A and C) for the regression calculation.
    Note the comment I made on the same line:
    'or Min? I expect the x and y have to have the same number of values?

    Since I plumped for max, then there would be some blank cells in one of the ranges - more fool me, I should have gone for min, that way there'd be no blanks at the bottom of either column that I didn't know how the regression tool handled. I did pose that question in the comment.

    Now you've introduced columns B, D and F. I haven't the foggiest what part they play. If you're plotting column A against column F then you've correctly got
    lr = Application.Max(lrA, lrF)
    (although I would seriously consider using Min)
    so in your line:
    Application.Run "ATPVBAEN.XLAM!Regress", ActiveSheet.Range("F1:F" & lr), ActiveSheet.Range("A1:A" & lrE), False, True, 95, ActiveSheet.Range("P1"), False, False, False, False, ActiveSheet.Range("X3"), False
    there is every chance the X and Y ranges are not the same size, and this is what I think the line should look like:
    Application.Run "ATPVBAEN.XLAM!Regress", ActiveSheet.Range("F1:F" & lr), ActiveSheet.Range("A1:A" & lr), False, True, 95, ActiveSheet.Range("P1"), False, False, False, False, , False

    This is hard work.

    I'm going to test whether Application.displayAlerts=False will work for suppressing the overwriting question.

    edit:
    It doesn't.
    You can always clear the area just before the regression line, something like:
    Range("M1:U21").Clear
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

  10. #10
    VBAX Guru Kenneth Hobs's Avatar
    Joined
    Nov 2005
    Location
    Tecumseh, OK
    Posts
    4,956
    Location
    I would have tested post #1 had there been an attached file. Sounds like one was shared privately which is much more involved that post #1. There are lots of good points by p45cal. It is an interesting thread.

    GoalSeek and Solver routines can be handy.
    ' Solver, http://support.microsoft.com/kb/843304
    ' SolverOK, http://msdn.microsoft.com/en-us/libr...ice.10%29.aspx

    =Slope, =Intercept, and =ForeCast usually fit my needs for simple linear regression.

  11. #11
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,876
    Quote Originally Posted by Kenneth Hobs View Post
    Sounds like one was shared privately which is much more involved that post #1.
    Nope!

  12. #12
    VBAX Guru Kenneth Hobs's Avatar
    Joined
    Nov 2005
    Location
    Tecumseh, OK
    Posts
    4,956
    Location
    I see now p45cal that your post#9 covered several of the potential pitfalls one can encounter without seeing the data in post #5. Given that, it is even harder to help without an example file.

    I like to solve problems in simple steps which is why I usually ask for a simple file to help others. So, column A for x values and column B for y values with header labels in row 1 makes for the most simple approach.


    This relates in a very small way but is more involved than just simple linear regression. I used Slope and Intercept in VBA to determine alpha and beta inputs for a Weibull distribution in this thread. http://www.mrexcel.com/forum/excel-q...lculation.html

    =LinEst is also handy.

  13. #13
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,729
    Location
    FWIW the way I'm used to doing regression ....

    Using 2016, I could record a macro calling the data analysis pack, and then changed the recorded ranges

    Option Explicit
    Sub Macro2()
         Application.Run "ATPVBAEN.XLAM!Regress", ActiveSheet.Range("$F$1:$F$10"), _
            ActiveSheet.Range("$A$1:$E$10"), _
            False, True, 95, ActiveSheet.Range("$M$1"), _
            False, False, False, False, , False
    End Sub
    
    
    Sub Macro2mod()
        Dim rX As Range, rY As Range
        
        Set rX = ActiveSheet.Cells(1, 1).End(xlDown)
        Set rX = ActiveSheet.Cells(1, 1).Resize(rX.Row, 5)
        Set rY = ActiveSheet.Cells(1, 6).Resize(rX.Rows.Count, 1)
        
        Application.SendKeys "{enter}"  '   skip overwrite msg but I don't like SendKeys
         Application.Run "ATPVBAEN.XLAM!Regress", rY, rX, _
            False, True, 95, ActiveSheet.Range("$M$1"), _
            False, False, False, False, , False
    End Sub

    Since there was not workbook or other data, I just made some up, but I did end up with this

    Capture.JPG
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  14. #14
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,876
    Ah right! so you can have 1 Y-range (I tried and it won't let you have more than one column or one row), and multiple x-ranges and it returns with results for each one. Looks good.

    What I'm having difficulty with is why it isn't 1 x-range and multiple y-ranges; usually, when doing research, I would have expected a single x-range, as in for example plotting human growth, you might have time on the x-axis, call it age, running left to right at the bottom of the chart. Then at certain intervals you'd measure height and plot it. Say you went for yearly intervals; you'd measure the child annually and plot their height against the vertical y-axis and their age against the horizontal x-axis.
    And if you had several children and you wanted to compare them, you'd still have regular 1 year age intervals along the x-axis, and a variety of y values for each subject. What I very much doubt you'd do is have a static set of Y values, say .5 metre, 1 metre, 1.5 metres etc. and say right, let's record the time when they reach these targets, ('now look parents, keep a tab on your child's height, and when they reach 1 metre, for goodness sake don't forget to tell us!').

    What… am I missing? (Note this isn't an arrogant question. I realise I am missing something; the people who wrote this regression tool must know their art)

    (I realise it makes no difference which axes things are plotted on, it's just convention, but why, seemingly, fly in its face?)

    The other thing I noticed is that when you use just a single column for the X range, you get intercept and gradient, fine, but extend that to two columns and you only get the intercept of the second column, the gradient for the first column goes awry and you get a correct gradient for the second column.

    Clearly out of my depth on this one.
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

  15. #15
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,729
    Location
    Going back (way, way, .... way back) to my stat course ...

    Y is a dependent function of 5 different independent paramaters ( the X's), so LR comes up with the best set of 5 coefficients and an intercept that best 'fits' the line

    So the 'general' trend function would be something like this to 'fit' 5 independent parameters (a, b, c, d, and e)

    Y(a,b,c,d,e) = a*N18 + b*N19 + c*N20 + d*N21 + e*N22 + N17

    Capture.JPG

    The very few times I've ever used it, I wrapped LINEST like Ken suggested in a UDF and returned an array with the coefficients and intercept (a 1 x 6 array for this example)

    I found it easier to do further calculations that way
    Last edited by Paul_Hossler; 02-26-2016 at 11:11 AM. Reason: Rev terminology to dependnet / independent to be clearer
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  16. #16
    Knowledge Base Approver VBAX Wizard p45cal's Avatar
    Joined
    Oct 2005
    Location
    Surrey UK
    Posts
    5,876
    Right, thanks Paul. I'm warming to this.
    p45cal
    Everyone: If I've helped and you can't be bothered to acknowledge it, I can't be bothered to look at further posts from you.

  17. #17
    The send keys command was exactly what I needed, thank you! p45cal Thanks for your help as well. I guess either solutions works for the overwrite. Kenneth, thanks for that link about the Weibull distribution that was interesting!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •