Consulting

Page 1 of 2 1 2 LastLast
Results 1 to 20 of 33

Thread: How to set up spreadsheet for regression analysis in Excel 2016

  1. #1
    VBAX Contributor
    Joined
    Jul 2018
    Posts
    174
    Location

    How to set up spreadsheet for regression analysis in Excel 2016

    I want to loop through a lot of data tables and do multivariate regression analysis on them.

    I am asking how to setup my data for data analysis toolpack. I have a lot of spreadsheets with #321 rows (with header) and a dynamic number of columns.

    In this thread a OFFSET loop is used:

    https://www.mrexcel.com/forum/excel-...taneously.html

    And I want to know how my data should be set up for the VBA OFFSET code in the link to work?

    Or if there is a better way to set up my data given my 321 rows and dynamic number of columns?

    For n = 1 to 800
    Application.Run "ATPVBAEN.XLAM!Regress", ActiveSheet.Range($C$5:$C$11").Offset(0,n), ActiveSheet.Range("$AFL$5:$AFM$11"), _
    False, True, , Worksheets("1"), False, False, False, False, , False
    Next
    Last edited by waimea; 12-17-2018 at 10:36 AM.

  2. #2
    VBAX Contributor
    Joined
    Jul 2018
    Posts
    174
    Location
    I haven't tried yet but I am thinking that I can read a table into an array and then regress the array?

    Perhaps even insert the independent variable into the table as the last column, then read the table into the array and then regress the array.

    Is this possible?

  3. #3
    VBAX Guru Kenneth Hobs's Avatar
    Joined
    Nov 2005
    Location
    Tecumseh, OK
    Posts
    4,956
    Location
    If you attached an example file, it would be easier to help.

    See if this helps. http://www.vbaexpress.com/forum/show...-Automatically

  4. #4
    VBAX Contributor
    Joined
    Jul 2018
    Posts
    174
    Location
    Hi Kenneth,

    I have looked at your link and it is very useful.

    I have several tables with dependent variables and I have one table with independent variables. The dependent and independent tables are not on the same sheet.

    In sheet1 I have 23 X in a table called Data, in sheet2 I have 3 X in a table called Data2.

    In sheet3 I have 8 Y in a table called Independent.

    I would like to loop through an array of table names (ex. array("data","data2", "more table names"))

    and regress all X and Y variables. I think that I can use a for each loop and offset to offset the Y variables?
    Attached Files Attached Files

  5. #5
    VBAX Contributor
    Joined
    Jul 2018
    Posts
    174
    Location
    I have recorded a macro where I use the function Regress:

    Sub Regression()
         Application.Run "ATPVBAEN.XLAM!Regress", ActiveSheet.Range("$B$1:$B$321"), ActiveSheet.Range("$B$1:$Q$321"), False, True, , "", True, False, True, True, , False
    End Sub
    I have two questions:

    1.) Is there anyway to use more then 16 x variables / columns in Data Analysis Regression?

    2.) How can I loop the input ranges for X and Y?

  6. #6
    Make your life easier and just combine the tables into one before starting the regression.
    Regards,

    Jan Karel Pieterse
    Excel MVP jkp-ads.com

  7. #7
    VBAX Contributor
    Joined
    Jul 2018
    Posts
    174
    Location
    I have recorded copying the y1 variable into the spreadsheet with the x1, x2, x3 values.

    I want to loop this so that it copies the y1, y2, y3, y4, y5,y6, y7,.. y16 variables into sheet2.

    
    Sub ChangeIndependentVariable()
        Range("Independent[[#Headers],[y1]]").Select
        Range(Selection, Selection.End(xlDown)).Select
        Selection.Copy
        Sheets("Sheet2").Select
    
        Range("E1").Select
    
        Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
            :=False, Transpose:=False
        Range("Data2[[#Headers],[Name]]").Select
    End Sub


    Sub ChangeIndependentVariableLoop()
    
    Dim i As Integer
    
        For i = 1 to 7
    
        Range("Independent[[#Headers],[y& i]]").Select
        Range(Selection, Selection.End(xlDown)).Select
        Selection.Copy
        Sheets("Sheet2").Select
        Range.Offset(0,i).Select
    
    
        Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
            :=False, Transpose:=False
    
        Range("Data2[[#Headers],[Name]]").Select
    Next i
    
    End Sub
    Something like this?

  8. #8
    VBAX Contributor
    Joined
    Jul 2018
    Posts
    174
    Location
    I am looking at this post from the thread you linked earlier, http://www.vbaexpress.com/forum/show...l=1#post339025

    Sub Macro2mod()
        Dim rX As Range, rY As Range
        
        Set rX = ActiveSheet.Cells(1, 1).End(xlDown)
        Set rX = ActiveSheet.Cells(1, 1).Resize(rX.Row, 5)
        Set rY = ActiveSheet.Cells(1, 6).Resize(rX.Rows.Count, 1)
        
        Application.SendKeys "{enter}"  '   skip overwrite msg but I don't like SendKeys
         Application.Run "ATPVBAEN.XLAM!Regress", rY, rX, _
            False, True, 95, ActiveSheet.Range("$M$1"), _
            False, False, False, False, , False
    End Sub
    The use of SendKeys is really clever and I haven't seen that before.

    How could I loop this code given my 321 rows and different number of columns? (From 1 to 16 columns)

  9. #9
    VBAX Contributor
    Joined
    Jul 2018
    Posts
    174
    Location
    I found this link https://stackoverflow.com/questions/...ression-in-vba with the following code:

    Dim x as WorkSheet : Set x = ActiveSheet '<-- I'd recommend to set it with the sheet's name (i.e. x = Worksheets("Xdataset"))
    
    Dim col as Range
    
    For Each col in x.Range("F3:CU22").Columns
        Application.Run "ATPVBAEN.XLAM!Regress", col, 
        x.Range("$C$3:$E$22"), False, False, , col.Cells(0).Value, False, False, False, False, , False
    Next
    I don't understand how the loops works? What is x.Range("F3:CU22).Columns, what is x.Range("$C$3:$E$22") and what is col.Cells(0)?

  10. #10
    VBAX Guru Kenneth Hobs's Avatar
    Joined
    Nov 2005
    Location
    Tecumseh, OK
    Posts
    4,956
    Location
    Looks like you will need 8 runs per X dataset listobject. If your listobjects are laid out as in the example file, I would put the output to the right of the usedrange on row 1 on each data sheet with a blank column between each.

    I will work up an example using your file. You can change your regress options to suit. Recording a macro is the usual first step.

  11. #11
    VBAX Contributor
    Joined
    Jul 2018
    Posts
    174
    Location
    Hi Kenneth,

    this is very kind of you. I am really struggling with this and all help is very much appreciated.

  12. #12
    VBAX Guru Kenneth Hobs's Avatar
    Joined
    Nov 2005
    Location
    Tecumseh, OK
    Posts
    4,956
    Location
    Regarding question 1 in post #5, no, 16 is the Regress limit. If you want to do more, you will have to pursue other means. e.g. SAS, R, etc. You can view those other ways by searching for "excel regress 16".

    For question 2, after removing the data columns of more than 16 in that one listobject, you can try this code in a Module. It is more involved due to listobjects.

    Sub Main()  
      Dim ws As Worksheet, rX As Range, rY As Range, rO As Range
      Dim lO As ListObject, yLO As ListObject, xLO As ListObject
      Dim y As Range
      
      Application.DisplayAlerts = False
      
      'Set ListOjbect("Independent")
      For Each ws In Worksheets
        For Each lO In ws.ListObjects
          If lO.Name = "Independent" Then
            Set yLO = lO
            Exit For
          End If
        Next lO
        If Not yLO Is Nothing Then Exit For
      Next ws
      If yLO Is Nothing Then GoTo EndSub
      
      'Range for set of Y's
      Set y = yLO.DataBodyRange.Columns(2).Resize(, yLO.DataBodyRange.Columns.Count - 1)
      
      'Iterate dataset listobjects and regress
      For Each ws In Worksheets
        'Assume 1 ListObject per sheet
        If ws.ListObjects.Count = 1 Then
          If ws.ListObjects(1).Name <> "Independent" Then
            Set xLO = ws.ListObjects(1)
            Set rX = xLO.DataBodyRange.Columns(2).Resize(, xLO.DataBodyRange.Columns.Count - 1)
            
            For Each rY In y.Columns
              'Set range for regress output
              Set rO = ws.Cells(1, LastNBCol(ws.Cells) + 2)
              'Regress
              Application.Run "ATPVBAEN.XLAM!Regress", rY, _
                rX, False, False, , rO, _
                False, False, False, False, , False
            Next rY
          End If
        End If
      Next ws
      
    EndSub:
      Application.DisplayAlerts = True
    End Sub
    
    
    Function LastNBCol(rng As Range) As Long
      Dim LastColumn As Integer
      If WorksheetFunction.CountA(Cells) > 0 Then
        'Search for any entry, by searching backwards by Columns.
        LastColumn = rng.Find(What:="*", after:=rng.Cells(rng.Rows.Count, rng.Columns.Count), _
                      SearchOrder:=xlByColumns, _
                      SearchDirection:=xlPrevious).Column
        End If
        LastNBCol = LastColumn
    End Function

  13. #13
    VBAX Contributor
    Joined
    Jul 2018
    Posts
    174
    Location
    Hi Kenneth,

    you really went above and beyond what I was expecting, this code is great and I am really grateful for your time and your help. You are truly a VBA Guru!

    I am looking at the code and I understand some of it. It going to be a long night before I understand all of it.

    This code can handle up to 16 Y (independent variables) and 16 X (dependent variables)?

  14. #14
    VBAX Guru Kenneth Hobs's Avatar
    Joined
    Nov 2005
    Location
    Tecumseh, OK
    Posts
    4,956
    Location
    The number of Y sets is only limited by the number of columns available for output. Or, it could easily be re-coded to poke the regress output into rows until you hit the million row limit....

  15. #15
    VBAX Contributor
    Joined
    Jul 2018
    Posts
    174
    Location
    Thank you again, this is very kind of you Kenneth!

    I'm not sure if I understand it completely, can I have multiple tables with additional X values with a max of 16 X in them?

  16. #16
    VBAX Guru Kenneth Hobs's Avatar
    Joined
    Nov 2005
    Location
    Tecumseh, OK
    Posts
    4,956
    Location
    Yes. I coded it for one ListObject per sheet. It can be coded otherwise. So, rather that output to row 1, it could be coded to put output to the right on the same row as the ListObect's title row which may or may not be row 1 on the sheet.

  17. #17
    VBAX Contributor
    Joined
    Jul 2018
    Posts
    174
    Location
    This gets better and better!

    And all of this happens in your Function LastNBCol(rng As Range) As Long?

  18. #18
    VBAX Guru Kenneth Hobs's Avatar
    Joined
    Nov 2005
    Location
    Tecumseh, OK
    Posts
    4,956
    Location
    That function's purpose is to get the last non-blank column on the sheet. I set the output top left cell to be 2 columns to the right of that. That gives a blank column between the outputs.

    IF you want output row to the some row as the listobject's title row rather than the sheets row 1:
    'Set rO = ws.Cells(1, LastNBCol(ws.Cells) + 2)
    Set rO = ws.Cells(xLO.HeaderRowRange.Row, LastNBCol(ws.Cells) + 2)

  19. #19
    VBAX Contributor
    Joined
    Jul 2018
    Posts
    174
    Location
    This is almost like Christmas morning and getting to open the Christmas presents early!

    If I wanted to automate the regression output and automatically lookup the "P-value "against the "Significance F" I would use FIND and OFFSET?

    FIND to find SUMMARY OUTPUT and the OFFSET to OFFSET against that cell and VLOOKUP against a table with P-values?

    Would it be possible to autofit the output columns? Just to make sure I get what the code is doing, it takes each Y value and regresses them on the tables with X values one X column at the time? Or one Y value on all of the X values?
    Last edited by waimea; 12-19-2018 at 03:42 AM.

  20. #20
    VBAX Guru Kenneth Hobs's Avatar
    Joined
    Nov 2005
    Location
    Tecumseh, OK
    Posts
    4,956
    Location
    I guess one could autofit after each regression or at the end of the Main() macro or separately as shown in the code here.

    To answer regress each y set vs one x set or the many x sets (16 max), you can run this code. As you can see in the result on Sheet1, it regresses all x sets together in Main().

    As for looking up the stats after the run, I would probable add a unique value in Main(). e.g. SUMMARY OUTPUT Y1, SUMMARY OUTPUT Y2, etc. I would that to make a formula lookup or a macro lookup, easier. Of course during the Main() run, we could add code to return stat parts as needed.

    Sub AutoFitAll()  
      Dim ws As Worksheet
      For Each ws In Worksheets
        ws.UsedRange.Columns.AutoFit
      Next ws
    End Sub
    
    
    Sub SingleLR()
      Application.Run "ATPVBAEN.XLAM!Regress", Sheet2.Range("B2:B321"), _
                Sheet1.Range("B2:B321"), False, False, , Sheet1.Range("S35"), _
                False, False, False, False, , False
    End Sub
    Note that Sheet2=Worksheets("Sheet3"). Sheet2 is the codename (object) for the sheet whereas "Sheet3" is the tab name.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •