CPerry

03-05-2018, 04:13 PM

Multiple Linear Regression, logging and VBA

My problem involves using close to 200 independent formulas to predict movements of prices for 6 separate but connecting brands (Greyhounds in a single betting market prerace).

I have data coming into my spreadsheet from a programme automatically. From there I have formulas in Cells C40:H71 (6 x 32 = 192 cells), underneath I’d like to show which of these formulas are closest in predicting the actual moving value for each of 6 brands.

I know that my best bet will be to use multiple linear progression however for computing power purposes, is it best to firstly shorten my sample by finding each formulas Euclidean Distances? The data can be shooting into my spreadsheet at up to 20m/s so as time passes I’d like my system to ‘remember’ the previous 30 seconds worth of data (and have this figure easily able to adjust). Cell F5 (on the same tab named 'Bet Angel') has a one second interval timer to keep track of this.

The outputted data (which I will use Cells C73:H104) will then be ranked and (please correct me if my logic is off here and provide alternative solutions) with the – for example 10 – formulas that have the highest R(sq) value I will put these figures into another linear regression matrix to display a column of figures that I can use the FORECAST function with to predict the next 30 seconds worth of movement (by taking the angle of the line created in a graph to judge momentum of each brand)

Alternatively to using cells C73:H104, I have a new tab called ‘Data’ which could do this part of the system. Starting from Cell C5 and going along the 5th row (then go 30 rows down – 1 per second) and take an average back on the original sheet (Named Bet Angel). The whole of this system I would like to eventually run solely with VBA but for now part code and part formula will suffice.

If anybody has any ideas of how this can work and can provide formulas/vba code, please let me know! Can’t get my head round how this vital part will tick.

Cheers,

CPerry :)

My problem involves using close to 200 independent formulas to predict movements of prices for 6 separate but connecting brands (Greyhounds in a single betting market prerace).

I have data coming into my spreadsheet from a programme automatically. From there I have formulas in Cells C40:H71 (6 x 32 = 192 cells), underneath I’d like to show which of these formulas are closest in predicting the actual moving value for each of 6 brands.

I know that my best bet will be to use multiple linear progression however for computing power purposes, is it best to firstly shorten my sample by finding each formulas Euclidean Distances? The data can be shooting into my spreadsheet at up to 20m/s so as time passes I’d like my system to ‘remember’ the previous 30 seconds worth of data (and have this figure easily able to adjust). Cell F5 (on the same tab named 'Bet Angel') has a one second interval timer to keep track of this.

The outputted data (which I will use Cells C73:H104) will then be ranked and (please correct me if my logic is off here and provide alternative solutions) with the – for example 10 – formulas that have the highest R(sq) value I will put these figures into another linear regression matrix to display a column of figures that I can use the FORECAST function with to predict the next 30 seconds worth of movement (by taking the angle of the line created in a graph to judge momentum of each brand)

Alternatively to using cells C73:H104, I have a new tab called ‘Data’ which could do this part of the system. Starting from Cell C5 and going along the 5th row (then go 30 rows down – 1 per second) and take an average back on the original sheet (Named Bet Angel). The whole of this system I would like to eventually run solely with VBA but for now part code and part formula will suffice.

If anybody has any ideas of how this can work and can provide formulas/vba code, please let me know! Can’t get my head round how this vital part will tick.

Cheers,

CPerry :)