PDA

View Full Version : Steer me in the right direction?



RINCONPAUL
11-28-2015, 01:30 AM
I've got half a mill of Horse racing results to analyse! I just want to look at breaking the data up into 4 combinations of criteria, searching for a profitable combination? 3 of those criteria are static, like number of runners, distance, class of race, but the last criteria is dynamic! That's the current price.

Now back testing is cool, coz we know what the final starting price of each horse was, but prior to a race starting, we don't know that? However we know that it's going to be +- 2% of the implied probability of the current price just before the horses jump in a race. The conundrum is that you can't back test using an individual price, as there's thousands of possibilities, ever changing. Back testing using a range of current price +- 2% is OK, but if you have fixed ranges, like $2 ~ $3, then $3 ~ $4....., you get stuffed at the crossover points I.e current price is oscillating around $3....what range should it be in?

How can you make the whole thing more dynamic? Maybe a lookup of a pivot table...dunno...any suggestions gratefully accepted.

Aussiebear
11-28-2015, 05:04 AM
Well giving the simplicity of your thread,...... post a workbook, you've been here long enough to know it makes life so much more simple.

SamT
11-28-2015, 09:43 AM
but if you have fixed ranges, like $2 ~ $3, then $3 ~ $4....., you get stuffed at the crossover points I.e current price is oscillating around $3....what range should it be in?The price ranges are dynamic, too. That particular horse's price range would be 3 +- 1, OR 2 to 4, OR 3.00, .33 (3 +- 33%)

Personally, I would go with "3.00, .333" if it was in a single cell. Easy to Code for that, but still best to use two Columns

Center of price range and % swing tells a lot more than min and max.

RINCONPAUL
11-28-2015, 11:25 AM
Thinking about it overnight, if I use the data from 3 static criteria, say (1200m, Maiden, 7 starters), then create a pivot table of two columns. The Y axis would be price in tick size. The X axis would be $ profit/loss (there would be blank cells, so have to forecast outcomes to fill in blanks). Then use an index match of current price to search the Y axis and return the average profit/loss in a range 2% each side of that current price? So lets say the current price is $5. A function converts that to implied probability % = 20%. Another function +- 2% to that = 22% & 18%...reconverts to $ = $5.55 & $4.55. Then have an array formula like '{=AVERAGE(IF(($A$2:$A$500000<=5.55)*($A$2:$A$500000>=4.55),$B$2:$B$500000,""))}'? That will probably slow down excel markedly, as it would be doing same for every horse in current race (24 max), so a vba solution might be more expedient?

RINCONPAUL
11-28-2015, 12:22 PM
I've attached a workout sheet. I'll describe the problems as I go. Firstly filtering down data creates missing information. The prices in column A started as 420 No. but that left too many missing prices without an historical result, so I rounded them up to nearest whole number (1.55 = 2, 3.4 = 3), that reduced the subset to 47No. By rounding I have applied an undefined averaging percentage adjustment, so that's not good at all! I put that subset into a pivot table to sum the profit/loss and col B is the result.

Let's say I have an excel bot made up with all these indexes in a table. If the live price of a horse is cell D7 ($7.55), the adjacent functions add, subtract 2%, reconvert to $, then round up, then cell K7 averages that range = -$3,410 loss.

The downfall, I think, isn't so much the function trail, but the rounding up? If I return to original data, I get a more accurate result, but have to start interpolating the profit/loss for missing prices? So I might have $3.40 = $45 profit, $3.60 = $60 loss. If current price is $3.58.....what do I do? Hence why I rounded up to nearest whole number to lump them all into a bigger basket, but in so doing created inaccuracies in profit/loss forecast!

SamT
11-28-2015, 01:23 PM
The downfall, I think, isn't so much the function trail, but the rounding up?
Rounding has very little effect. You would occasionally include a price that would be exempt with no rounding.

Adding or subtracting from a reciprocal has a definite effect.

Consider this Formula in the third column "=1/(1/A1+B1)"



1
0.02
0.98


5
0.02
4.55


10
0.02
8.33


20
0.02

14.29


30
0.02
18.75


50
0.02

25.00


75
0.02
30.00


100
0.02
33.33


110
0.02
34.38


200
0.02
40.00




Now consider this Formula: "=A1*B1"



1
80%
0.80


5
80%
4.00


10
80%
8.00


20
80%

16.00


30
80%
24.00


50
80%
40.00


75
80%
60.00


100
80%
80.00


110
80%
88.00


200
80%
160.00

RINCONPAUL
11-28-2015, 01:51 PM
Hi Sam, thanks for your interest. I beg to differ on the consequences of rounding? Consider a price of $1.51. It has an implied probability of 66%, rounded to $2, it has an implied probability of 50%. Just lost 16% in the rounding and I want to work plus or minus 2%. I've got to think about this a bit more.

SamT
11-28-2015, 02:06 PM
Then don't round at all. Excel can handle it.

I can only work with the few tidbits that you share.

Tidbits? Orts? Morsels? Tiny little unrelated bits of information.

RINCONPAUL
11-28-2015, 02:43 PM
The water gets muddied too with Betfair, and their back lay spreads which you wouldn't be familiar with probably? The starting price of a horse is the most reliable indicator of performance, but no one knows what it will be with enough degree of certainty. Thinking outside the square, I might be better off converting current price to a Rank, and comparing current rank to the Open rank. It reduces the typical race to about 10 numbers to consider. Much more manageable?

Cheers again, over and out.