PDA

View Full Version : [SOLVED] Arranging groups of numbers into a normal distribution

nirvehex
02-27-2016, 01:35 PM
Hi All,

I'm trying to figure out a formula or method to organize a list of numbers into a normal distribution shaped like a bell curve.

Currently I have a list of text representing location ID's in column A and sales figures in column C. In column B I'm trying to generate a sales tier or bin. I want to organize the bins into a normal data distribution shaped like a bell curve. So the middle sales tier would have the highest count of locations. I've come up with a few statistical methods which can be selected in my drop down in cell E3, but I can't seem to get it into a normal distribution. Based on this drop down it generates the (1) size of each bin, and (2) the number of bins. I've attached my demo file.

Is there anyway to write a formula to organize the data into a normal distribution? Any help would be appreciated.

Thank you!

p45cal
02-28-2016, 04:25 PM
From what i can see you have no normal distribution - the only way I can see might be to have different bin range sizes - but I suspect that'd be cheating.

nirvehex
02-28-2016, 05:56 PM
p45cal, No that is not cheating. That is exactly the point. I want to create # of bins, and bin size to achieve a normal distribution or as close to as possible. The things that can change are bin size and # of bins.

p45cal
02-29-2016, 03:05 AM
Currently you have stuck with equal bin range sizes per 'analysis'; what I was suggesting is that these bin ranges be different in size within one 'analysis'. The more I think about it the more I think it's cheating.
I'm no statistician but if you want me to expand on why I think it's cheating, I will.
Going down this route would explain to me why someone coined the phrase 'there are lies, damned lies, and statistics'. The stuff of politicians and climate so-called scientists. Curve-fitting.

p45cal
02-29-2016, 05:46 AM
I can't see any clustering of values of your raw data (apart from everything being between 1 million and 5 million):
15502

Average 3,014,882
SD:1,157,666

Fiddled bin ranges: nice 'normal' curve with 68% values within central 2 bars, 95% of values within the central 4 bars and 99.7% of reading included in all 6 bars.
15501
Cheating.

Paul_Hossler
02-29-2016, 07:34 AM
Sorting the data by Sales, normalizing it to a domain of [0.5, 0.5], and applying a NORM(), I can get this

It looks pretty at least, but I'm not sure exactly where you want to go

15505

nirvehex
02-29-2016, 01:47 PM
p45cal/Paul,

I definitely like where you guys are going with this!

Maybe I'll summarize again what I'm trying to do, as I feel I probably didn't explain as well as I could have.

I'm given a list of sales figures. I want to determine (1) # of groups and (2) group size.
Determine 1 and 2 to group data into groups that when graphed on a column chart, form the shape of a bell curve or close to it.

Paul I see you're plotting each individual point of the normalized data. What I'm hoping to do is to group these points into groups (A, B, C, D, E, F, G, H, etc...) where each letter represents a range and ultimately follows the shape of the bell curve like you have above. So maybe A is 0-20, B is 20-40, C is 40-60, D is 60-80, etc. p45cal used the Excel histogram tool, but I was hoping to come up with a formula that determines (1) the number of groupings needed and (2) the size/range of each grouping in order to obtain a bell curve like shape.

Any idea how to do this without the histogram tool? Paul, I feel like your method is close if I could just figure out a formula that determines # of grouping and grouping size and still follows your bell curve shape above.

Many thanks to both of you!

Paul_Hossler
02-29-2016, 03:59 PM
I calculated the Sales \$ at various standard deviation points and used those as bins for Frequency()

I really don't think you'll get a bell curve since as far as my limited stat skills go it doesn't appear to be normally distributed

But this is what I got

I'd go with p45cal's approach in #5

15512

nirvehex
02-29-2016, 06:39 PM
Ok. Thanks to both of you guys! I think I need to go to a stats refresher course myself, but this was super helpful.