PDA

View Full Version : Clean the working mess



IkEcht
05-12-2009, 05:29 AM

IkEcht
05-12-2009, 05:30 AM
somehow text got lost: here you go:
Hello again,

so I've got this working piece of code. It basicly generates some averages and shows me the best three scores (and scorers) (mind you, for one variable the best score is the lowest score, for the next variable the best score is the highest) within each of three different classes (city-sizes). A lot of missing data doesn't give any problems. But every time I look at the code I can see that is could be way more efficient, then again every time I try to rewrite it, I end up with another piece of code that seems to keep on repeating itself.

So basicly looking for someone willing to look at this code and provide me some hints how to rework this.

Thanks, IkEcht

ps workbook attached, data malformed, but you'll get the idea I hope. Code also inserted below (repeat-warning)

GTO
05-12-2009, 06:34 AM
Greetings IkEcht,

You have a nice bit of work in this, and I'm feel certain that someone here will be able to add to it as far as efficiency. For those of us who might be a tad thick-headed (me for instance), could you either zip an example with the variables in English, or, provide a list of what each translate to in English?

Again, this seems like some nice effort and the variables appear to be named meaningfully. Its just that the meaningful names are for naught for those of us who cannot understand them.

Hope to help,

Mark

IkEcht
05-12-2009, 07:02 AM
Some translations might be a bit tricky as the nice thing about not having English as a native language is I can use the variable "Som" which in English would translate "sum" and probably is restricted. But here you go for a workbook with translated variablenames

Bob Phillips
05-12-2009, 07:49 AM
It is impossible to look at that code without thinking that a) the code could be drastically reduced in size and in performance, b) a re-design would probably have even bigger benefits, and c) we would be stupid to even try without having a clue about the logic behind this monster.

I personally wouldn't even begin to chan ge it without knowing what your overall objective is in terms of data and required results, your previous explanation doen't do it, and also what the code logic is as I don't have time to work it out for myself.

IkEcht
05-12-2009, 08:15 AM
Let me (re)try. As you see the objective of this exercise. Reduce the size, increase performance and more even better benefits.

Basicly the data is about different aspects of Duch cities (column A on the data-sheet). The data-sheet itself is all well-aranged and holds nothing strange. The columns B through EM hold the actual data.

First of all the Dutch cities are divided into four categories based on their size (Colum B of the Data-sheet).
1. <20000 citizens
2. 20.000 - 49.999 citizens
3. 49.999 - 94.999 citizens
4. => 95.000 citizens

Now for each collumn B to ED I want to calculate a mean for the Netherlands as a whole (or at least of those cities for which we have data, others have a "-" by default). And a mean for the categories 2,3,4 above so for the three sizecategories > 20000 citizens. (rows 3-7 on the "bestengemiddelden" sheet).

<Leave this code out first, this is another part to win efficience, but not the first for sure>
There is only one complitating factor in this and that is that for columns CY to DG and DL to DT the average score should be averaged over the values in EE to EM. Probably better understood when looking at the code written to do this.
</end of leave this out code>

For these last three size-groups I also want to have the best three cities within this group with their score. (rows 8 to 13 for the 20-50k, rows 15 to 20 for the 50-95k and rows 22 to 27 for the 95k+ group).
If there are equal scores I want the cities to be mentioned together (for instance cell T10 on "bestenengemiddelden"). I then try to name only the best three cities, but if more cities have an equal score that is in the top they all get mentioned.

The last complicating factor is that it is not always true the highest score is the best score. It differs from column to column wether the lowest or the highest score is best.

Oh and this data later on is used to fill several reports. In fact the cities scoring best are presented as benchmarks to other cities in the same size-category.


I hope this is somewhat more clear. But feel free to ask for more information.