PDA

View Full Version : How to analyze big data with Excel



waimea
04-10-2019, 08:29 AM
I have a lot of data but I am not sure how to analyze it with Excel.

Should I use regressions? Pivot tables? Something else?

What is the first step in analyzing big data??

Bob Phillips
04-10-2019, 08:37 AM
Power Query and Power Pivot and pivot tables.

waimea
04-10-2019, 08:43 AM
Hi xld,

thank you for your answer!

I have tabular data and my first data table has 1600 rows and 100+ columns.

In what way should I try to transform my data before analysis? Should it be tabular or non-tabular?

I am trying to compare cities against each other and find out what city-city combinations is the best.

Bob Phillips
04-10-2019, 09:37 AM
That amount of data is trivial for these tools.

You should look to denormalise it so that you can pivot the data and get the sort of analysis you require. Power Query can import that data as it is and unpivot it so that it can then be pivoted.

waimea
04-10-2019, 09:47 AM
Hi again xld,

thank you for your reply!

I have unpivoted my columns using PowerQuery and I now have 3 columns, city name, metric and value.

How should I proceed?

Bob Phillips
04-10-2019, 10:20 AM
On the Home tab, click the arrow on the Close & Load button (Furthest left) and select Close & Load To. In the popup dialog, select connection only and Add to Datamodel. Exit back to Excel.

Now in Excel, go to Insert>Pivot Tables, select 'Use this workbook's Data Model' from the popup dialog, you should see your query there which you can now use in a pivot table.

waimea
04-10-2019, 10:35 AM
Hi xld,

thank you for your instructions, I now have a data model with my data in it but when I click on metric in my field list I get a list of all metrics.

Is it possible to single out them one by one in the way my data started out like?

Bob Phillips
04-10-2019, 10:43 AM
Put the metric in the rows pane not the value pane.

waimea
04-10-2019, 11:51 AM
Hi xld,

I have followed your instructions but I am not sure on how the unpivoted data when pivoted in a pivottable is better?

After I have my datamodel should I create measures to analyze the data in PowerPivot? Or try to use regression on the data?

Bob Phillips
04-10-2019, 01:28 PM
What you have is a set of raw data that you can now analyse. How you analyse it I cannot say, I do not know the data or what you are trying to achieve.

waimea
04-10-2019, 01:45 PM
Hi xld,

thank you for your reply. I don't know how I should analyze my data.

I am comparing cities against each other on a lot of different metrics, more is better and there is a min and a max score.

Say I have cities A, B and C.

Then I am comparing City A vs City A. City A vs City B. City A vs City C.

City B vs City A, City B vs City B. City B vs City C.

City C vs City A. City C vs City B. City C vs City C.

I want to find the combination of highest score between all combinations of the different cities.

waimea
04-13-2019, 12:51 AM
Any suggestions on how I can analyze data that is as in the above post??

snb
04-13-2019, 07:30 AM
Any suggestions on how I can analyze data that is as in the above post??

You can't since these are only strings.
Please post a sample Excel workbook.

Paul_Hossler
04-13-2019, 04:05 PM
Any suggestions on how I can analyze data that is as in the above post??

Why compare a city against itself?

If you have 10 cities (C01 - C10) and 15 metrics (M01 - M15) to compare, then I'd figure you'll have 15 x (10 x 9) / 2 = 675 'compares' comparing each metric for each city pair

e.g.


C01 - C02 - M01, C01 - C02 - M02, … C01 - C02 - M15
C01 - C03 - M01, C01 - C03 - M02, … C01 - C03 - M15

….




C09 - C10 - M01, C09 - C10 - M02, … C09 - C10 - M15







Sounds like you started with a 10 row, 16 column array of data using my made up numbers above ??

waimea
04-14-2019, 01:25 AM
Hi Paul,

thank you for your reply! I am comparing a city against itself only because I loop trough "all items" in two comboboxes.

There is no value in comparing city A with city A.

I am not sure what I am starting with at the moment. I will have to look at my file again.

I will post a sample workbook. Do you have any suggestions on how I should analyze the "compares"?

Paul_Hossler
04-14-2019, 07:54 AM
As an CONCEPTUAL EXAMPLE .... :yes … you could look at something like this

10 cities (GREEN), 3 metrics (YELLOW)with each metric having a weighting factor (ORANGE)

The BLUE are just random numbers 0-100 but could be your raw scoring of City-X against City-Y for Metric-M (Col A - Col K)

Each raw score is weighted (Col M - Col N) in Col P - Col W

Weighted scores are summed Col Y - Col Z

The weighted summary is sorted High to Low in Col AB - Col AC

I used 2 UDFs and array functions

F9 to recalc and see what happens



24062

waimea
04-14-2019, 08:18 AM
Hi Paul,

thank you for your reply! You are a conceptual example genius!

I am looking at your workbook right now and I am trying to understand it!

How did you determine the weights?

Paul_Hossler
04-14-2019, 12:09 PM
Hi Paul,

thank you for your reply! You are a conceptual example genius!

I am looking at your workbook right now and I am trying to understand it!

How did you determine the weights?

I just put in anything for the weights - in real life, you'd have to figure out what each metric was 'worth' in your analysis

This example determined which City Pair was 'best', but it's probably possible to figure out which City was highest scored

waimea
04-14-2019, 11:59 PM
Hi Paul,

thank you for your reply.

How would you go about finding the city/cities with the highest score?

Paul_Hossler
04-15-2019, 07:46 AM
The 'best' City Pair is in AB2 in my example above. The City01-City02 has a score of 70 for Metric-01

You could give City01 a 70 and City02 a 70 for Metric-01, and then weight Metrics, and, etc.

24066


This makes City-04 the 'best' with a composite score of 530.95

It really all depends on how you weight the metrics and how you choose to determine the 'best'

waimea
04-15-2019, 09:42 AM
Hi Paul,

thank you for your reply! I really like your idea with a comsposite score.

I will download your workbook later today and see if I can recreate it.

Thank you again!

waimea
04-18-2019, 09:04 AM
Hi again Paul,

I have now had a good look at your two conceptual examples and both are breathtakingly good! I am looking at your UDF right now.

However, my data output from my macro is not in the same form. Is there an easy way to change my output macro to suit your conceptual examples?

I have condensed my data to only have 3 metrics.

1. Make my data the same as in your example
2. Make the code aware of how many cities there are

eg. my first batch of cities has 40 cities, the next batch has 59, the next 26 etc.. down to a mere 1 city.

All suggestions are welcome.

Paul_Hossler
04-18-2019, 12:31 PM
Sure

However, you would not want to include city pairs (CityX CityY) where (a) X = Y and where (b) X > Y

(A) X = Y would double count (e.g. City01 City02), and

(b) X > Y would have been counted earlier where X < Y (i.e. City02 City 04 counted, but City04 City 02 would be counted again)

Weighting factors?

Did you want to score city pairs or each city by itself?

waimea
04-18-2019, 12:57 PM
Hi Paul,

thank you for your reply! You are too clever for me and this is above and beyond what I could ask for!

I want to score each city for itself. Since the combination of city pairs only is a comparison and I want to find the cities with the highest composite score.

I am not sure about Weighting factors, I was planning to use it as you did in your conceptual example with one weight for each metric.

Paul_Hossler
04-18-2019, 01:54 PM
Q: what does the originalest data look like? You said output of your macro, implying that it's processed or reformated somehow

Q: I assume the cities have real names and not -nn numbers ("New York City", Los Angeles", "London", ...)

Q: I assume the metrics have real names ("Size", "Quality of Life", Good Fishing", ...)



Your basic premise (as I understand it) is to (A) compare two cities and 'grade' them against each other using various metrics (e.g. "LA vs NYC on size is 66")

It is NOT to (B) just 'grade' each city on the metrics
(e.g. "LA on size is 66" and "NYC on size is 75")

I've never done it the A-way before, just the B-way

waimea
04-18-2019, 02:14 PM
Hi Paul,

thank you again for your reply! It is getting late here and I will make a sample workbook tomorrow.

The cities have real names, the metrics have real names and your premise is correct (I think).

Do you see any advantages or disadvantages of the A-way?

Paul_Hossler
04-18-2019, 03:53 PM
Do you see any advantages or disadvantages of the A-way?

A way: I just think it'll be harder for you to score the city pair comparisons

Example: say that the Metric = "Quality of Life"

LA - NYC = 56

LA - London = 67

London - Sydney = 78

The raw data using the "Each City" approach in post #20 would have

LA = 123

London = 145

etc.





B way: I just think it'll be easier for you to score each city for each metric

Example: say that the Metric = "Quality of Life" and the score for individual cities are


LA = 56
NYC = 67
London = 78
Sydney = 89

The sum by city and weighted would give you the composite score


Doesn't have to be 0 - 100, I've seen just five values used -2, -1, 0, 1, 2

waimea
04-19-2019, 12:34 PM
Hi Paul,

thank you for your reply! I have read all of your posts again and you are spot on!

I don´t want to double count and I don´t want to use the A way.

I have condensed my metrics down to two metrics.

Metric A is the score of city vs city. Metric B is the score of city vs state. I am planning to weight my metrics 50% and 50%. So city01 vs city02 in metric A could be 55 and metric B could be 77. Metric A is a combination of other metrics and a score of 55 means that there are 55 comparisons where metric A is above the City 2 value and 77 is also a combination of metrics where metric B is above or equal to the state mean.

Ex. City01 has a value of size that is greater then City02 and also greater then the state mean. The state mean is calculated from all cities in that state.

So what I am trying to do now is to get my output (in the attachment above) to be as in your example.

But I am not sure how?

Paul_Hossler
04-20-2019, 05:23 AM
So city01 vs city02 in metric A could be 55 and metric B could be 77. Metric A is a combination of other metrics and a score of 55 means that there are 55 comparisons where metric A is above the City 2 value and 77 is also a combination of metrics where metric B is above or equal to the state mean.


Been thinking on this

Metric 1 - So if

1 vs 2 = 55
1 vs 3 = 45
1 vs 4 = 65

2 vs 3 = 70
2 vs 4 = 60

3 vs 4 = 30

how would you generate a single rank-able score for

1 = ?
2 = ?
3 = ?
4 = ?

Ranking a PAIR of cities is easy: 2-3, 1-4, 2-4, 1-2, 1-3,3-4

But I don't know what that tells you



Metric 2 is much more straight foward


1 = 35 above state mean
2 = 15 above state mean
3 = 45 above state mean
4 = 20 above state mean

Rankings are City 3, 1,4, 2

waimea
04-20-2019, 06:43 AM
Hi Paul,

thank you for your reply! I am looking at sports rankings where they count wins, draws or losses.

The best city would be the city with the most wins?

I am working on a sample workbook with my macro that creates the output, I will post it as soon as I get it more complete.

waimea
04-20-2019, 08:21 AM
Here is a sample workbook which represents my actual workbook pretty good.

I am using a macro to loop through two comboboxes and store the value of metric A and metric B in a new worksheet called Score.

Is is the table in Score that I want to analyze and find the best combination of cities.

waimea
04-21-2019, 01:14 AM
Any suggestions on how I can analyze this data?

The sample workbook uses the state with the least number of cities.

Paul_Hossler
04-22-2019, 06:17 AM
Any suggestions on how I can analyze this data?

The sample workbook uses the state with the least number of cities.

Didn't forget -- got busy this weekend

Trying a few approaches

Paul_Hossler
04-23-2019, 07:26 AM
OK, see if this floats your boat. I didn't want to go too far in case it wasn't what you were looking for


For the first composite metric

I compared the score for each metric for each combanation of cities, and if CityX+MetricM minus CityY+MetricM was positive then add 1 to CityX/CityY score, if negative subtract 1



For the second composite metric

I compared the score for each metric for each combanation of cities, and if CityX+MetricM minus Average for MetricM was positive then add 1 to CityX score, if negative subtract 1

24120

The worksheet "Dump" is not really needed - I just used it to display intermediate array results

The worksheet "Score" is not automatically created yet

waimea
04-23-2019, 02:02 PM
Hi Paul,

thank you for your reply! It is getting late here and I am not sure that I follow what you are doing.

I will take a fresh look tomorrow at your workbook. I think that you float my boat! Thank you again for your response!

waimea
04-24-2019, 03:20 AM
Hi again Paul,

I have now had a long look on your workbook and I don't understand all of your code in mod_PH1.

However, I think that I understand your output.

You are taking "CityX+MetricM minus CityY+MetricM " to avoid duplicates?

I don't understand what you mean with +1 or 0 and NOT +1 and -1?

Thank you again for your help with this!

EDIT: I am not sure but I think that it is the best way to start with "Score" as input?

Paul_Hossler
04-24-2019, 06:39 AM
Like I said, I'm struggling with meaningful analysis for city pairs. One city at a time is easy


What I tried was ...

1. Comparing Karlshamn against Karlskrona for 20 metrics, if Karlshamn > Karlskrona for 7 and Karlshamn < Karlskrona for 12 and even for 1, then I scored the Karlshamn - Karlskrona pair as -5 ( i.e. 7 - 12 + 0)

I could just count the 'greaters than' and then Karlshamn would = 0 and Karlskrona = 12


2. I started with the 'Data' sheet and 'Score' was the last. I'd format 'Score' better to make it more understandable ("easy enough for a VP to read" :) )


3. I think I'll take a shot at just a single city comparison and see what it looks like

waimea
04-24-2019, 07:44 AM
Hi Paul,

thank you for your reply. I am very grateful for your help on this.

It isn't an easy ask and I am also not sure on how to do a meaningfull analysis.

What makes the city pair analysis hard? I am not sure that I follow on the why.

Paul_Hossler
04-24-2019, 08:14 AM
The City Pair just feels like comparing apples and oranges


I don't see knowing that (Karlskrona - Olofström) is 'higher' than (Karlshamn - Sölvesborg) is useful, but then again it's not my call





City1 - City2 Summary
#Metrics


Karlskrona - Olofström
6


Ronneby - Sölvesborg
4


Olofström - Ronneby
-2


Karlskrona - Sölvesborg
-2


Karlshamn - Olofström
-2


Olofström - Sölvesborg
-4


Karlskrona - Ronneby
-5


Karlshamn - Karlskrona
-5


Karlshamn - Ronneby
-8


Karlshamn - Sölvesborg
-10




It's just that I'm used to (in my tiny world) more of a single parameter evaluation

This is just a simple one city at a time scoring matrix to rank each city (not each pair)


24136

waimea
04-24-2019, 08:42 AM
Hi Paul,

thank you for your reply! I will start with your city by city analysis and see what I get.

Maybe you are right about apples and oranges! A while back I posted a question about regressions, could regressions be used on the city pairs somehow?

Paul_Hossler
04-24-2019, 09:25 AM
Well, it just seems to me that it's hard to take a 2 cities' metric scores and use them

Now if you had a score for each city pair, then it seems you could evaluate the pair which is now just a single paramater

24138


So if comparing Karlshamn to Karlskrona for Metric1 was 82, then I think it would be valid to compare that pair against the others

The hard part would be to figure out how to come up with the 82

I took a shot at trying to derive a city pair score from each combination by subtracting the city score from each other in Workbook to be analyzed_2.xlsx and

waimea
04-25-2019, 10:26 AM
Hi Paul,

thank you for your help with this.

I am going to read the entire thread again and see if I can find a good and meaningfull way to analyze this data.

I am not sure but did you find a way to "select" the possible combinations without reapeating items?

As in your picture above with 10 combinations instead of the whole shebang.

Paul_Hossler
04-25-2019, 12:07 PM
Yes

Analysis depends on what you have and what you're trying to determine

1. If you can determine a parameter for each city pair, i.e. how Karlshamn compares to Karlskrona (-5 in A:B) meaning that Karlshamn is -5 below Karlskrona, that's what I think you're after

2. If you can only determine a parameter for single cities, i.e. Karlshamn (E2 = 45) and another number for Karlskrona (E3 = 82), then you / we need to calculate a single 'value' for the Karlshamn - Karlskrona pair

2a. Subtract (H:H) and if negative score that pair as -1, positive +1. Total (N:N) and the Karlskrona - Olofström pair 'wins' ( =3)

2b. Subtract (H:H) and if negative score that pair as 0, positive +1. Total (T:T) and the Karlskrona - Olofström pair 'wins' ( =3)

3. If you decide you don't need to compare city pairs, but only each city by itself, that's a whole lot easier

24144

waimea
10-09-2019, 03:40 AM
Hi Paul,

sorry for awakening a old thread!


I am still working on this and I have to look at your conceptual genius ideas again!


How do you mean with city by itself?

waimea
10-09-2019, 03:47 AM
As an CONCEPTUAL EXAMPLE .... :yes … you could look at something like this

10 cities (GREEN), 3 metrics (YELLOW)with each metric having a weighting factor (ORANGE)

The BLUE are just random numbers 0-100 but could be your raw scoring of City-X against City-Y for Metric-M (Col A - Col K)

Each raw score is weighted (Col M - Col N) in Col P - Col W

Weighted scores are summed Col Y - Col Z

The weighted summary is sorted High to Low in Col AB - Col AC

I used 2 UDFs and array functions

F9 to recalc and see what happens



24062

I think that this is a good way to go, with weights.

How can I go from my tabular data to a matrix as in your screenshot?

Edit:

I have a small formula that calculates the difference in percentage units between two city pairs.

Say City1 vs City2. City1 could be 30%, and 50% better. Does this help in anyway?

Paul_Hossler
10-09-2019, 08:19 AM
There's the raw data and the metric weights on WS Data

The report sheet and the difference between city pairs weighted

Minor final formatting to report sheet so far, since not sure as to how you want the end product. If it's easy enough, let me know


25246

waimea
10-09-2019, 08:48 AM
Hi Paul,

thank you for your reply and for helping me!

I am looking at your code and I don't understand all of it.

I am at a stage where I can decide what data that is outputted as raw data.

I have score1, score2, total score and categories that are all numeric. There are 14 categories and each category contains a different number of metrics.


I guess that earlier I only saved score1, score2 for city1 and perhaps I should store score1, score2 for city2 aswell?

If you were to select raw data, what kind of data is meaningfull to analyze? And what kind of file structure would you select?

Paul_Hossler
10-09-2019, 09:02 AM
You never mention 'Categories' so I don't know how they would play

My Assumptions --

1. For each metric

2. Compare each city against all others by subtracting scores

3. Weight the difference by the metric weight

4. Highest score wins


In my screen shot above, row 7 City K and R have score of 82 and 38 for metric 1 which has a weight of .05

The weighted difference is 2.2 which is the highest for metric 1


Again ....



I have a small formula that calculates the difference in percentage units between two city pairs.

Say City1 vs City2. City1 could be 30%, and 50% better. Does this help in anyway?


I still don't see how your scoring for city pairs is going to work

To me it makes much more sense to score each city in each metric, weight the score, and them pick the city with the highest weighted score in each metric

waimea
10-09-2019, 09:14 AM
Hi Paul,

I see what you mean and I think that we should try it the way that you are suggesting and forget about the city pairs.

I want to use 4 metrics, score1 and score2 for city 1 and score1 and score2 for city2.

I then add the scores to get 2 total scores which I then weight with weights? This would give me 1 weight for score1 and 1 weight for score2? And perhaps a weight for total score? Or is that just adding useless information?



Name
Score1
Score2
Total score
Score1
Score2
Total score


City1 City1
37
22
59
9
36
45


City1 City2
16
78
44
29
63
92


City1 City3
48
25
29
79
79
158


City1 City4
80
25
39
31
13
44


City1 City5
61
86
81
41
61
102


City1 City6
73
73
42
48
38
86


City1 City7
73
86
0
9
13
22




This is how I think that my raw data will look like.

waimea
10-09-2019, 09:22 AM
I can change the raw data layout to something more approtiate if you think it is better?

Paul_Hossler
10-09-2019, 09:27 AM
Still looks city-pair oriented (before my red lines)

25249





Example

waimea
10-09-2019, 09:36 AM
Hi Paul,

I like your example and the use of weights and then picking the city with the highest total score.

I think that I have to modify how I create my dataset.



'Cityname1 & Cityname2
wkarr(i, 1) = City1 & " " & City2

'Score1
wkarr(i, 2) = sht.Range("I65").Value2

'Score2
wkarr(i, 3) = sht.Range("J65").Value2

'Score1
wkarr(i, 4) = sht.Range("S65").Value2

'Score2
wkarr(i, 5) = sht.Range("T65").Value2


With Sheets("Rawdata")

R = Cells(Rows.Count, 1).End(xlUp).Row + 1
.Cells(R, 1).Resize(UBound(wkarr, 1), UBound(wkarr, 2)) = wkarr

End With



Should I be moving in the other direction?


'Cityname1 & Cityname2
wkarr(1, i) = City1 & " " & City2

'Score1
wkarr(2, i) = sht.Range("I65").Value2

waimea
10-09-2019, 10:00 AM
The score1 of say 31 is calculated from a country average and the score2 is calculated on city1 vs city2.

So I am not sure on how to proceed. Without the city pairs there is no score2.

Perhaps there is another way to calculate the scores? Say percentage increase from 2017 to 2018?

Or something else?

Paul_Hossler
10-09-2019, 12:29 PM
The score1 of say 31 is calculated from a country average and the score2 is calculated on city1 vs city2.

So I am not sure on how to proceed. Without the city pairs there is no score2.

Why do you need a 'score2'?



I still think you're thinking city pairs and I'm not sure I'd do it that way

Very simple example of how I'm used to doing a weighted score matrix

1. You're looking at 5 cars (green)

2. Five factors are important (yellow)

3. Since some are more important than others, you weight them (blue)

a. Total = 100% , from 0 - 10, with 10 being the best

b. Some are subject (e.g. color) and some can be quantitative (e.g. gas mileage, 1-10 mpg = 2, 11-15 mpg = 3, ....., > 40mpg = 10)

4. Score each car on each metric 0 - 10 (orange)

5. Apply weighting factors (gold)

6. Total a composite score (gray)

7. Rank high to low by composite score (brown)

In this context, it doesn't seem important to me that (for ex) Ford minus Chevy Low Cost = 3 since Toyota has the highest weighted score



25250


Your thoughts??

waimea
10-09-2019, 01:03 PM
Hi Paul,

I like it very much!

I am trying to write a macro that loops through one combobox with few values to speed up the time it takes to complete the macro.

I will follow your instructions and see what I come up with.

Green is cities, yellow is metrics?

Say I have one category that has 16 possible points,

1-5 points = OK, 6-10 points = GOOD, 11-16 points = GREAT.

Is that how you would do it?

Paul_Hossler
10-09-2019, 02:46 PM
Say I have one category that has 16 possible points,

1-5 points = OK, 6-10 points = GOOD, 11-16 points = GREAT.


No sure I understand. Weighted scores would need to be numeric

Do you mean that the category has 16 different metrics, each scored (say) 1 - 10?

E.G.

Category = Demographics

1. Education level -- 1- 10 for each city
2. Household income -- 1- 10 for each city
....
16. Household size -- 1- 10 for each city

waimea
10-10-2019, 12:38 AM
Hi Paul,

thank you for your reply!

Say I have one category with one metric in it: Tax rate:

I then have these 5 intervalls:


— 30,00
31,00
32,00
33,00
34,00 —


I would then asssign different points based on the intervalls?


-30 = 10 points
31,00 = 8 points
32,00 = 6 points
33,00 = 4 points
34.00 - 2 points


Lower is better and lower value gets higher points.

How would I then weight the metric?

snb
10-10-2019, 02:40 AM
Why don't you post a realistic example file, containing data and desiterd results ?
Why so cryptic, hideous, secretive ?
Remember this is a public forum.

Paul_Hossler
10-10-2019, 05:14 AM
That's a example of a quantitative parameter

-30 = 10 points
31,00 = 8 points
32,00 = 6 points
33,00 = 4 points
34.00 - 2 points


So if the weighting factor for tax rate as .10 (out of a total of 1.00),

1) City-1 with a 33% rate would have a raw score of 4 and weighting would add 0.4 to the city's composite score
2) City-2 with a 31% rate would have a raw score of 8 and weighting would add 0.8 to the city's composite score

>>>>>> This has the built-in assumption that low tax rates are better than high tax rates <<<<<<

Population, city size (square KM), public parks (square KM), %employment etc. could be treated similarly



City Size, weighting .02

0 - 10,000 = 10 point
10,001 - 25,000 = 9 point
...
Over 10M = 1 point


1) City-1 with 20,000 population would have a raw score of 9 and weighting would add 0.18 to the city's composite score
2) City-2 with 50,000 population would have a raw score of 5 and weighting would add 0.1 to the city's composite score

>>>>>> This has the built-in assumption that small towns/villages are better than large cities <<<<<<



Subjective parameters (quality of life, education, etc.) are a little trickier



I recommend that the data and the scoring algorithm is first, and the reporting is after you're happy with the technique

I think it's easier and more understandable if all scores be 0 - 9, but 10 brackets are not necessary (i.e. 0, 2, 4, 6, and 8 as responses)

Can you provide a small but realistic sample of your data in the format that you collect it in?

Bapec
09-08-2022, 06:24 AM
There is a specific command for such operation:

Select a cell in a data range > select the Analyze Data button on the Home tab.

Let me start by saying that data analysis doesn't imply gathering data itself.
In order to relieve your analysis process, you should equip your Excel with Pivot tables, which is the most popular analytic tool. Data analysis starts with sorting data in multiple columns ascending or descending. The second step would be filtering data according to your needs:


Click on any single cell inside your data
Go to Data Tab > Sort and Filter > Filter

The third step would be COUNTIF used for counting cells, and in the fourth, the command SUMIF function provides the sum of the cell with a single condition inserted.