Bob Phillips
04-12-2009, 01:16 PM
SUMPRODUCT Basics
Overview
In its native form, SUMPRODUCT is used to multiply each element of 2 or more arrays and sum the individual products. As an example, suppose that we have the following table
A B
1 1 4
2 2 5
3 3 6
We can use
=SUMPRODUCT(A1:A3,B1:B3)
which returns 32. This is calculated as 1*4+2*5+3*6.
This can be extended to more arrays, but with either 2 or more arrays, it is relatively simple in this form. One thing to take note of at the time is that although it SUMs and it PRODUCTs, there is not a * or a + in sight, these operations are intrinsic to SUMPRODUCT. But ... notice the comma, separating the array to be multiplied (semi-colon for our continental colleagues). Remember this detail, it will be touched on again later.
SUMPRODUCT Is SUMIF
Simple so far, but SUMPRODUCT can be 'tricked' into being a conditional function, like SUMIF, but more like SUMIF on steroids. This is what we will in cover more detail in the following discussions.
Let us assume we have some data in A2:A20. We'll start by considering a simple conditional test, such as
=IF(A2="test",1,0)
We could add this formula to B2, copy down as far as B20, then simply sum B2:B20. This would count how many items in A2:A20 are equal to "test". Of course, we could do a simple COUNTIF(A2:A20,"test"), but we will ignore this as we will get much smarter over the course of the discussions.
If we enter =A2:A20="test" in B2, we get #VALUE!. But, and just trust me on this for now, we will actually have an array of TRUE/FALSE values in B2, we just don't see them. (If we array-entered this formula, we would get either TRUE or FALSE, depending on whether A2 was "test" or not - if this means nothing to you, do not worry about it for now).
Now extend this to evaluate that array with SUMPRODUCT, that is
=SUMPRODUCT(A2:A20="test"),
and we get 0, even if there are some values of "test" within the range A2:A20. So SUMPRODUCT doesn't help then? Well yes it can, but we have to do a bit more work yet.
As it stands so far, with the formula =SUMPRODUCT(A2:A20="test") we are trying to sum an array of values of TRUE and FALSE. For obvious reasons, SUM, and by inference SUMPRODUCT, likes to add numbers, so we need to force this array of TRUE and FALSE values into an array of 1s and 0s for the SUM to get a meaningful result. Why 1 and 0, why not 23 or 76 say? As we are simply trying to count the number of cells in A2:A20 equal to the value "test", adding 1 to our total where this is TRUE is effectively counting those matching cells.
Fortunately, there is an easy way to get where we want to be. Under the covers, Excel treats TRUE as a 1, and FALSE as a 0. If we perform some arithmetic operation on those values, we can coerce the array of TRUE/FALSE into an array of 1/0. One way is to multiply the resultant array by 1, that is
=SUMPRODUCT((A2:A20="test")*1)
which effectively multiples every test by 1, giving 1 or 0, and stores these values in the resultant array.
Another way is to add 0
=SUMPRODUCT((A2:A20="test")+0)
which adds 0 to every test, again giving 1 or 0, and stores these values in the resultant array.
There are many other ways, but for now we will just cover one more, the double unary.
=SUMPRODUCT(--(A2:A20="test"))
This negates every test, giving an array of -1s and 0s, and negates it again, giving an array of 1s and 0s. The single unary has the effect of coercing TRUE and FALSE to a number, the second switches the -1s to 1s, thereby not messing up our results.
Summary
We have learnt two things here. The first is that we can include a conditional test with SUMPRODUCT, and this returns an array of values that can be acted upon further. The second thing is that we can coerce the conditions into numbers that can be summed by using an arithmetic operator, such as the double unary, *1, or +0. This is the fundamental principle of all SUMPRODUCT solutions.
So far, so good, but also so simple. We have only shown how to count items matching a given condition within an array, nothing that needs this powerful beast called SUMPRODUCT ... yet.
Next time
We will extend into multiple conditions.
Feel free to make comments on this post, it will help for further posts.
Overview
In its native form, SUMPRODUCT is used to multiply each element of 2 or more arrays and sum the individual products. As an example, suppose that we have the following table
A B
1 1 4
2 2 5
3 3 6
We can use
=SUMPRODUCT(A1:A3,B1:B3)
which returns 32. This is calculated as 1*4+2*5+3*6.
This can be extended to more arrays, but with either 2 or more arrays, it is relatively simple in this form. One thing to take note of at the time is that although it SUMs and it PRODUCTs, there is not a * or a + in sight, these operations are intrinsic to SUMPRODUCT. But ... notice the comma, separating the array to be multiplied (semi-colon for our continental colleagues). Remember this detail, it will be touched on again later.
SUMPRODUCT Is SUMIF
Simple so far, but SUMPRODUCT can be 'tricked' into being a conditional function, like SUMIF, but more like SUMIF on steroids. This is what we will in cover more detail in the following discussions.
Let us assume we have some data in A2:A20. We'll start by considering a simple conditional test, such as
=IF(A2="test",1,0)
We could add this formula to B2, copy down as far as B20, then simply sum B2:B20. This would count how many items in A2:A20 are equal to "test". Of course, we could do a simple COUNTIF(A2:A20,"test"), but we will ignore this as we will get much smarter over the course of the discussions.
If we enter =A2:A20="test" in B2, we get #VALUE!. But, and just trust me on this for now, we will actually have an array of TRUE/FALSE values in B2, we just don't see them. (If we array-entered this formula, we would get either TRUE or FALSE, depending on whether A2 was "test" or not - if this means nothing to you, do not worry about it for now).
Now extend this to evaluate that array with SUMPRODUCT, that is
=SUMPRODUCT(A2:A20="test"),
and we get 0, even if there are some values of "test" within the range A2:A20. So SUMPRODUCT doesn't help then? Well yes it can, but we have to do a bit more work yet.
As it stands so far, with the formula =SUMPRODUCT(A2:A20="test") we are trying to sum an array of values of TRUE and FALSE. For obvious reasons, SUM, and by inference SUMPRODUCT, likes to add numbers, so we need to force this array of TRUE and FALSE values into an array of 1s and 0s for the SUM to get a meaningful result. Why 1 and 0, why not 23 or 76 say? As we are simply trying to count the number of cells in A2:A20 equal to the value "test", adding 1 to our total where this is TRUE is effectively counting those matching cells.
Fortunately, there is an easy way to get where we want to be. Under the covers, Excel treats TRUE as a 1, and FALSE as a 0. If we perform some arithmetic operation on those values, we can coerce the array of TRUE/FALSE into an array of 1/0. One way is to multiply the resultant array by 1, that is
=SUMPRODUCT((A2:A20="test")*1)
which effectively multiples every test by 1, giving 1 or 0, and stores these values in the resultant array.
Another way is to add 0
=SUMPRODUCT((A2:A20="test")+0)
which adds 0 to every test, again giving 1 or 0, and stores these values in the resultant array.
There are many other ways, but for now we will just cover one more, the double unary.
=SUMPRODUCT(--(A2:A20="test"))
This negates every test, giving an array of -1s and 0s, and negates it again, giving an array of 1s and 0s. The single unary has the effect of coercing TRUE and FALSE to a number, the second switches the -1s to 1s, thereby not messing up our results.
Summary
We have learnt two things here. The first is that we can include a conditional test with SUMPRODUCT, and this returns an array of values that can be acted upon further. The second thing is that we can coerce the conditions into numbers that can be summed by using an arithmetic operator, such as the double unary, *1, or +0. This is the fundamental principle of all SUMPRODUCT solutions.
So far, so good, but also so simple. We have only shown how to count items matching a given condition within an array, nothing that needs this powerful beast called SUMPRODUCT ... yet.
Next time
We will extend into multiple conditions.
Feel free to make comments on this post, it will help for further posts.