Excel Hints

Page 1 of 2 1 2 LastLast
Results 1 to 20 of 21

Thread: SUMPRODUCT Step By Step - Part 1

  1. #1
    Distinguished Lord of VBAX VBAX Grand Master xld's Avatar
    Joined
    Apr 2005
    Posts
    24,101
    Location

    SUMPRODUCT Step By Step - Part 1

    SUMPRODUCT Basics

    Overview

    In its native form, SUMPRODUCT is used to multiply each element of 2 or more arrays and sum the individual products. As an example, suppose that we have the following table

         A      B
    1    1      4
    2    2      5
    3    3      6
    We can use

    =SUMPRODUCT(A1:A3,B1:B3)

    which returns 32. This is calculated as 1*4+2*5+3*6.

    This can be extended to more arrays, but with either 2 or more arrays, it is relatively simple in this form. One thing to take note of at the time is that although it SUMs and it PRODUCTs, there is not a * or a + in sight, these operations are intrinsic to SUMPRODUCT. But ... notice the comma, separating the array to be multiplied (semi-colon for our continental colleagues). Remember this detail, it will be touched on again later.

    SUMPRODUCT Is SUMIF

    Simple so far, but SUMPRODUCT can be 'tricked' into being a conditional function, like SUMIF, but more like SUMIF on steroids. This is what we will in cover more detail in the following discussions.

    Let us assume we have some data in A2:A20. We'll start by considering a simple conditional test, such as

    =IF(A2="test",1,0)

    We could add this formula to B2, copy down as far as B20, then simply sum B2:B20. This would count how many items in A2:A20 are equal to "test". Of course, we could do a simple COUNTIF(A2:A20,"test"), but we will ignore this as we will get much smarter over the course of the discussions.

    If we enter =A2:A20="test" in B2, we get #VALUE!. But, and just trust me on this for now, we will actually have an array of TRUE/FALSE values in B2, we just don't see them. (If we array-entered this formula, we would get either TRUE or FALSE, depending on whether A2 was "test" or not - if this means nothing to you, do not worry about it for now).

    Now extend this to evaluate that array with SUMPRODUCT, that is

    =SUMPRODUCT(A2:A20="test"),

    and we get 0, even if there are some values of "test" within the range A2:A20. So SUMPRODUCT doesn't help then? Well yes it can, but we have to do a bit more work yet.

    As it stands so far, with the formula =SUMPRODUCT(A2:A20="test") we are trying to sum an array of values of TRUE and FALSE. For obvious reasons, SUM, and by inference SUMPRODUCT, likes to add numbers, so we need to force this array of TRUE and FALSE values into an array of 1s and 0s for the SUM to get a meaningful result. Why 1 and 0, why not 23 or 76 say? As we are simply trying to count the number of cells in A2:A20 equal to the value "test", adding 1 to our total where this is TRUE is effectively counting those matching cells.

    Fortunately, there is an easy way to get where we want to be. Under the covers, Excel treats TRUE as a 1, and FALSE as a 0. If we perform some arithmetic operation on those values, we can coerce the array of TRUE/FALSE into an array of 1/0. One way is to multiply the resultant array by 1, that is

    =SUMPRODUCT((A2:A20="test")*1)

    which effectively multiples every test by 1, giving 1 or 0, and stores these values in the resultant array.

    Another way is to add 0

    =SUMPRODUCT((A2:A20="test")+0)

    which adds 0 to every test, again giving 1 or 0, and stores these values in the resultant array.

    There are many other ways, but for now we will just cover one more, the double unary.

    =SUMPRODUCT(--(A2:A20="test"))

    This negates every test, giving an array of -1s and 0s, and negates it again, giving an array of 1s and 0s. The single unary has the effect of coercing TRUE and FALSE to a number, the second switches the -1s to 1s, thereby not messing up our results.

    Summary

    We have learnt two things here. The first is that we can include a conditional test with SUMPRODUCT, and this returns an array of values that can be acted upon further. The second thing is that we can coerce the conditions into numbers that can be summed by using an arithmetic operator, such as the double unary, *1, or +0. This is the fundamental principle of all SUMPRODUCT solutions.

    So far, so good, but also so simple. We have only shown how to count items matching a given condition within an array, nothing that needs this powerful beast called SUMPRODUCT ... yet.

    Next time

    We will extend into multiple conditions.


    Feel free to make comments on this post, it will help for further posts.
    Last edited by xld; 05-07-2009 at 05:37 AM.
    ____________________________________________
    Nihil simul inventum est et perfectum

    Abusus non tollit usum

    Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
    James Thurber

  2. #2
    Bob, excellent start and so simple to absorb!
    Quote Originally Posted by xld
    =SUMPRODUCT(--(A2:A20="test"))

    This negates every test, giving an array of -1s and 0s, and negates it again, giving an array of 1s and 0s. The single unary has the effect of coercing TRUE and FALSE to a number, the second switches the -1s to 1s, thereby not messing up our results.
    Now i understand the use of the unary, can you tell me, as we are SUMming an array for an instance of text does that mean we have no use for the unary if SUMming numbers?, if we do use the unary (i assume we do really as you explained it changes the results from true/false to 0's & 1's) what is it's effect on numbers say if we were looking for -1 etc?
    Regards,
    Simon
    Please read this before cross posting!
    In the unlikely event you didn't get your answer here try Microsoft Office Discussion @ The Code Cage
    If I have seen further it is by standing on the shoulders of giants.
    Isaac Newton, Letter to Robert Hooke, February 5, 1675 English mathematician & physicist (1642 - 1727)

  3. #3
    Distinguished Lord of VBAX VBAX Grand Master xld's Avatar
    Joined
    Apr 2005
    Posts
    24,101
    Location
    Simon,

    In the examples shown so far, the coercion was/is needed on the conditional test. As such, it is not material whether the test is for text or for numbers. SO if A2:A20 contained numbers, and we wanted to count those > 10, we would still use some coercion operator, such as

    =SUMPRODUCT(--(A2:A2>10))

    Is that what you were asking?
    ____________________________________________
    Nihil simul inventum est et perfectum

    Abusus non tollit usum

    Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
    James Thurber

  4. #4
    Quote Originally Posted by xld
    Simon,

    In the examples shown so far, the coercion was/is needed on the conditional test. As such, it is not material whether the test is for text or for numbers. SO if A2:A20 contained numbers, and we wanted to count those > 10, we would still use some coercion operator, such as

    =SUMPRODUCT(--(A2:A2>10))

    Is that what you were asking?
    In a way yes (although i pretty much new the answer before i posted), however i was more interested as to whether the coercion was affected by the numbers that were already -1, 1 and 0.

    Like:
    =SUMPRODUCT(--(A2:A2=-1))
    Regards,
    Simon
    Please read this before cross posting!
    In the unlikely event you didn't get your answer here try Microsoft Office Discussion @ The Code Cage
    If I have seen further it is by standing on the shoulders of giants.
    Isaac Newton, Letter to Robert Hooke, February 5, 1675 English mathematician & physicist (1642 - 1727)

  5. #5
    Distinguished Lord of VBAX VBAX Grand Master xld's Avatar
    Joined
    Apr 2005
    Posts
    24,101
    Location
    No, it is not affected, because the conditional test still returns TRUE or FALSE, EVEN when the values being tested for are -1, 1, or 0. It is the test that needs coercing, not the values (this is a common misconception that you often see in SP formulae, like so

    =SUMPRODUCT(--(A2:A20="x"),--(B2:B20="y"),--(C2:C20))

    But that will be covered more in part 2 or 3.
    ____________________________________________
    Nihil simul inventum est et perfectum

    Abusus non tollit usum

    Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
    James Thurber

  6. #6
    Thanks for the detailed explanation!
    Regards,
    Simon
    Please read this before cross posting!
    In the unlikely event you didn't get your answer here try Microsoft Office Discussion @ The Code Cage
    If I have seen further it is by standing on the shoulders of giants.
    Isaac Newton, Letter to Robert Hooke, February 5, 1675 English mathematician & physicist (1642 - 1727)

  7. #7
    Site Admin
    Urban Myth
    VBAX Guru
    Joined
    May 2004
    Location
    Oregon, United States
    Posts
    4,842
    Location
    The way I always looked at it, Simon, was if the array in question was a condition of mine, i.e. it would return a boolean result, it would need coercion. If it needs coercion, you need either the *1, +0 or --. Thus if the array doesn't need coercion, then no mathematical coercion is necessary. Not sure if that helps..

  8. #8
    Thanks Zack (oops Lawrence!), i understood we could do without using the unary i was just curious to find out if the coercion of the number -1 to -1 by the first unary caused a problem of any kind....evidently not, but i do thank you for your take on the quandry
    Regards,
    Simon
    Please read this before cross posting!
    In the unlikely event you didn't get your answer here try Microsoft Office Discussion @ The Code Cage
    If I have seen further it is by standing on the shoulders of giants.
    Isaac Newton, Letter to Robert Hooke, February 5, 1675 English mathematician & physicist (1642 - 1727)

  9. #9

    Numeric Value of TRUE

    Quote Originally Posted by xld
    Under the covers, Excel treats TRUE as a 1, and FALSE as a 0. If we perform some arithmetic operation on those values, we can coerce the array of TRUE/FALSE into an array of 1/0. One way is to multiply the resultant array by 1, that is

    =SUMPRODUCT((A2:A20="test")*1)

    which effectively multiples every test by 1, giving 1 or 0, and stores these values in the resultant array.
    How sensible, but I always thought that the numeric values of TRUE and FALSE were 0 and -1 respectively. How did I manage such a basic misconception? Where within Micorsoft's help system does it provide such vital information?

  10. #10
    Quote Originally Posted by swaatacba
    How sensible, but I always thought that the numeric values of TRUE and FALSE were 0 and -1 respectively. How did I manage such a basic misconception? Where within Micorsoft's help system does it provide such vital information?

    About SUMPRODUCT() ? or about the numeric values of true and false? If the former, nowhere. If the latter it's just standard, not MS specific. See.

  11. #11
    Site Admin
    Urban Myth
    VBAX Guru
    Joined
    May 2004
    Location
    Oregon, United States
    Posts
    4,842
    Location
    Of course, when dealing with VBA it's different as well. False is still = 0, but True returns to -1. This should show it (in your Immediate Window):
    [vba]?"True = " & clng(true) & ", False = " & clng(false)[/vba]

    The bum part about Microsoft's SUMPRODUCT() (of course a reason for this forum) is that formula was never really meant to do what its main uses are for. Part of the beauty of keeping things open-ended, even if it is unintentional.

  12. #12
    Distinguished Lord of VBAX VBAX Grand Master xld's Avatar
    Joined
    Apr 2005
    Posts
    24,101
    Location
    Quote Originally Posted by swaatacba
    How sensible, but I always thought that the numeric values of TRUE and FALSE were 0 and -1 respectively. How did I manage such a basic misconception? Where within Micorsoft's help system does it provide such vital information?
    In most computer languages (all of the ones that I know), including VBA, TRUE is -1, but Excel treats it as 1 for some reason. Actually it treats any non-zero value as TRUE, but an actual TRUE can be coerced to 1.
    ____________________________________________
    Nihil simul inventum est et perfectum

    Abusus non tollit usum

    Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
    James Thurber

  13. #13
    Distinguished Lord of VBAX VBAX Grand Master xld's Avatar
    Joined
    Apr 2005
    Posts
    24,101
    Location
    Quote Originally Posted by swaatacba
    Where within Micorsoft's help system does it provide such vital information?
    It doesn't, nor does it explain how to use SUMPRODUCT in this way.

    I tell this story often, but it is enlihtening I feel. I was in Redmond at Microsoft one year, and I met up with an Excel help technical writer. This guy had no idea that you could use SUMPRODUCT in this way, it was all news to him.
    ____________________________________________
    Nihil simul inventum est et perfectum

    Abusus non tollit usum

    Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
    James Thurber

  14. #14
    Distinguished Lord of VBAX VBAX Grand Master xld's Avatar
    Joined
    Apr 2005
    Posts
    24,101
    Location
    Quote Originally Posted by Zack Barresse
    The bum part about Microsoft's SUMPRODUCT() (of course a reason for this forum) is that formula was never really meant to do what its main uses are for.
    I don't think that is bum at all. If it had been designed to do (some of these things) you can bet your life it would be far more limited than it is now (just like SUMIF is) precisely because constraints would be designed in.

    And of course, it is not actually anything about SUMPRODUCT per se, all of these things (and more) can be done with array SUM formulae.
    ____________________________________________
    Nihil simul inventum est et perfectum

    Abusus non tollit usum

    Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
    James Thurber

  15. #15
    Site Admin
    Urban Myth
    VBAX Guru
    Joined
    May 2004
    Location
    Oregon, United States
    Posts
    4,842
    Location
    True, it is quite helpful, didn't mean it in any derogatory way.

  16. #16
    Distinguished Lord of VBAX VBAX Grand Master xld's Avatar
    Joined
    Apr 2005
    Posts
    24,101
    Location
    Quote Originally Posted by Zack Barresse
    True, it is quite helpful, didn't mean it in any derogatory way.
    I should hope so too, you know you can't diss SP with me
    ____________________________________________
    Nihil simul inventum est et perfectum

    Abusus non tollit usum

    Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
    James Thurber

  17. #17
    Site Admin
    Urban Myth
    VBAX Guru
    Joined
    May 2004
    Location
    Oregon, United States
    Posts
    4,842
    Location
    Bob, I would never!

  18. #18
    Heh this is cool stuff. Well written guide but I can see it being a confusing subject for complete newbs to this topic. Jumping to the next thread. Hope I'll have more to say there

  19. #19

    Breakthrough!

    Seriously, the best SUMPRODUCT guide I have seen. Totally had an "ah ha" moment while sitting on the loo.

    Well done boys.

  20. #20
    Distinguished Lord of VBAX VBAX Grand Master xld's Avatar
    Joined
    Apr 2005
    Posts
    24,101
    Location
    I hope you didn't take your computer in there.
    ____________________________________________
    Nihil simul inventum est et perfectum

    Abusus non tollit usum

    Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
    James Thurber

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •