Consulting

Results 1 to 12 of 12

Thread: Frequently bought together

  1. #1

    Frequently bought together

    Hello everyone,

    I have a large excel file that has many receipt product lines.
    This excel is formed as follows

    A | B | C
    Receipt Code | SKU Code| SKU Title
    955151151 | 205010110 | Apple Iphone 4 Black
    955151151 | 205010112 | Sony Ericsson
    955151151 | 205010117 | Nokia 4SX
    955151152 | 205010116 | Samsnung 4SX
    955151153 | 205010118 | Casio 4BM
    955151154 | 205010125 | Nokia 4SX
    955151154 | 205010112 | Sony Ericsson
    955151154 | 205010110 | Apple Iphone 4 Black

    So what I would like to extract out of the above set would be
    Sony Ericsoon (205010112) & Apple Iphone 4 Black (205010110)| 2

    So it is similar logic to what e-commerce sites do online "Frequently bought together" style.

    The data set is quite large (5000 lines) so memory buffer limits should be taken into account

    Thanks in advance

  2. #2
    Knowledge Base Approver VBAX Guru GTO's Avatar
    Joined
    Sep 2008
    Posts
    3,368
    Location
    Maybe: =SUMPRODUCT(--(B2:B9=205010112),--(C2:C9="Sony Ericsson"))

  3. #3
    No this doesnt work like that because this way you cant tell if they are in the same receipt and there are like 3000 unique products in the file

  4. #4
    Knowledge Base Approver VBAX Guru GTO's Avatar
    Joined
    Sep 2008
    Posts
    3,368
    Location
    Okay, I missed the IPhone. As to this line:
    Sony Ericsoon (205010112) & Apple Iphone 4 Black (205010110)| 2
    What does the 2 represent?

  5. #5
    2 represents that the combination of Sony Ericsoon (205010112) & Apple Iphone 4 Black (205010110) under same receipt code occurs 2 times.

  6. #6
    Knowledge Base Approver VBAX Guru GTO's Avatar
    Joined
    Sep 2008
    Posts
    3,368
    Location
    Maybe?
    [VBA]=SUM(SUMPRODUCT(--(A2:A9=955151154),--(B2:B9=205010112),--(C2:C9="Sony Ericsson")),SUMPRODUCT(--(A2:A9=955151154),--(B2:B9=205010110),--(C2:C9="Apple Iphone 4 Black")))[/VBA]

  7. #7
    Nope I think it cant be done with a formula needs to be coded on vba:S Imagine the fie has 5.000 lines and we need all posible combinations that occur more than 1 time and we dont have just this combination but many many combinations

  8. #8
    Mac Moderator VBAX Guru mikerickson's Avatar
    Joined
    May 2007
    Location
    Davis CA
    Posts
    2,778
    The OP has your data base.
    What is the question?
    Is it "what is frequently bought with an iPhone"?
    or is it "what items are frequently bought together"?

    In either case a logic like this could be used:

    Filter database to only receipts for iPhone.
    What is the second most frequently bought product from those (filtered) records (presumably "iPhone" would be most frequent)

    For question 2 one could substitute "most frequently bought product" for "iPhone" in the logic.

  9. #9
    The question is "what items are frequently bought together". And the OP has just a small portion of the database. Yep the logic of filtering is correct but I would like to know how I can automate this process since the database has many unique prdouct that needs to be examined so filtering one by one would take ages

  10. #10
    Distinguished Lord of VBAX VBAX Grand Master Bob Phillips's Avatar
    Joined
    Apr 2005
    Posts
    25,453
    Location
    A few questions

    Why isn't it ?

    Sony Ericsoon (205010112) & Apple Iphone 4 Black (205010110) & Nokia 4SX (205010117) | 3

    Do you just want each receipt listed, or all combinations under each and every receipt?
    ____________________________________________
    Nihil simul inventum est et perfectum

    Abusus non tollit usum

    Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
    James Thurber

  11. #11
    Xld you are right about this combination (there are multiple combinations not just one)

    Sony Ericsoon (205010112) & Apple Iphone 4 Black (205010110) & Nokia 4SX (205010117) | 3 (the number is not 3 thought its 2) because this combination occurs twice:

    Once with the receipt code 955151151
    Once with the receipt code 955151154

    Another combination is the one I listed
    Sony Ericsoon (205010112) & Apple Iphone 4 Black (205010110)| 2 which happens twice as well

    Once with the receipt code 955151151
    Once with the receipt code 955151154

    So in another words i need all combinations under each and every receipt

  12. #12
    Mac Moderator VBAX Guru mikerickson's Avatar
    Joined
    May 2007
    Location
    Davis CA
    Posts
    2,778
    I've been thinking about this problem.
    The size of the problem isn't how many rows of data that you have, the size of the problem is the number of individuals SKU's.

    In the attached, I added another receipt and created a chart. So that K4 holds the number of people who bought both a Nokia 4SX (E4) and an Orange (K1)

    This approach uses names.
    Name: Receipts RefersTo: =Sheet1!$A$2:$A$13
    Name: Titles RefersTo: =Sheet1!$C$2:$C$13

    Name: FilterOne RefersTo: =--(Titles=Sheet1!F$1)*Receipts
    Name: FilterTwo RefersTo: =--(Titles=Sheet1!$E2)*Receipts

    (The relative addresses in FilterOne and FilterTwo are relative to F2)

    Then the CSE formula
    =COUNT(IF(FilterOne<>0,(MATCH(FilterOne,FilterTwo,0))))

    returns the count of common receipts.
    Attached Files Attached Files

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •