Consulting

Page 2 of 4 FirstFirst 1 2 3 4 LastLast
Results 21 to 40 of 66

Thread: Help Required : Random Sampling of Data

  1. #21
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location
    If you have a large number data points with only one or two instances of each, proportionate really isn't going to work

    Example - look at the yellow columns. There are 20 data points with 12 different values. Assuming that you want to get a proportionate random sample of 10, many of the values (like '2') need .5 of a sample data point

    That was why the lines that you deleted were there, so that if the number of occurrences of a sample value size was so small that no data of that value would be selected, I forced at least 1

    The only alternative I can think of (again this is outside my range) would be to use percentiles and proportionately sample within a percentile (the orange columns)

    Capture.JPG
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  2. #22
    VBAX Mentor
    Joined
    Nov 2020
    Location
    Cochin, Kerala
    Posts
    314
    Location
    Understood Paul.
    Yes, if the values are numbers then the samples haves to be picked based on the weightage in the ranges as per the list box

    What do you think if I do the below code to proportionate again the shortfall in samples. I hope I will get some more samples

        If n < numRecordsProportional Then
            d = numRecordsProportional - n ' shortfall
            s = WorksheetFunction.Sum(aryStrataSamples)
            For iStrata = 1 To numStrata
                If aryStrataSamples(iStrata) = 0 Then
                Else
                    aryStrataSamples(iStrata) = aryStrataSamples(iStrata) + Round(d * aryStrataSamples(iStrata) / s, 0)
                End If
            Next iStrata
        End If

  3. #23
    VBAX Mentor
    Joined
    Nov 2020
    Location
    Cochin, Kerala
    Posts
    314
    Location
    Also to consider/giving importance to amount columns in picking samples, I have added a checkbox to select the column which has numbers/amounts to pick the highest values from each numStrata for both equal and proportionate methods.

    I need to think of how this can be addressed in the existing codes. Can you help me in this
    I have made few cosmetic changes in the codes and attached the version here
    Attached Files Attached Files

  4. #24
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location
    Quote Originally Posted by anish.ms View Post
    Also to consider/giving importance to amount columns in picking samples, I have added a checkbox to select the column which has numbers/amounts to pick the highest values from each numStrata for both equal and proportionate methods.

    I need to think of how this can be addressed in the existing codes. Can you help me in this
    I have made few cosmetic changes in the codes and attached the version here
    I didn't like some of your changes, especially the part about using Selection

    Don't understand the purpose or use of Checkbox.

    Using Revenue and Sample Size = 40 with 300 Data Points, one way would be to

    1. Sort the data
    2. Break it into Sample Size = 40 bins
    3. Pick 1 random sample from each bin

    That way you get your 40 samples, and by sorting the data points with the same values will most likely be in the same bins

    In the picture then, you'd have a random pick from the Green, one from the Yellow, one from the blue, .... and one from the Gray

    Capture2.JPG


    It'll take a while to do this, but it's do-able
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  5. #25
    VBAX Mentor
    Joined
    Nov 2020
    Location
    Cochin, Kerala
    Posts
    314
    Location
    Hi Paul, thanks for your response

    I changed it to selection because finally this will go as an add-in into my excel and it is not necessary to be the data range always starts from row 1 and column 1 and the number of columns also will vary.

    I added check box to give an option to pick samples from the highest values in the strata. For example if Quarter is the selected variable and with equal/proportionate method
    With checkbox false - sample selection as per the current random method
    With checkbox true - the samples shall be picked in the order of highest to lowest values in the selected column (combx_SelHVField)

    attached another sample with 50000 rows data (normally the data range is almost 100k to 200K rows)

    Can you add your above example of 'sort and sample size bins' as a third method in addition to the current proportionate and equal methods

    Can you also check the possibility of picking samples from the highest to lowest values with reference to my check box option.
    Attached Files Attached Files
    Last edited by anish.ms; 08-02-2021 at 11:41 AM.

  6. #26
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location
    Have not forgotten

    Can you also check the possibility of picking samples from the highest to lowest values with reference to my check box option.
    Don't understand. Can you give simple example
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  7. #27
    VBAX Mentor
    Joined
    Nov 2020
    Location
    Cochin, Kerala
    Posts
    314
    Location
    Quote Originally Posted by Paul_Hossler View Post
    Have not forgotten


    Don't understand. Can you give simple example
    Screenshot 2021-08-05 080115.jpg
    As per the above screenshot if the checkbox is ticked and for example if revenue is the filed selected, then it shall pick samples in the order of highest to lowest values from revenue for each strata (Q1 to Q4). The result would be like below-
    Screenshot 2021-08-05 081125.jpg
    Same for proportionate method, if checkbox is ticked
    Attached Files Attached Files
    Last edited by anish.ms; 08-04-2021 at 08:32 PM.

  8. #28
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location
    Did not do anything with the check box

    Enter the number of bins

    Macro sorts Strata in order

    Divides sorted list into number of bins

    picks one randomly from each bins


    Capture.JPG
    Attached Files Attached Files
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  9. #29
    VBAX Mentor
    Joined
    Nov 2020
    Location
    Cochin, Kerala
    Posts
    314
    Location
    Thanks a Ton Paul!

  10. #30
    VBAX Mentor
    Joined
    Nov 2020
    Location
    Cochin, Kerala
    Posts
    314
    Location
    Hi Paul, Please find attached the latest version.
    Request your help on the checkbox option to pick highest value samples.
    Attached Files Attached Files

  11. #31
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location
    Not clear as to the results you're expecting

    With no changes, I did this ...

    Capture.JPG



    ... and got this

    Capture2.JPG


    What would be the 'Highest Values' from Facility Name?
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  12. #32
    VBAX Mentor
    Joined
    Nov 2020
    Location
    Cochin, Kerala
    Posts
    314
    Location
    Hi Paul, sorry for not making it very clear.
    combx_SelHVField was showing all column headers. I have now changed it to show only those column headers which has numeric values.
    In the above case, if the field selected is either SubAmt or TAT, then the samples should be the highest 20 values from the filed selected in each Category.
    Hope it is clear now.
    Maybe by ranking the values for each stratum and then sort and mark Yes for first 20 in each stratum
    Thanks in advance!
    Attached Files Attached Files
    Last edited by anish.ms; 08-24-2021 at 08:36 PM.

  13. #33
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location
    I marked the major changes in the code with ' <<<<<<<<<<<<<<<<<

    Capture.JPG
    Attached Files Attached Files
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  14. #34
    VBAX Mentor
    Joined
    Nov 2020
    Location
    Cochin, Kerala
    Posts
    314
    Location
    Thanks a lot Paul!
    But the samples are limited to the number of strata
    Last edited by anish.ms; 08-25-2021 at 09:19 PM.

  15. #35
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location
    The 190 is ignored

    There are 19 strata in "Facility Name" and the check box to pick highest from "SubAmt" is checked

    So the highest SubAmt from each of the 19 "Facility Name" gets the "YES"

    Capture.JPG
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  16. #36
    VBAX Mentor
    Joined
    Nov 2020
    Location
    Cochin, Kerala
    Posts
    314
    Location
    Hi Paul,
    Sorry for the confusion again.
    I wanted the sample size to be considered like how it was working previously. And if the check box is checked, it shall pick in the order of highest to lowest values instead of random pick.
    If the checkbox is unchecked, then random pick, the way it was working previously
    Thanks for you time.

  17. #37
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location
    So for example

    High Value NOT Checked

    Strata = "Facility Name" , 19 values
    option = Equal from each
    # = 10

    There are 190 total sample picked, 10 randomly from each Facility Name


    High Value Checked, pick from "SubAmt"

    Strata = "Facility Name" , 19 values

    There are 19 samples picked, each the largest SubAmt in each Facility Name
    Attached Files Attached Files
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  18. #38
    VBAX Mentor
    Joined
    Nov 2020
    Location
    Cochin, Kerala
    Posts
    314
    Location
    Hi Paul,
    Thanks for your time!

    For example

    High Value NOT Checked

    Strata = "Facility Name" , 19 values
    Option = Equal from each
    # = 10

    There are 190 total sample picked, 10 randomly from each Facility Name


    High Value Checked, pick from "SubAmt"


    Strata = "Facility Name" , 19 values
    Option = Equal from each
    # = 10

    There shall be 190 samples picked, 10 each from highest to lowest in SubAmt for each Facility Name.
    Given below the sample for facility 3

    Screenshot 2021-08-27 085039.jpg

    Similarly for the the other pick options. Hope I was able to make it clear this time.

    I think the latest version is not picking high values
    Attached Files Attached Files

  19. #39
    VBAX Sage
    Joined
    Apr 2007
    Location
    United States
    Posts
    8,711
    Location
    So if the High Values checkbox is check, then

    From the 200 "Facility1" pick 10 randomly, and THEN only selected the highest from the 10?

    Repeat for 10 from Facility2, etc. ???
    ---------------------------------------------------------------------------------------------------------------------

    Paul


    Remember: Tell us WHAT you want to do, not HOW you think you want to do it

    1. Use [CODE] ....[/CODE ] Tags for readability
    [CODE]PasteYourCodeHere[/CODE ] -- (or paste your code, select it, click [#] button)
    2. Upload an example
    Go Advanced / Attachments - Manage Attachments / Add Files / Select Files / Select the file(s) / Upload Files / Done
    3. Mark the thread as [Solved] when you have an answer
    Thread Tools (on the top right corner, above the first message)
    4. Read the Forum FAQ, especially the part about cross-posting in other forums
    http://www.vbaexpress.com/forum/faq...._new_faq_item3

  20. #40
    VBAX Mentor
    Joined
    Nov 2020
    Location
    Cochin, Kerala
    Posts
    314
    Location
    Hi Paul,
    Sorry, what is 200 here?
    No, there is no random picking if the High Values checkbox is checked. 10 highest values from the entire data range for each facility.
    Yes, for all facilities (all Strata)
    Thanks!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •