PDA

View Full Version : Solved: generate random sample without replacement and with second criteria



Angeline
01-04-2008, 07:18 PM
Hi Everyone,

Feel inclined to help a struggling biology grad student?

I need to generate a random sample without replacement with a second criteria. I need to make sure that consecutive numbers in the returned array do not equal to +1 or +2 of the number above it.

For example, if I sample without replacement 5 numbers from a list of numbers ranging between 1 and 10 then if the first number returned is 5, the second number could not be 6 or 7. And if the second number is 1, the third number could not be 5 or 1 (sampling without replacement, so not repeating any numbers returned earlier) but it could also not be 2 or 3 (1+1=2 or 1+2=3). Does this make sense?

matthewspatrick has created VBA code to sample without replacement (search "replacement" in the VBase. The posting is titled, "Function to return random sample, with or without replacement") I've attached his sample worksheet as a starting point.

Thanks so much for any help you can offer!!!

Angeline

Bob Phillips
01-05-2008, 03:20 AM
Couple of questions.

I assume you want to be able to take a n-sized sample, one time 10 items, another 15, etc.?

You example uses integer values, but the workbook has decimals. Is it still just 1 or 2 whole integers greater that are taboo?

Angeline
01-05-2008, 10:00 AM
Let's see if I can clarify:

I want to be able to select a random sample (say of size n) from a list of values (x number of values) where typically n<x. So if I understand your question correctly, then yes, I want to be able to change the size of n, manually where necessary.

The example workbook uses decimals but I will use whole integers. Consecutive numbers are taboo if they have been chosen before (no repeating numbers, also called sampling without replacement) or if they are equal to 1+ or 2+ the number before it.

Thanks again for your help!!!

Bob Phillips
01-05-2008, 10:13 AM
Here is my attempt.

Unlike matthewspatrick, my code doesn't need the target array size, the code works it out. It also doesn't need the Transpose, as the code works out whether the target array is columnar or horizontal, and transposes accordingly.

I stole matthewspatrick idea of having arguments to determine whether the numbers have to be unique, and whether a near miss (within 1 or 2) is allowed. Unlike matthewspatrick I don't default them, I prefer to be explicit.

So a typical call would be select L1:L10 and use


=SampleUDF(B2:B1201,TRUE,TRUE)

where all items most be unique, and nears are not allowed; and returns a columnar set of results.

Anothe, select N1:R1, and use


=SampleUDF(B2:B1201,FALSE,TRUE)

where all items don't have to be unique, but again nears are not allowed; and returns a horizontal set of results.

Here is the code



Public Function SampleUDF(ByVal Source As Range, ByVal Unique As Boolean, ByVal NotNear As Boolean)
Dim mpResults As Variant
Dim mpCaller As Range
Dim mpItems As Long
Dim mpSelect As Long
Dim mpRows As Long
Dim mpExtract As Variant
Dim mpValid As Boolean
Dim mpByRow As Boolean
Dim i As Long

Randomize

mpRows = Source.Rows.Count

Set mpCaller = Application.Caller
If mpCaller.Rows.Count > 1 And mpCaller.Columns.Count > 1 Then

SampleUDF = "# Must be single dimension"
Exit Function
End If

If mpCaller.Rows.Count > mpCaller.Columns.Count Then

mpItems = mpCaller.Rows.Count
mpByRow = True
Else

mpItems = mpCaller.Columns.Count
mpByRow = False
End If

ReDim mpResults(1 To mpItems)

mpSelect = Int((Rnd * mpRows) + 1)
mpResults(1) = Source.Cells(mpSelect, 1)
For i = LBound(mpResults) + 1 To mpItems

Do

mpSelect = Int((Rnd * mpRows) + 1)
mpExtract = Source.Cells(mpSelect, 1)
mpValid = ((mpExtract < mpResults(i - 1) Or mpExtract > mpResults(i - 1) + 1) Or Not NotNear) And _
(IsError(Application.Match(mpExtract, mpResults, 0)) Or Not Unique)
Loop Until mpValid

mpResults(i) = Source.Cells(mpSelect, 1)
Next i

For i = i + 1 To UBound(mpResults)
mpResults(i) = ""
Next i

If mpByRow Then

SampleUDF = Application.Transpose(mpResults)
Else

SampleUDF = mpResults
End If

End Function

Angeline
01-05-2008, 11:15 AM
Thanks for your help! This is a great start but there's seems to be a small glitch.

I just ran 200 trials of the sampleUDF function where I chose 5 numbers from a set of numbers between 1-10. In 84 of those trials, a consecutive number was 2+ the number before it. None of the trials returned a number that was 1+ the number before it so I think you're very close. Could it be somewhere in the line of code where you define a valid selection? i.e.

mpValid = ((mpExtract < mpResults(i - 1) Or mpExtract > mpResults(i - 1) + 1) Or Not NotNear) And _
(IsError(Application.Match(mpExtract, mpResults, 0)) Or Not Unique)

Out of the 200 trials, all the integers were returned in pretty much equal frequency so this is great!

Also, sorry I didn't mention this earlier (I forgot), but is it somehow possible to get the random sample to generate a new sample every time 'enter' is hit? Like the random number function does [=rand()]? I want to use this function as part of a resampling strategy where I select a set of numbers and then calculate a statistic and do this a 1,000 times.

Thanks again so much for your help!!!!!

Bob Phillips
01-05-2008, 11:56 AM
I couldn't reproduce that problem, but I guess you could change



mpValid = ((mpExtract < mpResults(i - 1) Or mpExtract > mpResults(i - 1) + 1) Or Not NotNear) And _


to



mpValid = ((mpExtract < mpResults(i - 1) Or mpExtract > mpResults(i - 1) + 2) Or Not NotNear) And _


Hit enter where? The formula is used over a block of cells, so you need to do a select and ctrl-shift-enter to get them recalculated. They will also recalculate if there is any other update on the sheet that causes a recalcuation, because of the Randomize. Or you could just do a Ctrl-Alt-F9, or add a button to do a sheet calculate.

Angeline
01-05-2008, 02:40 PM
This seems to works perfectly!

Also, I included the application, "Application.Volatile" in the function so it recalculates every time a change is made.

Thanks again for your help, xld!!:clap2: