Randomness versus pseudorandomness and drawing without replacement

Top  Previous  Next

Suppose I wish to pick a series of numbers from 1 to 6.

 

I could pick them at random. I could do this by rolling a true die. Every time I rolled the die, I would obtain a number from 1-6 with equal probability. The probability of obtaining a "1" would be 1/6 (approximately 0.17); the probability of obtainin a "2" would be 1/6, and so on.

 

If I rolled the die 100 times, I would expect to get roughly 17 ones, roughly 17 twos, roughly 17 threes, roughly 17 fours, roughly 17 fives, and roughly 17 sixes. It is possible that I would get 100 sixes—but very unlikely (the probability of this is 1.53 × 10–78, or one in six hundred thousand quintillion quintillion quintillion quintillion). But it is certain that I would not get exactly the same number of ones, twos, threes, fours, fives, and sixes (because you can't have six equal whole numbers adding up to 100). If I rolled the die 60 times, you'd expect about 10 of each number—but it's also pretty unlikely that I'd get exactly 10 of each number.

 

If I rolled the die like this, there is absolutely no way to predict the next number that will come up on each roll.

 

So much for randomness.

 

If I want a series of 60 numbers and I want to ensure that I end up with equal numbers of ones, twos, threes, fours, fives, and sixes, I could simply take them in a predictable order: 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6. This gives me the distribution I want (ten of each number), but the sequence is anything but random, and is highly predictable. If you know the last number that came up, you have an extremely good idea of what the next number would be.

 

So much for total predictability.

 

To obtain a reasonable combination of unpredictability and obtaining an overall equal distribution of numbers, we could use a pseudorandom technique called drawing without replacement, or draw-without-replacement, sometimes shortened to draw-w/o-replacement or just DWOR. Imagine we would like a sequence of 60 numbers, each in the range 1-6, more or less randomly, but ensuring that we get 10 of each number. We could write the number "1" on ten pieces of paper, the number "2" on ten more pieces of paper, and so on. We could then put all 60 pieces of paper in a hat, shuffle them, and draw them out in sequence, without replacing them. We'd then be guaranteed ten of each type, but the local order would be fairly random. It would not be totally random—if we've drawn out 10 ones, 10 twos, 9 threes, 10 fours, 10 fives, and 10 sixes, then we can be absolutely certain that the last number out of the hat will be a 3. However, we'd have to remember the numbers that had come out so far.

 

So it's impossible to have a completely random order and to guarantee a certain distribution—but drawing without replacement is a good way to approximate randomness while guaranteeing a certain distribution.

 

There are several ways to draw without replacement. I've just described a situation in which 10 copies of each number (1-6) are put into the hat, shuffled, and drawn individually for 60 trials. It's possible (but again extremely unlikely!) that the first ten trials would all be ones, the next ten all twos, and so on. If we wanted to guarantee that in every six consecutive trials we will see each possible digit (1-6) once, we should do something different. We should write the number "1" on one piece of paper, the number "2" on a second piece of paper, and so on, up to six pieces of paper. We should then put the six into the hat, shuffle them, and draw them out (without replacement) for the first six trials. We should then put the six pieces of paper back in the hat, shuffle, and repeat the process for the next six trials, and so on. This process gives less randomness (if you know just the first five trials in a set of six, then in principle, you can have perfect knowledge of the sixth number to be drawn) but better control over the local distribution of numbers.

 

We've just seen two examples in which a list of six numbers (1, 2, 3, 4, 5, 6) are put into a hat and drawn for 60 trials. In the first, we put ten copies of each item in the list into the hat (giving 60 pieces of paper in total) and drew them. I call this a draw-without-replacement (DWOR) multiplier of 10. In the second, we put one copy of each item in the list into the hat, drew them until we'd run out of numbers (after six trials), then put them all back into the hat. I call this a DWOR multiplier of 1.

 

We've just seen the concept of a list of possibilities, which is multiplied by a DWOR multiplier to give a set of options that are then drawn at random without replacement from a hat; when the hat is empty, we restart the process.

 

The bigger the DWOR multiplier, the closer the DWOR technique comes to total randomness (if the DWOR multiplier were infinitely large, they are exactly the same process). The smaller the DWOR multiplier, the closer the technique is to total predictability.

 

This program offers the DWOR technique as a way of selecting possible values for various parameters.