Some Thoughts on Random Number Generation
Posted: Thursday, October 18, 2007
by Wilfred DeVoe
Suppose we wanted to select a number at random, say, from an interval of 1 to 27 inclusive. How could we go about it in a computer program?
For some languages it's surprisingly easy; tools for random number selection exist in these languages. If we wanted to "pick a card" from 1 to 27 in BASIC, or rather the DOS-based variant QBASIC, we would use
LET NRANDOM = INT(RND * 27) + 1
Since QBASIC is a loose, weakly-typed language, NRANDOM is defined "on the fly" and its type determined by what is assigned to it.
To do the same in C++ we'd write
srand((unsigned)time(NULL));
int nrandom = rand() % 27;
In both cases only two lines are needed: one to appropriate the system clock, and one to apply a random number function with the constraints of the language and our chosen limit of 27.
When we need a random number generated by the computer, we have to include a random or quasi-random input parameter; customarily, programmers use the computer's clock.
Java is syntactically similar to C/C++, so we'd expect it to have a very similar way of handling random numbers. However, Java has an object class called Random, which is accessed by
import java.util.Random;
Cay Horstmann¹ says that to get random numbers we'd construct an object of the Random class, then use one of these methods:
nextInt(n) , which returns a random integer between 0 (inclusive) and n (exclusive), and
nextDouble() , which returns a random floating point number between 0 (inclusive) and 1 (exclusive).
Horstmann gives an example with the cast of a die:
Random generator = new Random();
int d = 1 + generator.nextInt(6);
generator.nextInt(6) yields a random number between 0 and 5 inclusive, which we'd correct by adding 1 to get the interval 1 to 6 . If we wanted to get a number between 1 and 27 , we'd substitute 27 for 6 in nextInt() .
Random number functions are available in parametric languages such as SPSS (Statistical Package for the Social Sciences) and SAS (Statistical Analysis System). Usually statistics such as Mean and Standard Deviation are invoked using number parameters or other brief commands. In SPSS code written directly, the only randomizer of which I'm aware is the SAMPLE command, followed by a number which, if memory serves me correctly, is the number of samples to be extracted at random from a list of data records. In that case, the only way we could get a random number between 1 and 27 is to have a file of 27 records each containing only a unique number between 1 and 27 inclusive, and using the SAMPLE command followed by 1. Nowadays, social scientists favor SPSS for Windows, which puts SPSS code "under the hood." As it's a proprietary language with an aggressively-enforced licensing system, I'm afraid to illustrate the graphic user interface from the package, so we'll have to use our imagination for this: a viewer window is invoked along with a Data Editor window that asks for a data source. The viewer window gives program system responses, the Data Editor window lets us choose our analysis. It's possible to enter data in a spreadsheet frame on the Data Editor, and for our purposes it's necessary to create a list of numbers 1.0 to 27.0. From the Data Editor's pulldown menu we can select
Transform-> Random Number Seed
That gives the option of manually choosing a seed value, or allowing the computer to choose it at random.
Transform-> Compute
calls a dialog with a spinner box from which we can choose a function from a long list. For random number selection we'd select
UNIFORM(max) where max can be set to 27, the higher limit of an interval starting at 0 (there's a function named RND , but its name is misleading; it stands for "Round," not "Random."). If the package has been properly licensed, the "OK" button won't be disabled and we may get our random number.
But what can we do when the language we use doesn't have built-in randomize functions? For a programmer who's an astute mathematician it might be a minor inconvenience but not a major roadblock. For others there are algorithms available in some reference materials. Michel Boillot wrote a Fortran textbook² containing a linear congruential algorithm that can be used to write random number generators. In the linear congruential method, a pseudo-random number is generated using an input "seed" value (everywhere I look, it seems, authors use the word " seed"). The basic formula is
X N + 1 = (A x XN + C) MOD M
The value X , as a result of the computations, is always a real number between 0 and 1.
I wrote a Fortran subroutine that uses the linear congruential algorithm:
SUBROUTINE IRNDSB(IFIRST,ILAST,IRNDM,ISEED)
DATA IA,IM,IC/13077,32768,6925/
ISEED=MOD((ISEED*IA+IC),IM)
X=FLOAT(ISEED)/FLOAT(IM)
IRNDM=X*(ILAST-IFIRST+1)+IFIRST
RETURN
END
IFIRST and ILAST represent the endpoints of the interval from which the routine will select the number at random; these variables make the routine general enough to accept any endpoints, not just 1 and 27. IRNDM is the random number returned by the routine. ISEED is the input used. In Fortran every integer variable is prefixed by " I ." The numeric constants 13077 , 32768 , and 6925 were not clearly explained by Boillot to my recollection, but I recognize them as powers of two, minus one.
I'd also written a COBOL version, and not to disparage COBOL (much of my field experience is in it), but that language was not created for brevity, and the COBOL version is rather wordy.
In later editions of his textbook³ Boillot wrote his own Fortran function RAND() , which returns real random numbers between 0 and 1. The calling program uses it to yield an integer between 1 and 6, as below:
NUMBR = RAND(1)*6.0 + 1.0
.
.
.
FUNCTION RAND(K)
INTEGER IM, IB, IA
DATA IM, IB, IA/ 24211, 32767, 19727/
IA = MOD(IM*IA, IB)
RAND = FLOAT(IA)/FLOAT(IB)
RETURN
END
Again substitute 27.0 for 6.0 and we have yet another way of getting the random number we wanted.
This may all be a trivial exercise for some gifted programmers, but it's a nice find for many who need it.
¹ Horstmann, Cay, Big Java (2002, John Wiley & Sons, Inc., pp. 262-263)
² Boillot, Michel, Understanding Fortran77 with Structured Problem Solving (1978 (?), West Publishing Company, St. Paul, New York, Los Angeles, San Francisco)
³ Boillot, Michel, Understanding Fortran77 with Structured Problem Solving (1984, 1987, publisher as above, p. 239)
This Article has been viewed 2,257 times. (Not updated in real-time.)
Top-level comments on this article: (5 total)Wildred an interesting read, thanks for sharing. The only problem with basing random number generation on the clock is that it isn't truely random. If the seed is the same (ie the time is exactly the same) for two itterations then will you not get the same result? I understand it's a bit of an art in itself creating a number generator that looks random since computer programs are inherently ordered. It's my understanding that most random generators, given the same starting point will show the same sequence. And yes I profess to having only limited programming skills. Might not a combination of factors work better such as the PC Clock combined with some other thread/stack/process specific number? Thus given the same exact starting time you get still get a different result. Or is that overly simplistic? Regards, Ben.
To Ben Jones: Yes, you'll get the same result if the sample time is the same value as in a previous computation. This I found out when testing the Fortran subroutine, which I called from within a COBOL program. The system time obtained by the COBOL program was frozen at the time the program began executing. I wasn't sure this would happen with a C++ program running in Windows, so I wrote and ran a mini-program to test the C++ version. I took and displayed eight samples; they all had identical results. When testing the Fortran subroutine on a mainframe I had a SAS (Statistical Analysis System) program produce a list of numbers 1-27 displayed randomly; when I repeated the command, the resulting list was identical to what was displayed previously. I guess these random number generators are one-shot deals. Maybe a steady stream of randomly-produced numbers could occur if the program were fed a steady stream of data from whatever source that could be hashed at intervals to give ever-changing "seed" input to a random number generator. Inputting data from an outside datasource is a non-trivial operation in any language; you have to decide how much overhead you're willing to incur. A Psychology professor showed his class a book that was a table of random numbers. They were called pseudo-random numbers, purposely selected to have a correlation with one another of exactly zero. That sort of phenomenon does not occur in Nature; instead we have areas of variable randomness and pockets of apparent order -- what mathematicians call chaos. "Chaos systems" are a fascinating area of study to some mathematicians. Thank you for your response. Wilfred D. DeVoe
It gave me some idea of how to Create Unique Random number generation.
thanks
I've discovered something that makes me blush.In the body of my article I mention writing a random number generator in COBOL. I'd implemented a linear congruential algorithm explicitly in COBOL, though I left the details out of the article. It turns out, though, that COBOL does have its own random number function. I'm surprised that none of my readers pointed it out. I'll quote from IBM's own COBOL Language Reference:7.1.38 RANDOMThe RANDOM function returns a numeric value that is a pseudo-random number from a rectangular distribution.The function type is numeric.+--- Format -------------------------------------------------------------+| || >>--FUNCTION RANDOM------------------------------------>< || +-(ARGUMENT-1)-+ || |+---------------------------------------------------------------------------+argument-1If argument-1 is specified. it must be zero or a positive integer, up to and including (10**18)-1 which is the maximum value that can be specified in a PIC9(18) fixed item; however, only those in the range from zero up to and including 2,147,483,645 will yield a distinct sequence of pseudo-random numbers.If a subsequent reference specifies argument-1, a new sequence of pseudo-random numbers is started.If the first reference to this function in the run unit does not specify argument-1, the seed value used will be zero.In each case, subsequent references without specifying argument-1 return the next number in the current sequence.The returned value is exclusively between zero and one.For a given seed value, the sequence of pseudo-random numbers will always be the same._Copyright IBM Corp. 1991, 19967.1.38 - 1
I found a previously-lost reference for this article, so I can now properly credit the source.
Under superscript 2, the earlier Fortran textbook by Michel Boillot:
Boillot, Michel, Understanding Fortran (1978, West Publishing Company, St. Paul, pp. 372-375)
I try to give accurate references wherever possible (it's just good scholarship).
We want your comments! If you can read this, you don't have javascript enabled, so you can't use this comment system. Please enable javascript.
