G05 Class
関数リスト一覧   NagLibrary Namespaceへ  ライブラリイントロダクション  本ヘルプドキュメントのchm形式版

This chapter is concerned with the generation of sequences of independent pseudorandom and quasi-random numbers from various distributions, and the generation of pseudorandom time series from specified time series models.

Syntax

C#
public static class G05
Visual Basic (Declaration)
Public NotInheritable Class G05
Visual C++
public ref class G05 abstract sealed
F#
[<AbstractClassAttribute>]
[<SealedAttribute>]
type G05 =  class end

Background to the Problems

Pseudorandom Numbers

PRNGs can be split into base generators, and distributional generators. Within the context of this document a base generator is defined as a PRNG that produces a sequence (or stream) of variates (or values) Uniformly distributed over the interval 0,1. Depending on the algorithm being considered, this interval may be open, closed or half-closed. A distribution generator is a routine that takes variates generated from a base generator and transforms them into variates from a specified distribution, for example a Uniform, Gaussian (Normal) or gamma distribution.
The period (or cycle length) of a base generator is defined as the maximum number of values that can be generated before the sequence starts to repeat. The initial state of the base generator is often called the seed.
There are five base generators currently available in the nAG Library, these are; a basic linear congruential generator (referred to as the nAG basic generator) (see Knuth (1981)), two sets of Wichmann–Hill generators (see Maclaren (1989) and Wichmann and Hill (2006)), the Mersenne Twister (see Matsumoto and Nishimura (1998)) and the ACORN generator (see Wikramaratna (1989)).

nAG Basic Generator

The nAG basic generator is a linear congruential generator (LCG) and, like all LCGs, has the form:
xi = a1 xi-1  mod  m1 , ui = xi m1 ,
where the ui, for i=1,2,, form the required sequence.
The nAG basic generator uses a1=1313 and m1=259, which gives a period of approximately 257.
This generator has been part of the nAG Library since Mark 6 and as such has been widely used. It suffers from no known problems, other than those due to the lattice structure inherent in all LCGs, and, even though the period is relatively short compared to many of the newer generators, it is sufficiently large for many practical problems.
The performance of the nAG basic generator has been analysed by the Spectral Test, see Section 3.3.4 of Knuth (1981), yielding the following results in the notation of Knuth (1981).
n νn Upper bound for νn
2 3.44×108 4.08×108
3 4.29×105 5.88×105
4 1.72×104 2.32×104
5 1.92×103 3.33×103
6 593 939
7 198 380
8 108 197
9 67 120
The right-hand column gives an upper bound for the values of νn attainable by any multiplicative congruential generator working modulo 259.
An informal interpretation of the quantities νn is that consecutive n-tuples are statistically uncorrelated to an accuracy of 1/νn. This is a theoretical result; in practice the degree of randomness is usually much greater than the above figures might support. More details are given in Knuth (1981), and in the references cited therein.
Note that the achievable accuracy drops rapidly as the number of dimensions increases. This is a property of all multiplicative congruential generators and is the reason why very long periods are needed even for samples of only a few random numbers.

Wichmann–Hill I Generator

Wichmann–Hill II Generator

This Wichmann–Hill base generator (see Wichmann and Hill (2006)) is of the same form as that described in [Wichmann–Hill I Generator], i.e., a combination of four LCGs. In this case a1=11600, m1=2147483579, a2=47003, m2=2147483543, a3=23000, m3=2147483423, a4=33000, m4=2147483123.
Unlike in the original Wichmann–Hill generator, these values are too large to carry out the calculations detailed in (1) using 32-bit integer arithmetic, however, if
wi = 11600 wi-1  mod  2147483579
then setting
Wi = 11600 wi-1  mod  185127 - 10379 wi-1 / 185127
gives
wi = Wi ​ if ​ Wi0 2147483579+Wi ​ otherwise
and Wi can be calculated in 32-bit integer arithmetic. Similar expressions exist for xi, yi and zi. The period of this generator is approximately 2121.
Further details of implementing this algorithm and its properties are given in Wichmann and Hill (2006). This paper also gives some useful guidelines on testing PRNGs.

Mersenne Twister Generator

The Mersenne Twister (see Matsumoto and Nishimura (1998)) is a twisted generalized feedback shift register generator. The algorithm underlying the Mersenne Twister is as follows:
(i) Set some arbitrary initial values x1,x2,,xr, each consisting of w bits.
(ii) Letting
A= 0 Iw-1 aw aw-1a1 ,
where Iw-1 is the w-1×w-1 identity matrix and each of the ai,i=1 to w take a value of either 0 or 1 (i.e., they can be represented as bits). Define
x i+r = x i+s x i ω : l+1 | x i+1 l:1 A ,
where x i ω : l+1 | x i+1 l:1  indicates the concatenation of the most significant (upper) w-l bits of xi and the least significant (lower) l bits of xi+1.
(iii) Perform the following operations sequentially:
z = xi+r xi+r t1 z = z z t2 ​ AND ​ m1 z = z z t3 ​ AND ​ m2 z = z z t4 u i+r = z/ 2w - 1 ,
where t1, t2, t3 and t4 are integers and m1 and m2 are bit-masks and ‘t’ and ‘t’ represent a t bit shift right and left respectively,  is bit-wise exclusively or (xor) operation and ‘AND’ is a bit-wise and operation.
The ui+r, for i=1,2,, form the required sequence. The supplied implementation of the Mersenne Twister uses the following values for the algorithmic constants:
w = 32 a = 0x9908b0 df l = 31 r = 624 s = 397 t1 = 11 t2 = 7 t3 = 15 t4 = 18 m1 = 0x9d2c5680 m2 = 0xefc60000
where the notation 0xDD  indicates the bit pattern of the integer whose hexadecimal representation is DD .
This algorithm has a period length of approximately 219,937-1 and has been shown to be uniformly distributed in 623 dimensions (see Matsumoto and Nishimura (1998)).

ACORN Generator

The ACORN generator is a special case of a multiple recursive generator (see Wikramaratna (1989) and Wikramaratna (2007)). The algorithm underlying ACORN is as follows:
(i) Choose an integer value k1.
(ii) Choose an integer value M, and an integer seed Y00, such that 0<Y00<M and Y00 and M are relatively prime.
(iii) Choose an arbitrary set of k initial integer values, Y01,Y02,,Y0k, such that 0 Y0m<M, for all m=1,2,,k.
(iv) Perform the following sequentially:
Y i m = Y i m-1 + Y i-1 m  mod  M
for m=1,2,,k.
(v) Set ui=Yik/M.
The ui, for i=1,2,, then form a pseudorandom sequence, with ui 0,1, for all i.
Although you can choose any value for k, M, Y00 and the Y0m, within the constraints mentioned in (i) to (iii) above, it is recommended that k10, M is chosen to be a large power of two with M260 and Y00 is chosen to be odd.
The period of the ACORN generator, with the modulus M equal to a power of two, and an odd value for Y00 has been shown to be an integer multiple of M (see Wikramaratna (1992)). Therefore, increasing M will give a series with a longer period.

Quasi-random Numbers

Low discrepancy (quasi-random) sequences are used in numerical integration, simulation and optimization. Like pseudorandom numbers they are uniformly distributed but they are not statistically independent, rather they are designed to give more even distribution in multidimensional space (uniformity). Therefore they are often more efficient than pseudorandom numbers in multidimensional Monte–Carlo methods.
The quasi-random number generators implemented in this chapter generate a set of points x1,x2,,xN with high uniformity in the S-dimensional unit cube IS=0,1S. One measure of the uniformity is the discrepancy which is defined as follows:
  • Given a set of points x1,x2,,xNIS and a subset GIS, define the counting function SNG as the number of points xiG. For each x=x1,x2,,xSIS, let Gx be the rectangular S-dimensional region
    G x = 0, x 1 × 0, x 2 ×× 0, x S
    with volume x1,x2,,xS. Then the discrepancy of the points x1,x2,,xN is
    DN* x1,x2,,xN = sup xIS SN Gx - N x1 , x2 , , xS .
    The discrepancy of the first N terms of such a sequence has the form
    DN*x1,x2,,xNCSlogNS+OlogNS-1  for all  N2.
    The principal aim in the construction of low-discrepancy sequences is to find sequences of points in IS with a bound of this form where the constant CS is as small as possible.

Scrambled Quasi-random Numbers

Scrambled quasi-random sequences are an extension of standard quasi-random sequences that attempt to eliminate the bias inherent in a quasi-random sequence whilst retaining the low-discrepancy properties. The use of a scrambled sequence allows error estimation of Monte–Carlo results by performing a number of iterates and computing the variance of the results.
This implementation of scrambled quasi-random sequences is based on TOMS algorithm 823 and details can be found in the accompanying paper, Hong and Hickernell (2003). Three methods of scrambling are supplied; the first a restricted form of Owen's scrambling (Owen (1995)), the second based on the method of Faure and Tezuka (2000) and the last method combines the first two.
Scrambled versions of both Sobol sequences and the Niederreiter sequence can be obtained.
The efficiency of a simulation exercise may often be increased by the use of variance reduction methods (see Morgan (1984)). It is also worth considering whether a simulation is the best approach to solving the problem. For example, low-dimensional integrals are usually more efficiently calculated by methods in D01 class rather than by Monte–Carlo integration.

Non-uniform Random Numbers

Random numbers from other distributions may be obtained from the uniform random numbers by the use of transformations and rejection techniques, and for discrete distributions, by table based methods.
(a) Transformation Methods
For a continuous random variable, if the cumulative distribution function (CDF) is Fx then for a uniform 0,1 random variate u, y=F-1u will have CDF Fx. This method is only efficient in a few simple cases such as the exponential distribution with mean μ, in which case F-1u=-μlogu. Other transformations are based on the joint distribution of several random variables. In the bivariate case, if v and w are random variates there may be a function g such that y=gv,w has the required distribution; for example, the Student's t-distribution with n degrees of freedom in which v has a Normal distribution, w has a gamma distribution and gv,w=vn/w.
(b) Rejection Methods
Rejection techniques are based on the ability to easily generate random numbers from a distribution (called the envelope) similar to the distribution required. The value from the envelope distribution is then accepted as a random number from the required distribution with a certain probability; otherwise, it is rejected and a new number is generated from the envelope distribution.
(c) Table Search Methods
For discrete distributions, if the cumulative probabilities, Pi=Probxi, are stored in a table then, given u from a uniform 0,1 distribution, the table is searched for i such that Pi-1<uPi. The returned value i will have the required distribution. The table searching can be made faster by means of an index, see Ripley (1987). The effort required to set up the table and its index may be considerable, but the methods are very efficient when many values are needed from the same distribution.

Copulas

A copula is a function that links the univariate marginal distributions with their multivariate distribution. Sklar's theorem (see Sklar (1973)) states that if f is an m-dimensional distribution function with continuous margins f1 , f2 ,, fm , then f has a unique copula representation, c, such that
f x1 , x2 ,, xm = c f1 x1 , f2 x2 ,, fm xm
The copula, c, is a multivariate uniform distribution whose dependence structure is defined by the dependence structure of the multivariate distribution f, with
c u1 , u2 ,, um = f f1-1 u1 , f2-1 u2 , , fm-1 um
where ui 0,1 . This relationship can be used to simulate variates from distributions defined by the dependence structure of one distribution and each of the marginal distributions given by another. For additional information see Nelsen (1998) or Boye (Unpublished manuscript) and the references therein.

Other Random Structures

Multiple Streams of Pseudorandom Numbers

It is often advantageous to be able to generate variates from multiple, independent, streams (or sequences) of random variates. For example when running a simulation in parallel on several processors. There are four ways of generating multiple streams using the routines available in this chapter:
(i) using different initial values (seeds);
(ii) using different generators;
(iii) skip ahead (also called block-splitting);
(iv) leap-frogging.

Multiple Streams via Different Initial Values (Seeds)

A different sequence of variates can be generated from the same base generator by initializing the generator using a different set of seeds. The statistical properties of the base generators are only guaranteed within, not between sequences. For example, two sequences generated from two different starting points may overlap if these initial values are not far enough apart. The potential for overlapping sequences is reduced if the period of the generator being used is large. In general, of the four methods for creating multiple streams described here, this is the least satisfactory.
The one exception to this is the Wichmann–Hill II generator. The Wichmann and Hill (2006) paper describes a method of generating blocks of variates, with lengths up to 290, by fixing the first three seed values of the generator (w0, x0 and y0), and setting z0 to a different value for each stream required. This is similar to the skip-ahead method described in [Multiple Streams via Skip-ahead], in that the full sequence of the Wichmann–Hill II generator is split into a number of different blocks, in this case with a fixed length of 290. But without the computationally intensive initialization usually required for the skip-ahead method.
Using different initial values is the only way of generating multiple streams using this implementation of the Mersenne Twister as the base generator. Although the extremely large period of this generator reduces the chance of the different sequences overlapping, especially if the number of sequences required is small, compared to using a generator that has a smaller period, it is not recommended that the Mersenne Twister be used if multiple streams are required.

Multiple Streams via Different Generators

Multiple Streams via Skip-ahead

Multiple Streams via Leap-frog

Skip-ahead and Leap-frog: An Example

As an illustrative example, a brief description of the algebra behind the implementation of the leap-frog and skip-ahead algorithms for a linear congruential generator (LCG) is given. A linear congruential generator has the form xi+1=a1 xi  mod  m1. The recursive nature of a LCG means that
xi+v = a1 x i+v-1  mod  m1 = a1 a1 x i+v-2  mod  m1  mod  m1 = a 1 2 x i+v-2  mod  m1 = a1v xi  mod  m1 .
The sequence can therefore be quickly advanced v places by multiplying the current state (xi) by a1v  mod  m1, hence skipping the sequence ahead. Leap-frogging can be implemented by using a1k, where k is the number of streams required, in place of a1 in the standard LCG recursive formula, in order to advance k places, rather than one, at each iteration.
In a linear congruential generator the multiplier a1 is constructed so that the generator has good statistical properties in, for example, the spectral test. When using leap-frogging to construct multiple streams this multiplier is replaced with a1k, and there is no guarantee that this new multiplier will have suitable properties especially as the value of k depends on the number of streams required and so is likely to change depending on the application. This problem can be emphasised by the lattice structure of LCGs. Similiarly, the value of a1 is often chosen such that the computation a1 xi  mod  m1 can be performed efficiently. When a1 is replaced by a1k, this is often no longer the case.
Note that, due to rounding, when using a distributional generator, a sequence generated using leap-frogging and a sequence constructed by taking every k value from a set of variates generated without leap-frogging may differ slightly. These differences should only affect the least significant digit.
In addition to the above generators a leap frog version of the Mersenne Twister is provided.

Recommendations on Choice and Use of Available Methods

Pseudorandom Numbers

Initialization

Repeated initialization

As mentioned in [Multiple Streams via Different Initial Values (Seeds)], it is important to note that the statistical properties of pseudorandom numbers are only guaranteed within sequences and not between sequences produced by the same generator. Repeated initialization will thus render the numbers obtained less rather than more independent. In a simple case there should be only one call to the state constructor and this call should be before any call to an actual generation method.

Choice of Base Generator

If multiple sequences are required, then the Wichmann–Hill II generator is a good choice, especially if a period of 290 is sufficient, in which case the method described in [Multiple Streams via Different Initial Values (Seeds)] can be used. This generator has also been shown to perform well on various test suites (see Wichmann and Hill (2006)).
When choosing a base generator, the period of the chosen generator should be borne in mind. A good rule of thumb is never to use more numbers than the square root of the period in any one experiment as the statistical properties are impaired. For closely related reasons, breaking numbers down into their bit patterns and using individual bits may also cause trouble.

Choice of Method for Generating Multiple Streams

If the Wichmann–Hill II base generator is being used, and a period of 290 is sufficient, then the method described in [Multiple Streams via Different Initial Values (Seeds)] can be used. If a different generator is used, or a longer period length is required then generating multiple streams by altering the initial values should be avoided.
Using a different generator works well if less than 277 streams are required.
Of the remaining two methods, both skip-ahead and leap-frogging use the sequence from a single generator, both guarantee that the different sequences will not overlap and both can be scaled to an arbitrary number of streams. Leap-frogging requires no a-priori knowledge about the number of variates being generated, whereas skip-ahead requires you to know (approximately) the maximum number of variates required from each stream. Skip-ahead requires no a-priori information on the number of streams required. In contrast leap-frogging requires you to know the maximum number of streams required, prior to generating the first value. Of these two, if possible, skip-ahead should be used in preference to leap-frogging. Both methods required additional computation compared with generating a single sequence, but for skip-ahead this computation occurs only at initialization. For leap-frogging additional computation is required both at initialization and during the generation of the variates. In addition, as mentioned in [Multiple Streams via Leap-frog], using leap-frogging can, in some instances, change the statistical properties of the sequences being generated.

Copulas

After calling g05rc or g05rd the G01F methods in G01 class can be used to convert the uniform marginal distributors into a different form as required.

Quasi-random Numbers

Prior to generating any quasi-random variates the generator being used must be initialized via g05yl or g05yn. Of these, g05yl can be used to initialize a standard Sobol, Faure or Niederreiter sequence and g05yn can be used to initialize a scrambled Sobol or Niederreiter sequence.
Due to the random nature of the scrambling, prior to calling the initialization routine g05yn one of the pseudorandom initialization routines, the state constructor, must be called.
Once a quasi-random generator has been initialized, using either g05yl or g05yn, one of three generation routines can be called to generate uniformly distributed sequences (g05ym), Normally distributed sequences ( (g05yj not in this release)) or sequences with a log-normal distribution ( (g05yk not in this release)). For example, for a repeatable sequence of scrambled quasi-random variates from the Normal distribution, the state constructor (for a repeatable sequence) must be called first (to initialize a pseudorandom generator), followed by g05yn (to initialize a scrambled quasi-random generator) and then (g05yj not in this release) can be called to generate the sequence from the required distribution.
Sequences from other distributions can be obtained by calling the ‘deviate’ routines supplied in G01 class on the results from g05ym. However, care should be taken when doing this as some of these ‘deviate’ routines are only accurate up to a limited number of significant figures which may effect the statistical properties of the resulting sequence of variates.

References

Inheritance Hierarchy

See Also