Re: Low hanging fruit - Accelerated random distribution functions
From: Paul Kienzle
Subject: Re: Low hanging fruit - Accelerated random distribution functions
Date: Fri, 23 Feb 2007 01:06:35 -0500
On Feb 22, 2007, at 7:35 PM, David Bateman wrote:
> rande, randg and randp do call the old generators with "seed". However,
> that is not enough, as these patches (credit to Paul for the hint in
> the randg help text) remove the inverse mapping that was previously
> used for a different expression of the random deviate. I don't think
> it's possible to have both the speedup and the old sequence.
On the subject of random number generator speed, I
did a comparison of various open source packages,
with the Mersenne Twister as the base generator.
U: uniform
N: standard normal
X: standard exponential
G(a): standard gamma
P(L): poisson
Time in seconds to generate 1e6 deviates:
dist     numpy   oct*    oct     R*      R     GSL
U(0,1)    0.42   0.30   ----    0.68   ----    0.33
U(-5,5)   0.42   0.48   ----    0.66   ----    0.33
N         0.87   0.38   ----    1.74   ----    1.38
X         0.78   0.31   ----    0.88   ----    0.75
G(15)     1.67   0.76   1.37    1.99   2.17    3.77
G(0.5)    2.09   1.56   2.16    2.84   2.84    3.20
P(2)      1.68   0.43   1.52    0.64   1.50    1.26
P(9)      3.45   0.48   3.34    0.73   2.15    2.49
P(13)     4.31   1.08   5.44    2.06   2.47    4.00
P(45)     3.44   1.14   6.11    1.95   2.37    5.97
P(1e6)    2.65   1.12   6.45    1.86   2.25   23.73
P(1e9)    2.82   ----   1.26    1.86   2.25   35.75
* uses a constant value for the distribution parameters rather
than forcing a reset for each sample.
The conclusion is that we are doing very well,
except that we should be using the poisson code
from R when the lambda parameter changes from
sample to sample. For the sake of maintainability,
we may want to use the R code even if it is 2x
slower in the oct* case. I'm working on
converting it.
- Paul
Notes on generators:
uniform: Octave's rand does not accept limits, so uniform(0,1)
does less work, but uniform(-5,5) must be implemented as
rand(1e6,1)*10-5.
normal: Marsaglia's ziggurat in Octave is faster than Box-Muller.
exponential: Marsaglia's ziggurat lookup table is faster
than -log(1-uniform).
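For reference, the inversion method that the ziggurat table beats is a
one-liner; a Python sketch (the function name is mine, not from any of
the packages above):

```python
import math
import random

def rande_inverse(n, rng=random.random):
    """Standard exponential deviates via inversion: X = -log(1-U).

    Using 1-U rather than U avoids log(0) when the uniform
    generator can return exactly 0.  This is the slow path that
    the ziggurat lookup table replaces."""
    return [-math.log(1.0 - rng()) for _ in range(n)]

random.seed(0)
x = rande_inverse(20000)
mean = sum(x) / len(x)   # standard exponential has mean 1
```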
gamma(a>1): Both numpy and octave use Marsaglia's
algorithm; numpy repeats the sqrt in the initialization
1e6 times, so it is slower than octave*. Numpy and octave
single do the same amount of work, so presumably the
compiler parameters determine the speed difference (this
is also seen in Poisson).
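A rough Python sketch of a Marsaglia-Tsang style rejection sampler for
a > 1 (this shows the general technique, not the actual octave/numpy
source; the function name is mine). Note that d and c depend only on a,
so they can be hoisted out of the loop when a is constant across
samples, which is the oct* case:

```python
import math
import random

def randg_marsaglia(a, rng=random):
    """Gamma(a) deviate for a > 1, Marsaglia-Tsang (2000) style:
    squeeze/rejection around d*(1 + c*Z)**3 with Z standard normal."""
    d = a - 1.0 / 3.0
    c = 1.0 / math.sqrt(9.0 * d)
    while True:
        x = rng.gauss(0.0, 1.0)
        v = (1.0 + c * x) ** 3
        if v <= 0.0:
            continue
        u = rng.random()
        # cheap squeeze test first, then the full acceptance test
        if u < 1.0 - 0.0331 * x ** 4:
            return d * v
        if math.log(u) < 0.5 * x * x + d * (1.0 - v + math.log(v)):
            return d * v

random.seed(0)
samples = [randg_marsaglia(15.0) for _ in range(20000)]
mean = sum(samples) / len(samples)   # Gamma(15) has mean 15
```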
gamma(a<1): Octave uses gamma(1+a)*uniform^(1/a), which
equals gamma(1+a)*exp(-exponential/a); with a fast gamma
and a fast exponential this is a win.
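The a < 1 boost can be checked numerically; a Python sketch using the
stdlib's gammavariate for the Gamma(1+a) draw (function name is mine):

```python
import random

def randg_small(a, rng=random):
    """Gamma(a) for 0 < a < 1 via the boost described above:
    if G ~ Gamma(1+a) and U ~ Uniform(0,1) are independent,
    then G * U**(1/a) ~ Gamma(a).  U**(1/a) = exp(-E/a) with E
    standard exponential, so a fast gamma(1+a) plus a fast
    exponential gives a fast gamma(a)."""
    return rng.gammavariate(1.0 + a, 1.0) * rng.random() ** (1.0 / a)

random.seed(0)
a = 0.5
samples = [randg_small(a) for _ in range(20000)]
mean = sum(samples) / len(samples)   # Gamma(0.5) has mean 0.5
```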
poisson(L<10): Numpy and octave single both use the direct
method of repeated multiplication of uniforms. Octave*
does a linear lookup in the Poisson CDF (large overhead
but good payoff). The higher the lambda (rate) parameter,
the more uniforms the repeated multiplication consumes.
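The direct method is a few lines; a Python sketch of Knuth's
multiplication scheme (function name is mine):

```python
import math
import random

def randp_direct(lam, rng=random.random):
    """Poisson(lam) via the direct method: multiply uniforms until
    the product drops below exp(-lam).  Each trial consumes one
    uniform, so the expected cost grows linearly with lam, which is
    why this is only competitive for small lambda."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng()
        if p <= limit:
            return k
        k += 1

random.seed(0)
samples = [randp_direct(4.0) for _ in range(20000)]
mean = sum(samples) / len(samples)   # Poisson(4) has mean 4
```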
poisson(L>12): Octave single uses the Poisson Rejection
Method from Numerical Recipes. Numpy uses the
Transformed Poisson Rejection Method (Hoermann 1992).
R cites:
  Ahrens, J.H. and Dieter, U. (1982). Computer generation of
  Poisson deviates from modified normal distributions.
  ACM Trans. Math. Software 8, 163-179.
Octave repeat uses Zechner's pprsc from Stadlober's WinRand:
  H. Zechner (1994): Efficient sampling from continuous and
  discrete unimodal distributions, Doctoral Dissertation,
  156 pp., Technical University Graz, Austria.
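For comparison, a Python sketch in the style of the Numerical Recipes
poidev rejection sampler (Lorentzian/Cauchy envelope); this is my
reconstruction of the general technique, not the Octave source:

```python
import math
import random

def randp_reject(lam, rng=random):
    """Poisson(lam) for larger lam via rejection from a Lorentzian
    envelope, in the style of the Numerical Recipes poidev routine.
    The expected number of iterations is O(1), independent of lam."""
    sq = math.sqrt(2.0 * lam)
    loglam = math.log(lam)
    g = lam * loglam - math.lgamma(lam + 1.0)
    while True:
        # draw from the Cauchy envelope, retrying until non-negative
        while True:
            y = math.tan(math.pi * rng.random())
            em = sq * y + lam
            if em >= 0.0:
                break
        em = math.floor(em)
        # ratio of target pmf to envelope; the 0.9 keeps t <= 1
        t = 0.9 * (1.0 + y * y) * math.exp(
            em * loglam - math.lgamma(em + 1.0) - g)
        if rng.random() <= t:
            return em

random.seed(0)
samples = [randp_reject(20.0) for _ in range(20000)]
mean = sum(samples) / len(samples)   # Poisson(20) has mean 20
```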
poisson(L>1e8): Octave switches over to a normal generator
/* normal approximation: from Phys. Rev. D (1994) v50 p1284 */
ret = floor(normal*sqrt(L) + L + 0.5);
if (ret < 0.0) ret = 0; /* will probably never happen */
The cutoff of L <= 1e8 for the normal approximation is based on:
> L=1e8; x=floor(linspace(0,2*L,1000));
> max(abs(normal_pdf(x,L,L)-poisson_pdf(x,L)))
ans = 1.1376e-28
For L=1e7, the max is around 1e-9, which is within the step size
of uniform. For L>1e10 the pprsc function breaks down.
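The same approximation-quality check can be reproduced in plain Python
with only the stdlib (here with a much smaller L so it runs instantly;
the 1.1e-28 figure above is for L=1e8 and is not reproduced here).
Note normal_pdf takes the variance, matching the octave-forge
convention used in the snippet above:

```python
import math

def poisson_pmf(k, lam):
    """Exact Poisson pmf via log-gamma to avoid overflow."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1.0))

def normal_pdf(x, mu, var):
    """Normal density parameterized by variance (octave-forge style)."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(
        2.0 * math.pi * var)

L = 10000.0   # small enough to check every 25th integer quickly
xs = range(0, int(2 * L), 25)
err = max(abs(normal_pdf(x, L, L) - poisson_pmf(x, L)) for x in xs)
peak = normal_pdf(L, L, L)
# the absolute error shrinks with L as the Poisson skewness 1/sqrt(L)
# vanishes, which is what justifies the switch-over for huge lambda
```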
Octave repeat (oct*) is not listed for P(1e9) because of a bug; the
corrected version should run at approximately the speed of the normal
generator, or about 0.4.
Notes on measurements:
numpy 1.0rc1
Timer('import numpy as N; N.random.ZZZ(shape,size=1000000)').timeit(10)/10
for ZZZ in uniform standard_normal standard_exponential
standard_gamma poisson
octave* 2.1.73 (octave-forge) repeat
tic; ZZZ(shape,1000000,1); toc
for ZZZ in rand randn rande randg randp
octave 2.1.73 (octave-forge)
L=shape*ones(1000000,1); tic; randZZZ(L); toc
for ZZZ in randg randp
R* 2.2.1
gc(); system.time(x<-ZZZ(1000000))[3]
for ZZZ in runif rnorm rexp rgamma(shape=shape) rpois(lambda=shape)
R 2.2.1
L<-runif(1000000)/10+shape;
gc(); system.time(x<-ZZZ(1000000,lambda=L))[3]
for ZZZ in rgamma(shape=shape) rpois(lambda=shape)
GSL 0.9.0
#include <stdlib.h>             /* for malloc/free */
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
int main(int argc, char *argv[])
{
    const gsl_rng_type *T;
    gsl_rng *r;
    int i;
    double *x;
    gsl_rng_env_setup();
    T = gsl_rng_mt19937;
    r = gsl_rng_alloc(T);
    x = malloc(1000000*sizeof(*x));
    for (i=0; i < 1000000; i++) {
        x[i] = gsl_ran_ZZZ(r,shape);
    }
    free(x);
    gsl_rng_free(r);
    return 0;
}
for ZZZ in uniform gaussian(1.0) exponential(1.0) gamma(shape,1.0)
poisson(shape)