
Re: Low hanging fruit - Accelerated random distribution functions


From: Daniel J Sebald
Subject: Re: Low hanging fruit - Accelerated random distribution functions
Date: Fri, 23 Feb 2007 17:47:57 -0600
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041020

David Bateman wrote:
Daniel J Sebald wrote:

David Bateman wrote:

Paul Kienzle wrote:


lognrnd,

Current code can be improved a bit from:

rnd(k) = exp(mu(k)) .* exp(sigma(k).*randn(1,length(k)));

to:

rnd(k) = exp(mu(k) + sigma(k).*randn(1,length(k)));


This will only be a win in the case where mu or sigma are a matrix.

It's still a win even for scalar mu and sigma, isn't it?  A pair of
exp and a multiply loses to an add and a single exp.  The extra exp is
significant; probably the overhead of the Octave interpreter masks it.
Anyway...

But it's an exp of a scalar plus a scalar multiply, which is
insignificant.  That is why I couldn't get an advantage.

It's the principle, though.  exp() is a function call (probably a recursive
approximation) while +/-/* are single machine instructions.
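
For vector arguments the difference should be measurable, though.  A quick
sketch of how to compare the two arrangements (my own test code, not part of
the patch):

   n = 1e6;
   mu = ones (1, n);
   sigma = ones (1, n);
   z = randn (1, n);
   t = cputime (); y1 = exp (mu) .* exp (sigma .* z); cputime () - t
   t = cputime (); y2 = exp (mu + sigma .* z); cputime () - t

The first form calls exp() over the whole vector twice; the second only once.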


Why the restriction of mu > 0?  The underlying normally distributed
r.v. could have a negative mean with no consequence.  Naturally, sigma
needs to be greater than 0 from a definition standpoint.  (And I wonder
about the > 0 part, because if the variance is zero, that simply means
the variable is a constant, exp(mu).)  And then even if the user
specifies a sigma less than zero, it is effectively still a log normal
random variable.  I.e., flip the sign of

sigma(k) .* randn (1, length(k))

and it is an r.v. with the same distribution.  (OK, may still want to
take out the sigma < 0 case.)
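
A quick empirical check of that sign-flip claim (my own sketch, not from the
patch; since -randn() and randn() are identically distributed, the two
samples below should have nearly matching statistics):

   z = randn (1, 1e6);
   y1 = exp (1 + 0.5 .* z);     % sigma =  0.5
   y2 = exp (1 - 0.5 .* z);     % sigma = -0.5
   [mean(y1), mean(y2)]         % both near exp(1 + 0.5^2/2), about 3.08
   [std(y1), std(y2)]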

Also, I wonder if the special conditioning is necessary.  If mu = Inf,
exp(Inf) = Inf.  If mu = -Inf, exp(-Inf) = 0.  How about avoiding the
use of the index?  In most cases I'd think that the user will have no
pathological cases.  I.e.,

   rnd = exp (mu + sigma .* randn (1, length (sigma)));
   k = find ((sigma < 0) | (sigma == Inf));
   if (any (k))
     rnd(k) = NaN;
   endif

instead of

   k = find ((mu > 0) & (mu < Inf) & (sigma > 0) & (sigma < Inf));
   if (any (k))
     rnd(k) = exp (mu(k) + sigma(k) .* randn (1, length (k)));
   endif

Does that help any for when k is the full array?
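
That full-array question is easy to test directly.  A sketch of the
comparison (my own, hypothetical test code):

   n = 160000;
   mu = ones (1, n);
   sigma = ones (1, n);
   k = 1:n;
   rnd = zeros (1, n);
   t = cputime (); rnd(k) = exp (mu(k) + sigma(k) .* randn (1, length (k))); cputime () - t
   t = cputime (); rnd = exp (mu + sigma .* randn (1, n)); cputime () - t

If the indexed assignment is the expensive part, the second form should win
even when k covers every element.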


The restrictions are reproduced from what was in the original code for
geometric_rnd.

Oh, well, geometric and log normal both have positive support (one's discrete,
the other continuous).  Quite different otherwise.  That explains things.


Looking at what the Matlab code does, it restricts sigma > 1 and has no
restrictions on mu.  So yes, I think you are right and we should remove
the restriction.

Sigma > 1?  There's no reason sigma can't be in (0,1].


Yes, I agree that some of the special casing for NaN values might be
simplified, especially if the underlying operators return NaN for these
cases.  I didn't bother doing this at this point, as I wanted the gains and
simplifications with the minimum work.  I incorporated your ideas in the
attached version of the patch.  However, similar things should be done
throughout the *rnd.m functions (while respecting the NaN values), which
I don't propose to do at this time.  Want to propose a patch?

Rather than tic/toc, use cputime().  tic/toc is too susceptible to other system 
processes.  Also, be sure to run the routine once before taking the 
measurement.  (First use is slower.)
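
A small helper along these lines can automate the warm-up and averaging (my
own sketch; the time_rnd name is made up):

   function t = time_rnd (f, trials)
     f ();                       % discard the first (cold) run
     t = zeros (1, trials);
     for i = 1:trials
       t0 = cputime ();
       f ();
       t(i) = cputime () - t0;
     endfor
     t = mean (t);
   endfunction

E.g., time_rnd (@() lognrnd (1, 1, 400), 5) would do in one call what the
transcript below does by hand.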

octave:1> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.20397
octave:2> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.17997
octave:3> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.18097
octave:4> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.17697
octave:5> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.17897
octave:6> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.17997
octave:7> before_time = mean([0.17997 0.18097 0.17697 0.17897 0.17997])
before_time =  0.17937
octave:8> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.19397
octave:9> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.15598
octave:10> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.15698
octave:11> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.15698
octave:12> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.15598
octave:13> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.15598
octave:14> after_time = mean([0.15598 0.15698 0.15698 0.15598 0.15598])
after_time =  0.15638
octave:15> percent = 100*(after_time-before_time)/before_time
percent = -12.817

So, a 13% improvement for a 400x400 array in the scalar mu/sigma case.  I 
suppose that's worth the change.  In any case, I think the change in limits is 
appropriate.  
Keep in mind that the above are results for a change to exp(# + #) and the 
find() test.  If I change the exp(# + #) back to exp(#)*exp(#), I get:

octave:52> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.19197
octave:53> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.15698
octave:54> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.15698
octave:55> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.15998
octave:56> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.15698
octave:57> t=cputime(); y=lognrnd(1,1,400); cputime()-t
ans =  0.15897
octave:58> after_time_2 = mean([0.15698 0.15698 0.15998 0.15698 0.15897])
after_time_2 =  0.15798
octave:59> percent = 100*(after_time_2-before_time)/before_time
percent = -11.926

So that would suggest 1 percent due to the math arrangement, but as David 
explained exp(mu) is a simple scalar so there is really no difference between 
the two.  So let's put the find() back in:

   elseif find ((mu > 0) & (mu < Inf) & (sigma > 0) & (sigma < Inf));

octave:70> after_time_3 = mean([0.17997 0.17997 0.18197 0.18097 0.18097])
after_time_3 =  0.18077
octave:71> percent = 100*(after_time_3-before_time)/before_time
percent =  0.78051

Yes, so it seems that the "find()", on scalars no less, is the main source of 
cpu consumption.  I don't understand that one.

Let's look at indexing.  Changing mu/sigma to matrices... oops, I'm getting a 
bug here with the CVS version:

t=cputime(); y=lognrnd(mu,sigma,400,400); cputime()-t
error: product: nonconformant arguments (op1 is 160000x1, op2 is 1x160000)
error: evaluating binary operator `.*' near line 97, column 45

The documentation says "Both MU and SIGMA must be scalar or of size R by C." so 
I assume what I wrote above is OK.
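
My guess at the cause, without having checked the CVS source: somewhere the
indexed branch ends up with a column-shaped mu/sigma meeting the row-shaped
output of randn(), and in that case .* is nonconformant:

   a = ones (160000, 1);      % column, like op1 in the error message
   b = randn (1, 160000);     % row, like op2
   a .* b                     % errors here (pre-broadcasting Octave):
                              % nonconformant arguments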

Dave's version works:

octave:81> after_time_4 = mean([0.26196 0.26196 0.26396 0.26396 0.26396])
after_time_4 =  0.26316

The size of y is 400x400 as I'd expect.  Also, y=lognrnd(mu,-1*sigma,400,400); works as 
expected.  (However, there is a semicolon missing from the end of "rnd(k) = 
NaN".)

Since CVS doesn't work, let me change to
octave:92> sigma = ones(1,160000);
octave:93> mu = ones(1,160000);

octave:101> before_time = mean([0.79288 0.78888 0.79088 0.79388 0.79288])
before_time =  0.79188

and with the patch from Dave:

after_time_5 =  0.26456
octave:109> percent = 100*(after_time_5-before_time)/before_time
percent = -66.591

The removal of (mu > 0) & (mu < Inf) gives

octave:139> before_time = mean([0.69689 0.69190 0.68989 0.69089 0.69089])
before_time =  0.69209
octave:140> after_time_6
after_time_6 =  0.65510
octave:141> percent = 100*(after_time_6-before_time)/before_time
percent = -5.3450

And I see now what Dave was saying about scalar/matrix.  The change from

   rnd = exp (mu) .* exp (sigma .* randn (sz));

to

   rnd = exp (mu + sigma .* randn (sz));

accounts for about 8 percent by my rough calculations.

In summary, indexing seems to have the most effect (55 percent?), the mu tests 
roughly 5-6 percent, and the math restructuring 8 to 10 percent.  Of course, 
the most common use by the user will be scalar mu and sigma.

Dan

