octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: nchoosek warning, checks and tests


From: Francesco Potortì
Subject: Re: nchoosek warning, checks and tests
Date: Wed, 10 Dec 2008 16:56:16 +0100

>> Discussion:
...
>> - log of sum is used instead of product so that overflow is further away
...
>>   if (n == 1)
>> -    if (k > v/2)
>> -      k = v - k;
>> +    k = min (k, v-k);          # improve precision at next step
>> +    A = round (exp (sum (log (v-k+1:v)) - sum (log (2:k))));
>> +    if (A > 1/(exp (2*k*eps) - 1))
>> +      warning ("nchoosek", "nchoosek: possible loss of precision");
>>     endif
>> -    A = round (prod ((v-k+1:v)./(1:k)));

>Why did you replace
>round (prod ((v-k+1:v)./(1:k)))
>by
>round (exp (sum (log (v-k+1:v)) - sum (log (2:k))))
>?

I had included a "Discussion" section at the beginning of my mail
describing a short rationale for every change.  Please have a look at it
also for discussion of other changes before installing this changeset,
as some changes may be objectionable, and feel free to change what you
think is wrong or suboptimal.

For this specific change, here is a more in-depth discussion.

When comparing nchoosek to bincoeff, it turns out that the first one is
thought for integer and not very big numbers, while the second is
tailored for fractional and big numbers.  In fact, bincoeff uses the
gamma function, which allows real arguments, and is faster for big
numbers, because nchoosek creates an array to be summed or multiplied.

So I think it is important for nchoosek to emit a warning when the
result is not exact (Matlab does the same).  However, as far as I can
tell, the error bounds for the two expressions above (the prod and the
sum of logs) is the same, so using one or the other is the same, as long
as we want to have an exact result.  On the other hand, the sum of logs
avoids overflow in some cases, such as v=300, k=140.

Important: this assumes that the log and exp functions produce a
relative error near eps, which I am not sure is true in Octave's
implementation.  If this assumption is wrong, then the prod
implementation is preferable.

To summarise, both implementations give the same results as long as we
are interested in an exact result.  If we can live with an approximate
result, the prod implementation is on the average slightly more accurate
while the sum of logs can handle some more cases without overflow.

Some more considerations:

rethinking about it, the above expression
  if (A > 1/(exp (2*k*eps) - 1))
is equivalent with the much simpler
  if (A*2*k*eps > 1)
moreover, this is not entirely accurate and should be instead:
  if (A*2*k*eps > 0.5)

Note that Matlab is not so picky about rounding errors, and as long as
its help page is accurate, gives a warning only when (A > 1e15).

Please let me know your comments.

-- 
Francesco Potortì (ricercatore)        Voice: +39 050 315 3058 (op.2111)
ISTI - Area della ricerca CNR          Fax:   +39 050 315 2040
via G. Moruzzi 1, I-56124 Pisa         Email: address@hidden
(entrance 20, 1st floor, room C71)     Web:   http://fly.isti.cnr.it/


reply via email to

[Prev in Thread] Current Thread [Next in Thread]