[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Octave-bug-tracker] [bug #54619] randi() is biased
From: |
Rik |
Subject: |
[Octave-bug-tracker] [bug #54619] randi() is biased |
Date: |
Thu, 6 Sep 2018 14:24:07 -0400 (EDT) |
User-agent: |
Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0 |
Follow-up Comment #3, bug #54619 (project octave):
the rand from Octave is implemented in liboctave/numeric/randmtzig.cc. As far
as I can tell, there is no 64-bit integer that is cast to double. Similarly,
there is no "division + floor" in randi. There is multiplication, and floor
is used to remove the fractional part. Because I can't verify some of the
first assertions, I can't go on to accept the conclusions that follow. The
Stack Overflow response has this comment
The suggestion to use floating-point division is mathematically plausible but
suffers from rounding issues in principle. Perhaps double is high-enough
precision to make it work; perhaps not. I don't know and I don't want to have
to figure it out; in any case, the answer is system-dependent.
So it's not clear, even to the original author, that Octave's current system
doesn't work well enough. Before any large project is undertaken it makes
sense to have an idea of the costs and the benefits. It seems to me that the
benefits need to be quantified. How large is the bias of the existing system
(if it even exists) that is implemented in Octave?
Also, relevant, are we any worse than the competition? According to The
Mathworks, the algorithm Octave implements is the one suggested for random
double numbers (http://www.mathworks.com/help/matlab/ref/rand.html)
Generate a 10-by-1 column vector of uniformly distributed numbers in the
interval (-5,5).
r = -5 + (5+5)*rand(10,1)
r = 10×1
3.1472
4.0579
-3.7301
4.1338
1.3236
-4.0246
-2.2150
0.4688
4.5751
4.6489
In general, you can generate N random numbers in the interval (a,b) with the
formula r = a + (b-a).*rand(N,1).
For random integers, The Mathworks suggestion is to use randi(), but I don't
know if they implement something special there or not.
As an attempt at quantifying the bias, I used
x = randi ([1, 10], 1e8, 1);
[nn,xx] = hist (x, 1:10);
errpct = abs (nn - 1e7) / 1e7 * 100
errpct =
Columns 1 through 8:
0.0376100 0.0484100 0.0173600 0.0174000 0.0120900 0.0249200
0.0092400 0.0054200
Columns 9 and 10:
0.0250300 0.0050400
So, as much as .05% deviation, but this could be statistical fluctuations.
Running a second time I get
errpct =
Columns 1 through 8:
0.0057600 0.0249700 0.0160600 0.0201400 0.0295600 0.0249300
0.0148500 0.0361300
Columns 9 and 10:
0.0643700 0.0242500
This time it is a .06% deviation, but it is not in the same bin as before so
it doesn't look like systematic bias.
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?54619>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
- [Octave-bug-tracker] [bug #54619] randi() is biased, anonymous, 2018/09/04
- [Octave-bug-tracker] [bug #54619] randi() is biased, Rik, 2018/09/05
- [Octave-bug-tracker] [bug #54619] randi() is biased, anonymous, 2018/09/06
- [Octave-bug-tracker] [bug #54619] randi() is biased,
Rik <=
- [Octave-bug-tracker] [bug #54619] randi() is biased, anonymous, 2018/09/06
- [Octave-bug-tracker] [bug #54619] randi() is biased, Rik, 2018/09/06
- [Octave-bug-tracker] [bug #54619] randi() is biased, Juan Pablo Carbajal, 2018/09/06
- [Octave-bug-tracker] [bug #54619] randi() is biased, Michael Godfrey, 2018/09/06
- [Octave-bug-tracker] [bug #54619] randi() is biased, Michael Godfrey, 2018/09/07
- [Octave-bug-tracker] [bug #54619] randi() is biased, Rik, 2018/09/07
- [Octave-bug-tracker] [bug #54619] randi() is biased, Michael Godfrey, 2018/09/07
- [Octave-bug-tracker] [bug #54619] randi() is biased, anonymous, 2018/09/07
- [Octave-bug-tracker] [bug #54619] randi() is biased, Michael Leitner, 2018/09/07
- [Octave-bug-tracker] [bug #54619] randi() is biased, Juan Pablo Carbajal, 2018/09/07