swarm-modeling

Re: Randomly initializing a N-element vector


From: Dwight W. Read
Subject: Re: Randomly initializing a N-element vector
Date: Sun, 10 Oct 1999 15:40:51 +0100


REPLACE MY PREVIOUS EMAIL BY THIS ONE.  THE PREVIOUS EMAIL USED "n" IN TWO DIFFERENT WAYS AND THE SUGGESTED SAMPLING SCHEME IS VALID BUT IMPRACTICAL.  D. Read

Dave Koelle suggested that the sampling be done by selecting the first coordinate, x1, of the vector from a uniform distribution over the interval [0,1], then the second coordinate, x2, from a uniform distribution over the interval [0, 1-x1], . . . , the (n-1)th coordinate, xn-1, from a uniform distribution over the interval [0, 1-x1-x2-...-xn-2], and finally setting the nth coordinate xn = 1 - (x1+x2+...+xn-1).
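For concreteness, the scheme described above can be sketched in a few lines of Python. The function name sequential_sample and its rng parameter are my own labels for illustration, not anything from the original posts.

```python
import random

def sequential_sample(n, rng=random):
    """Draw one n-vector by the sequential scheme described above.

    Hypothetical helper for illustration only.
    """
    remaining = 1.0          # what is left of the total after earlier draws
    xs = []
    for _ in range(n - 1):
        x = rng.uniform(0.0, remaining)   # uniform on [0, 1 - (sum so far)]
        xs.append(x)
        remaining -= x
    xs.append(remaining)     # last coordinate forces the sum to equal 1
    return xs

if __name__ == "__main__":
    print(sequential_sample(4))
```

Every vector it returns satisfies the constraint; the question taken up below is whether the vectors are spread uniformly over the constraint set.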

This sampling scheme won't work for the problem as it was posed.

Here's why.

(1) Let T be the space defined by the constraint x1 + x2 + ... + xn = 1, and let S be the space defined by the constraint x1 + x2 + ... + xn-1 <= 1.  The space T is in 1-1 correspondence with the space S under the mapping m : S --> T defined by m((x1, x2, ..., xn-1)) = (x1, x2, ..., xn-1, 1-(x1+...+xn-1)).  Hence we can examine the proposed sampling scheme by examining whether or not it samples the space S uniformly.

(2) To see that the proposed sampling scheme does not sample S uniformly, suppose we set n = 3 and restrict the values of xi to the set {0, 1/2, 1}.  Then S consists of the 6 pairs (0,0), (0, 1/2), (0, 1), (1/2, 0), (1/2, 1/2) and (1,0).  A sampling scheme that samples S uniformly must select each point in S with probability 1/6.  However, the proposed sampling scheme has the following, non-uniform probability distribution:
  Pr[(0,0)] = Pr[(0,1/2)] = Pr[(0,1)] = 1/9
  Pr[(1/2,0)] = Pr[(1/2,1/2)] = 1/6
  Pr[(1,0)] = 1/3.

More generally, if the xi are from the m-element set {0, 1/(m-1), 2/(m-1), ... , 1} then under the proposed sampling scheme, the probability of selecting a point, p, in S is given by: Pr[p] = (1/m)(1/(m-k)), where p = (k/(m-1), _ ), that is, k indexes the first coordinate: x1 is chosen with probability 1/m, and x2 is then chosen uniformly from the m-k values not exceeding 1-x1.  Clearly this does not define a uniform distribution over S.  The space S has m(m+1)/2 points, so any proposed sampling scheme must sample each point in S with probability 2/(m(m+1)) if the sampling scheme is to sample the space T uniformly.
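The discrete probabilities can be checked by exact enumeration. The following is only a sketch for n = 3: the function name scheme_distribution is made up for illustration, points are stored as integer index pairs (k, j) standing for (k/(m-1), j/(m-1)), and k indexes the first coordinate, as in the n = 3 table.

```python
from fractions import Fraction

def scheme_distribution(m):
    """Exact Pr[(x1, x2)] under the sequential scheme for n = 3.

    Hypothetical helper; keys are index pairs (k, j) meaning
    x1 = k/(m-1), x2 = j/(m-1).
    """
    dist = {}
    for k in range(m):            # x1 chosen with probability 1/m
        for j in range(m - k):    # x2 uniform over the m-k grid values <= 1 - x1
            dist[(k, j)] = Fraction(1, m) * Fraction(1, m - k)
    return dist

dist = scheme_distribution(3)
for (k, j), p in sorted(dist.items()):
    print(f"Pr[({Fraction(k, 2)}, {Fraction(j, 2)})] = {p}")
print("points in S:", len(dist), "| uniform value:", Fraction(1, len(dist)))
```

Calling scheme_distribution with a larger m gives the same comparison on a finer grid; the number of keys is m(m+1)/2 in each case.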

The query about bias in the last coordinate towards small numbers must first take into account that the last coordinate will tend to be small even in a uniform distribution over S.  In the example, if the 6 points were equally probable, then the probability of the 2nd coordinate being 0 is 9/18, the probability of the 2nd coordinate being 1/2 is 6/18 and the probability that the 2nd coordinate is 1 is 3/18.

For the proposed sampling scheme, the probability that the 2nd coordinate is 0 is 11/18, the probability that the 2nd coordinate is 1/2 is 5/18, and the probability that the 2nd coordinate is 1 is 2/18, so there is a bias towards small numbers in the last coordinate under the proposed sampling scheme over what would occur with a uniform distribution over S.
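The two sets of marginal probabilities for the 2nd coordinate can be reproduced with exact arithmetic. This is a sketch for the n = 3, {0, 1/2, 1} example only; the variable names are mine, and points are held as integer index pairs (k, j) in units of 1/2.

```python
from collections import defaultdict
from fractions import Fraction

m = 3  # grid {0, 1/2, 1}
points = [(k, j) for k in range(m) for j in range(m - k)]  # the 6 points of S

uniform = defaultdict(Fraction)   # marginal of x2 if all points are equally likely
scheme = defaultdict(Fraction)    # marginal of x2 under the sequential scheme
for k, j in points:
    uniform[j] += Fraction(1, len(points))
    scheme[j] += Fraction(1, m) * Fraction(1, m - k)

for j in range(m):
    print(f"x2 = {Fraction(j, m - 1)}: uniform {uniform[j]}, scheme {scheme[j]}")
```

Fractions print in lowest terms, so 9/18 appears as 1/2, 6/18 as 1/3, and so on.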

More generally, with the xi taken from the m-element set discussed above, Pr[( _ , 0)] = 2/(m+1) --> 0 for a uniform distribution over S, whereas Pr[( _ , 0)] = (1/m)[1/m + 1/(m-1) + ... + 1] for the proposed sampling scheme.  The term in the [ ] is a harmonic sum, which grows like ln m, so under the proposed sampling scheme Pr[( _ , 0)] is on the order of (ln m)/m.  It therefore exceeds the uniform value by a factor that grows like (ln m)/2, and that factor increases without bound as m increases.
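To compare the two expressions for Pr[( _ , 0)] numerically, both can be tabulated for a few grid sizes together with their ratio. This is a sketch; the function names pr_zero_uniform and pr_zero_scheme are hypothetical labels for the two formulas in the paragraph above.

```python
from fractions import Fraction

def pr_zero_uniform(m):
    """Pr[x2 = 0] when all m(m+1)/2 points of S are equally likely."""
    return Fraction(2, m + 1)

def pr_zero_scheme(m):
    """Pr[x2 = 0] under the sequential scheme: (1/m)[1/m + 1/(m-1) + ... + 1]."""
    return Fraction(1, m) * sum(Fraction(1, i) for i in range(1, m + 1))

for m in (3, 10, 100, 1000):
    u, s = pr_zero_uniform(m), pr_zero_scheme(m)
    print(f"m={m:5d}  uniform={float(u):.4f}  scheme={float(s):.4f}  "
          f"ratio={float(s / u):.3f}")
```

The ratio column is the quantity that measures the bias towards 0 relative to the uniform case.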

Randomly rearranging the coordinates won't solve the non-uniformity of the sampling scheme.
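This can be checked in the n = 3, {0, 1/2, 1} example by applying a uniformly random permutation to all three coordinates (x1, x2, x3) and computing the resulting distribution over T exactly. This is a sketch under that assumption about what "rearranging" means; coordinates are held as integers in units of 1/2, and the variable names are mine.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import permutations

m = 3  # grid {0, 1/2, 1}, coordinates stored as 0, 1, 2
shuffled = defaultdict(Fraction)
for k in range(m):
    for j in range(m - k):
        p = Fraction(1, m) * Fraction(1, m - k)   # scheme probability of (x1, x2)
        point = (k, j, m - 1 - k - j)             # append x3 = 1 - x1 - x2
        orderings = list(permutations(point))     # 6 orderings, equally likely
        for q in orderings:
            shuffled[q] += p / len(orderings)     # repeated orderings accumulate

for q, p in sorted(shuffled.items()):
    print(q, p)
print("uniform value would be", Fraction(1, len(shuffled)))
```

The six points of T end up with two different probabilities, neither equal to 1/6, so the shuffled scheme is symmetric in the coordinates but still not uniform.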

Dwight Read



Dwight W. Read, Professor
Department of Anthropology and
Department of Statistics
University of California, Los Angeles
Visiting Professor
Department of Anthropology
University of Kent at Canterbury, Kent, UK
Email: address@hidden
or   address@hidden
Office Phone: (310) 825-3988
FAX: (310) 556-0703
