algorithm - Choosing k out of n -
i want choose k
elements uniformly @ random out of possible n
without choosing same number twice. there 2 trivial approaches this.
- make list of
n
possibilities. shuffle them (you don't need shufflen
numbersk
of them performing firstk
steps of fisher yates). choose firstk
. approach takeso(k)
time (assuming allocating array of sizen
takeso(1)
time) ,o(n)
space. problem ifk
small relativen
. - store set of seen elements. choose number @ random
[0, n-1]
. while element in set choose new number. approach takeso(k)
space. run-time little more complicated analyze. ifk = theta(n)
run-timeo(k*lg(k))=o(n*lg(n))
because coupon collector's problem. ifk
small relativen
takes moreo(k)
because of probability (albeit low) of choosing same number twice. better above solution in terms of space worse in terms of run-time.
my question:
is there o(k)
time, o(k)
space algorithm k
, n
?
with o(1) hash table, partial fisher-yates method can made run in o(k) time , space. trick store changed elements of array in hash table.
here's simple example in java:
public static int[] getrandomselection (int k, int n, random rng) { if (k > n) throw new illegalargumentexception( "cannot choose " + k + " elements out of " + n + "." ); hashmap<integer, integer> hash = new hashmap<integer, integer>(2*k); int[] output = new int[k]; (int = 0; < k; i++) { int j = + rng.nextint(n - i); output[i] = (hash.containskey(j) ? hash.remove(j) : j); if (j > i) hash.put(j, (hash.containskey(i) ? hash.remove(i) : i)); } return output; }
this code allocates hashmap of 2×k buckets store modified elements (which should enough ensure hash table never rehashed), , runs partial fisher-yates shuffle on it.
here's quick test on ideone; picks 2 elements out of 3 30,000 times, , counts number of times each pair of elements gets chosen. unbiased shuffle, each ordered pair should appear approximately 5,000 (±100 or so) times, except impossible cases both elements equal.
Comments
Post a Comment