algorithm - Choosing k out of n -
i want choose k elements uniformly @ random out of possible n without choosing same number twice. there 2 trivial approaches this.
- make list of
npossibilities. shuffle them (you don't need shufflennumberskof them performing firstksteps of fisher yates). choose firstk. approach takeso(k)time (assuming allocating array of sizentakeso(1)time) ,o(n)space. problem ifksmall relativen. - store set of seen elements. choose number @ random
[0, n-1]. while element in set choose new number. approach takeso(k)space. run-time little more complicated analyze. ifk = theta(n)run-timeo(k*lg(k))=o(n*lg(n))because coupon collector's problem. ifksmall relativentakes moreo(k)because of probability (albeit low) of choosing same number twice. better above solution in terms of space worse in terms of run-time.
my question:
is there o(k) time, o(k) space algorithm k , n?
with o(1) hash table, partial fisher-yates method can made run in o(k) time , space. trick store changed elements of array in hash table.
here's simple example in java:
public static int[] getrandomselection (int k, int n, random rng) { if (k > n) throw new illegalargumentexception( "cannot choose " + k + " elements out of " + n + "." ); hashmap<integer, integer> hash = new hashmap<integer, integer>(2*k); int[] output = new int[k]; (int = 0; < k; i++) { int j = + rng.nextint(n - i); output[i] = (hash.containskey(j) ? hash.remove(j) : j); if (j > i) hash.put(j, (hash.containskey(i) ? hash.remove(i) : i)); } return output; } this code allocates hashmap of 2×k buckets store modified elements (which should enough ensure hash table never rehashed), , runs partial fisher-yates shuffle on it.
here's quick test on ideone; picks 2 elements out of 3 30,000 times, , counts number of times each pair of elements gets chosen. unbiased shuffle, each ordered pair should appear approximately 5,000 (±100 or so) times, except impossible cases both elements equal.
Comments
Post a Comment