PoliciesMultiPlayers.DepRound module¶
DepRound()
: implementation of the dependent rounding procedure, from [[Dependent rounding and its applications to approximation algorithms, by R Gandhi, S Khuller, S Parthasarathy, Journal of the ACM, 2006](http://dl.acm.org/citation.cfm?id=1147956)].
It solves the problem of efficiently selecting a set of \(k\) distinct actions from \(\{1,\dots,K\}\), while satisfying the condition that each action \(i\) is selected with probability \(p_i\) exactly.
The distribution \((p_1, \dots, p_K)\) on \(\{1,\dots,K\}\) is assumed to be given.
Dependent rounding developed by [Gandhi et al.] is a kind of technique that randomly selects a set of edges from a bipartite graph under some cardinality constraints.
It runs in \(\mathcal{O}(K)\) space complexity, and at most \(\mathcal{O}(K^2)\) time complexity (note that the article [Uchiya et al., 2010] wrongly claim it is in \(\mathcal{O}(K)\)).
References: see also https://www.cs.umd.edu/~samir/grant/jacm06.pdf
-
PoliciesMultiPlayers.DepRound.
random
() → x in the interval [0, 1).¶
-
PoliciesMultiPlayers.DepRound.
DepRound
(weights_p, k=1)[source]¶ [[Algorithms for adversarial bandit problems with multiple plays, by T.Uchiya, A.Nakamura and M.Kudo, 2010](http://hdl.handle.net/2115/47057)] Figure 5 (page 15) is a very clean presentation of the algorithm.
Inputs: \(k < K\) and weights_p \(= (p_1, \dots, p_K)\) such that \(\sum_{i=1}^{K} p_i = k\) (or \(= 1\)).
Output: A subset of \(\{1,\dots,K\}\) with exactly \(k\) elements. Each action \(i\) is selected with probability exactly \(p_i\).
Example:
>>> import numpy as np; import random >>> np.random.seed(0); random.seed(0) # for reproductibility! >>> K = 5 >>> k = 2
>>> weights_p = [ 2, 2, 2, 2, 2 ] # all equal weights >>> DepRound(weights_p, k) [3, 4] >>> DepRound(weights_p, k) [3, 4] >>> DepRound(weights_p, k) [0, 1]
>>> weights_p = [ 10, 8, 6, 4, 2 ] # decreasing weights >>> DepRound(weights_p, k) [0, 4] >>> DepRound(weights_p, k) [1, 2] >>> DepRound(weights_p, k) [3, 4]
>>> weights_p = [ 3, 3, 0, 0, 3 ] # decreasing weights >>> DepRound(weights_p, k) [0, 4] >>> DepRound(weights_p, k) [0, 4] >>> DepRound(weights_p, k) [0, 4] >>> DepRound(weights_p, k) [0, 1]
See [[Gandhi et al, 2006](http://dl.acm.org/citation.cfm?id=1147956)] for the details.