Policies.IMED module¶
The IMED policy of [Honda & Takemura, JMLR 2015].
Reference: [[“Non-asymptotic analysis of a new bandit algorithm for semi-bounded rewards”, J. Honda and A. Takemura, JMLR, 2015](http://jmlr.csail.mit.edu/papers/volume16/honda15a/honda15a.pdf)].
-
Policies.IMED.Dinf(x=None, mu=None, kl=CPUDispatcher(<function klBern>), lowerbound=0, upperbound=1, precision=1e-06, max_iterations=50)[source]¶ The generic Dinf index computation.
x: value of the cumulative reward,
mu: the bound on the mean y (in the formula below, y ranges over max(mu, lowerbound) ≤ y ≤ upperbound),
kl: the KL divergence to be used (klBern(), klGauss(), etc),
lowerbound=0, upperbound=1: the known bounds of the values y and x,
precision=1e-6: the threshold at which to stop the search,
max_iterations: maximum number of iterations of the loop (bounding it is safer and limits the time complexity).
\[D_{\inf}(x, \mu) \simeq \inf_{\max(\mu, \mathrm{lowerbound}) \leq y \leq \mathrm{upperbound}} \mathrm{kl}(x, y).\]
Note
It first calls scipy.optimize.minimize_scalar(). If that fails, it falls back to a bisection search, with one call to kl for each step of the bisection.
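To make the constrained infimum concrete, here is a minimal standard-library sketch. It is not the library's implementation: it replaces the scipy.optimize.minimize_scalar() call and the bisection fallback with a plain ternary search, which is only valid when kl(x, y) is unimodal in y (true for the Bernoulli divergence). The names kl_bern and dinf_sketch are hypothetical and chosen for illustration only.

```python
import math


def kl_bern(x, y, eps=1e-15):
    """Bernoulli KL divergence kl(x, y), with both arguments clipped into (0, 1)."""
    x = min(max(x, eps), 1 - eps)
    y = min(max(y, eps), 1 - eps)
    return x * math.log(x / y) + (1 - x) * math.log((1 - x) / (1 - y))


def dinf_sketch(x, mu, kl=kl_bern, lowerbound=0.0, upperbound=1.0,
                precision=1e-6, max_iterations=50):
    """Approximate the infimum of kl(x, y) over max(mu, lowerbound) <= y <= upperbound.

    Assumption (not from the library): kl(x, .) is unimodal on the interval,
    so a ternary search is enough.
    """
    lo, hi = max(mu, lowerbound), upperbound
    if lo >= hi:
        # Degenerate interval: only one admissible value of y.
        return kl(x, hi)
    for _ in range(max_iterations):
        if hi - lo <= precision:
            break
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        # Keep the two thirds of the interval that must contain the minimizer.
        if kl(x, m1) <= kl(x, m2):
            hi = m2
        else:
            lo = m1
    return kl(x, (lo + hi) / 2)
```

With these definitions, dinf_sketch(0.3, 0.5) is close to kl_bern(0.3, 0.5), because kl(0.3, y) increases for y above 0.3 and the infimum is reached at the left end of the interval. dinf_sketch(0.7, 0.5) is close to 0, because y = 0.7 lies inside the interval.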
-
class
Policies.IMED.IMED(nbArms, tolerance=0.0001, kl=CPUDispatcher(<function klBern>), lower=0.0, amplitude=1.0)[source]¶ Bases:
Policies.DMED.DMED
The IMED policy of [Honda & Takemura, JMLR 2015].
Reference: [[“Non-asymptotic analysis of a new bandit algorithm for semi-bounded rewards”, J. Honda and A. Takemura, JMLR, 2015](http://jmlr.csail.mit.edu/papers/volume16/honda15a/honda15a.pdf)].
-
__init__(nbArms, tolerance=0.0001, kl=CPUDispatcher(<function klBern>), lower=0.0, amplitude=1.0)[source]¶ New policy.
-
one_Dinf(x, mu)[source]¶ Compute the \(D_{\inf}\) solution for one value of x and one value of mu.
-
Dinf(xs, mu)[source]¶ Compute the \(D_{\inf}\) solution for a vector of values xs and one value of mu.
-
choice()[source]¶ Choose an arm with minimal index (uniformly at random among the minimizers):
\[A(t) \sim U(\arg\min_{1 \leq k \leq K} I_k(t)).\]
where the indexes are:
\[I_k(t) = N_k(t) D_{\inf}(\hat{\mu}_k(t), \max_{k'} \hat{\mu}_{k'}(t)) + \log(N_k(t)).\]
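The index rule can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the library's code: rewards are Bernoulli with lowerbound = 0, every arm has already been pulled at least once (so that log(N_k) is defined), and D_inf(x, mu) is replaced by its closed form kl(x, mu), which holds for the Bernoulli divergence whenever x ≤ mu. The names kl_bern, imed_indexes and imed_choice are hypothetical.

```python
import math
import random


def kl_bern(x, y, eps=1e-15):
    """Bernoulli KL divergence kl(x, y), with both arguments clipped into (0, 1)."""
    x = min(max(x, eps), 1 - eps)
    y = min(max(y, eps), 1 - eps)
    return x * math.log(x / y) + (1 - x) * math.log((1 - x) / (1 - y))


def imed_indexes(means, pulls):
    """Return I_k = N_k * D_inf(mean_k, best_mean) + log(N_k) for every arm.

    Assumption: Bernoulli rewards, so D_inf(x, mu) = kl(x, mu) for x <= mu.
    Assumption: every entry of `pulls` is at least 1.
    """
    best = max(means)
    return [n * kl_bern(m, best) + math.log(n) for m, n in zip(means, pulls)]


def imed_choice(means, pulls, rng=random):
    """Pick uniformly at random among the arms with the smallest index."""
    indexes = imed_indexes(means, pulls)
    smallest = min(indexes)
    candidates = [k for k, value in enumerate(indexes) if value == smallest]
    return rng.choice(candidates)
```

For example, with means [0.2, 0.5, 0.9] and 10 pulls per arm, the empirically best arm has index log(10) and is selected. With means [0.5, 0.6] and pulls [1, 1000], the rarely pulled arm 0 is selected, because the log(N_k) term penalizes the heavily sampled arm.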
-
__module__= 'Policies.IMED'¶