Policies.RCB module
The RCB, Randomized Confidence Bound, policy for bounded bandits.
Reference: [[“On the Optimality of Perturbations in Stochastic and Adversarial Multi-armed Bandit Problems”, by Baekjin Kim, Ambuj Tewari, arXiv:1902.00610]](https://arxiv.org/pdf/1902.00610.pdf)
class Policies.RCB.RCB(nbArms, perturbation='uniform', lower=0.0, amplitude=1.0, *args, **kwargs)
Bases: Policies.RandomizedIndexPolicy.RandomizedIndexPolicy, Policies.UCBalpha.UCBalpha
The RCB, Randomized Confidence Bound, policy for bounded bandits.
Reference: [[“On the Optimality of Perturbations in Stochastic and Adversarial Multi-armed Bandit Problems”, by Baekjin Kim, Ambuj Tewari, arXiv:1902.00610]](https://arxiv.org/pdf/1902.00610.pdf)
__module__ = 'Policies.RCB'
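The class description above does not spell out how the index is computed. As a rough illustration only, the sketch below shows one way a randomized confidence bound could work: the usual UCB-style exploration bonus is multiplied by a random draw before being added to the empirical mean. The helper names `rcb_index` and `choose_arm`, the `alpha` parameter and its default, and the choice of a uniform draw on [0, 1) are assumptions made for this example. They are not taken from the Policies.RCB source and do not use the `lower` or `amplitude` arguments.

```python
import math
import random


def rcb_index(mean, pulls, t, alpha=4.0, rng=random):
    """Hypothetical helper: empirical mean plus a randomly scaled UCB-style bonus."""
    if pulls < 1:
        # An arm that was never pulled gets an infinite index, so it is tried first.
        return float("inf")
    # Deterministic UCB-alpha style bonus (assumed form, for illustration).
    bonus = math.sqrt(alpha * math.log(t) / (2.0 * pulls))
    # Uniform perturbation in [0, 1): the "randomized" part of the bound.
    z = rng.random()
    return mean + z * bonus


def choose_arm(means, pulls, t, alpha=4.0, rng=random):
    """Hypothetical helper: pick the arm with the largest randomized index."""
    indexes = [rcb_index(m, n, t, alpha, rng) for m, n in zip(means, pulls)]
    return max(range(len(indexes)), key=indexes.__getitem__)
```

With a uniform draw, the index always lies between the empirical mean and the deterministic upper confidence bound, so an arm is never scored more optimistically than it would be without the perturbation. Passing a seeded `random.Random` instance as `rng` makes the choices reproducible.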