Policies package

Policies module: contains all the (single-player) bandit algorithms:


The list above might not be complete; see the details below.

All policies share the same interface, described in BasePolicy, so any of them can be used in an experiment with the following approach:

my_policy = Policy(nbArms)
my_policy.startGame()  # start the game
for t in range(T):
    chosen_arm_t = k_t = my_policy.choice()  # choose one arm
    reward_t     = sampled from an arm k_t   # sample a reward
    my_policy.getReward(k_t, reward_t)       # give it to the policy
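
The loop above can be made concrete with a small self-contained sketch. Everything in it other than the three method names (startGame, choice, getReward) and the nbArms argument is an assumption made for illustration: EpsilonGreedyPolicy, its epsilon and seed parameters, the run_experiment helper and the simulated Bernoulli arms are hypothetical and are not taken from the package.

```python
import random


class EpsilonGreedyPolicy:
    """Hypothetical policy exposing the startGame / choice / getReward interface."""

    def __init__(self, nbArms, epsilon=0.1, seed=None):
        self.nbArms = nbArms
        self.epsilon = epsilon          # assumed exploration rate
        self.rng = random.Random(seed)  # private generator, for reproducibility
        self.startGame()

    def startGame(self):
        """Reset the internal memory before a new game."""
        self.t = 0
        self.pulls = [0] * self.nbArms
        self.rewards = [0.0] * self.nbArms

    def choice(self):
        """Explore with probability epsilon, otherwise play the best empirical arm."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.nbArms)
        # Unplayed arms get an infinite score so each is tried at least once.
        scores = [r / n if n > 0 else float("inf")
                  for r, n in zip(self.rewards, self.pulls)]
        return max(range(self.nbArms), key=scores.__getitem__)

    def getReward(self, arm, reward):
        """Record the reward observed for the arm that was played."""
        self.t += 1
        self.pulls[arm] += 1
        self.rewards[arm] += reward


def run_experiment(policy, arm_means, T, seed=0):
    """Play T rounds against simulated Bernoulli arms; return the total reward."""
    env_rng = random.Random(seed)
    policy.startGame()
    total = 0.0
    for t in range(T):
        k_t = policy.choice()                                           # choose one arm
        reward_t = 1.0 if env_rng.random() < arm_means[k_t] else 0.0    # sample a reward
        policy.getReward(k_t, reward_t)                                 # give it to the policy
        total += reward_t
    return total


arm_means = [0.1, 0.5, 0.9]  # assumed Bernoulli means, for illustration only
my_policy = EpsilonGreedyPolicy(len(arm_means), epsilon=0.1, seed=42)
total_reward = run_experiment(my_policy, arm_means, T=1000)
print(my_policy.pulls, total_reward)
```

Because the experiment loop only relies on the three interface methods, any other object providing them could be passed to run_experiment without changing the loop.
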
Policies.klucb_mapping = {'Bernoulli': CPUDispatcher(<function klucbBern>), 'Exponential': CPUDispatcher(<function klucbExp>), 'Gamma': CPUDispatcher(<function klucbGamma>), 'Gaussian': CPUDispatcher(<function klucbGauss>), 'Poisson': CPUDispatcher(<function klucbPoisson>)}

Maps the name of each arm distribution to the corresponding klucb function.
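
To illustrate how such a name-to-function mapping can be used, here is a minimal pure-Python sketch. It is not the package's implementation: kl_bern, klucb_bern, klucb_gauss, toy_klucb_mapping and their parameters are hypothetical names, and the sketch assumes that a klucb function returns the largest mean q whose divergence from the empirical mean x stays below a budget d.

```python
import math


def kl_bern(p, q, eps=1e-15):
    """Kullback-Leibler divergence between Bernoulli(p) and Bernoulli(q)."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def klucb_bern(x, d, precision=1e-6):
    """Approximate the largest q in [x, 1] with kl_bern(x, q) <= d, by bisection."""
    lower, upper = x, 1.0
    while upper - lower > precision:
        mid = (lower + upper) / 2
        if kl_bern(x, mid) > d:
            upper = mid   # mid is too far from x, shrink from above
        else:
            lower = mid   # mid is still within the budget d
    return (lower + upper) / 2


def klucb_gauss(x, d, sig2x=0.25):
    """Closed form for a Gaussian with variance sig2x, where kl = (x - q)^2 / (2 sig2x)."""
    return x + math.sqrt(2 * sig2x * d)


# Hypothetical mapping, mirroring the idea of looking a function up by arm name.
toy_klucb_mapping = {"Bernoulli": klucb_bern, "Gaussian": klucb_gauss}

index_function = toy_klucb_mapping["Bernoulli"]
upper_bound = index_function(0.5, 0.1)
print(upper_bound)
```

A lookup like this lets one piece of experiment code select the index computation from the arm type given in its configuration, instead of hard-coding one distribution.
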