Policies.BasePolicy module

Base class for any policy.

  • If rewards are not in [0, 1], be sure to give the lower value and the amplitude. For example, if rewards are in [-3, 3], use lower = -3 and amplitude = 6.
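The role of lower and amplitude can be illustrated with a short sketch. The helper name `normalize_reward` is hypothetical and not part of the module. It only shows the affine rescaling that the [-3, 3] example implies.

```python
def normalize_reward(reward, lower=0.0, amplitude=1.0):
    """Map a reward from [lower, lower + amplitude] to [0, 1].

    Hypothetical helper for illustration, not part of Policies.BasePolicy.
    """
    return (reward - lower) / amplitude


# Rewards in [-3, 3]: lower = -3 and amplitude = 3 - (-3) = 6
lower, amplitude = -3.0, 6.0
print(normalize_reward(-3.0, lower, amplitude))  # 0.0
print(normalize_reward(0.0, lower, amplitude))   # 0.5
print(normalize_reward(3.0, lower, amplitude))   # 1.0
```

Note that amplitude is the width of the reward range, not its upper bound.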

Policies.BasePolicy.CHECKBOUNDS = False

If True, every time a reward is received, a warning message is displayed if it lies outside of [lower, lower + amplitude].
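A minimal sketch of what such a bounds check might look like is given here. The function `check_bounds` and the local `CHECKBOUNDS` variable are stand-ins written for illustration. How the module actually emits its warning is not shown in this page, so the use of `warnings.warn` is an assumption.

```python
import warnings

CHECKBOUNDS = True  # local stand-in for the module-level flag


def check_bounds(reward, lower=0.0, amplitude=1.0):
    """Return True if reward lies in [lower, lower + amplitude].

    Hypothetical helper: warns only when CHECKBOUNDS is True.
    """
    inside = lower <= reward <= lower + amplitude
    if CHECKBOUNDS and not inside:
        warnings.warn(
            f"reward {reward} lies outside [{lower}, {lower + amplitude}]"
        )
    return inside


check_bounds(0.5)                              # inside [0, 1], no warning
check_bounds(4.0, lower=-3.0, amplitude=6.0)   # outside [-3, 3], warns
```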

class Policies.BasePolicy.BasePolicy(nbArms, lower=0.0, amplitude=1.0)[source]

Bases: object

Base class for any policy.

__init__(nbArms, lower=0.0, amplitude=1.0)[source]

New policy.

nbArms = None

Number of arms

lower = None

Lower bound of the rewards

amplitude = None

Amplitude of the rewards (width of the range, so rewards lie in [lower, lower + amplitude])

t = None

Internal time

pulls = None

Number of pulls of each arm

rewards = None

Cumulated rewards of each arm


__str__()[source]

-> str


startGame()[source]

Start the game (fill pulls and rewards with 0).

getReward(arm, reward)[source]

Give a reward: increment t and the pull count of that arm, and add the reward (normalized to [0, 1]) to the cumulated sum of rewards for that arm.
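The bookkeeping described for startGame and getReward can be sketched as follows. `SketchPolicy` is a hypothetical stand-in written for this page, not the library's class: it uses plain lists, and it assumes that startGame also resets t, which the description above does not state.

```python
class SketchPolicy:
    """Hypothetical stand-in mimicking the documented attributes."""

    def __init__(self, nbArms, lower=0.0, amplitude=1.0):
        self.nbArms = nbArms
        self.lower = lower
        self.amplitude = amplitude
        self.t = 0
        self.pulls = [0] * nbArms
        self.rewards = [0.0] * nbArms

    def startGame(self):
        # Fill pulls and rewards with 0 (resetting t is an assumption)
        self.t = 0
        self.pulls = [0] * self.nbArms
        self.rewards = [0.0] * self.nbArms

    def getReward(self, arm, reward):
        # Increment time and the pull count, then add the normalized reward
        self.t += 1
        self.pulls[arm] += 1
        self.rewards[arm] += (reward - self.lower) / self.amplitude


policy = SketchPolicy(nbArms=3, lower=-3.0, amplitude=6.0)
policy.startGame()
policy.getReward(1, 3.0)  # normalized to 1.0
policy.getReward(1, 0.0)  # normalized to 0.5
print(policy.t, policy.pulls, policy.rewards)  # 2 [0, 2, 0] [0.0, 1.5, 0.0]
```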


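choice()[source]

Not defined.

choiceWithRank(rank=1)[source]

Not defined.

choiceFromSubSet(availableArms='all')[source]

Not defined.

choiceMultiple(nb=1)[source]

Not defined.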

__dict__ = mappingproxy({'__module__': 'Policies.BasePolicy', '__doc__': ' Base class for any policy.', '__init__': <function BasePolicy.__init__>, '__str__': <function BasePolicy.__str__>, 'startGame': <function BasePolicy.startGame>, 'getReward': <function BasePolicy.getReward>, 'choice': <function BasePolicy.choice>, 'choiceWithRank': <function BasePolicy.choiceWithRank>, 'choiceFromSubSet': <function BasePolicy.choiceFromSubSet>, 'choiceMultiple': <function BasePolicy.choiceMultiple>, 'choiceIMP': <function BasePolicy.choiceIMP>, 'estimatedOrder': <function BasePolicy.estimatedOrder>, '__dict__': <attribute '__dict__' of 'BasePolicy' objects>, '__weakref__': <attribute '__weakref__' of 'BasePolicy' objects>})
__module__ = 'Policies.BasePolicy'

__weakref__

List of weak references to the object (if defined)

choiceIMP(nb=1, startWithChoiceMultiple=True)[source]

Not defined.
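Methods documented as "Not defined." are presumably meant to be overridden by concrete policies. The sketch below illustrates that pattern with two hypothetical classes, `SketchBase` and `UniformRandom`. Raising `NotImplementedError` in the base class is an assumption about how "not defined" is signalled, not something this page confirms.

```python
import random


class SketchBase:
    """Hypothetical stand-in for a base policy with an undefined choice()."""

    def __init__(self, nbArms):
        self.nbArms = nbArms

    def choice(self):
        # Assumption: an undefined method signals that subclasses must override it
        raise NotImplementedError("choice() must be defined by a subclass")


class UniformRandom(SketchBase):
    """Hypothetical concrete policy: pick an arm uniformly at random."""

    def choice(self):
        return random.randrange(self.nbArms)


policy = UniformRandom(nbArms=4)
arm = policy.choice()  # an integer in 0..3
```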


estimatedOrder()[source]

Return the estimated order of the arms, as a permutation on [0..K-1] (with K = nbArms) that would order the arms by increasing means.

  • For a base policy, it is completely random.
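To make the two cases concrete, here is a small sketch using only the standard library. Both function names are hypothetical. The first returns a uniformly random permutation, as described for a base policy. The second shows what ordering by increasing means would look like for a policy that does have estimates of the means.

```python
import random


def random_order(nbArms):
    """Uniformly random permutation of [0..nbArms-1] (hypothetical helper)."""
    order = list(range(nbArms))
    random.shuffle(order)
    return order


def order_by_increasing_means(means):
    """Arm indices sorted so that their estimated means are increasing."""
    return sorted(range(len(means)), key=lambda arm: means[arm])


print(sorted(random_order(5)))                     # [0, 1, 2, 3, 4]
print(order_by_increasing_means([0.7, 0.1, 0.4]))  # [1, 2, 0]
```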