configuration_markovian module

Configuration for the simulations of single-player Markovian problems.

configuration_markovian.CPU_COUNT = 4

Number of CPUs on the local machine.

configuration_markovian.HORIZON = 1000

HORIZON: number of time steps of the experiments. Warning: should be >= 10000 to be interesting “asymptotically”; the default value of 1000 is below that threshold.

configuration_markovian.REPETITIONS = 100

REPETITIONS: number of repetitions of the experiments. Warning: should be >= 10 to be statistically trustworthy.

configuration_markovian.DO_PARALLEL = True

Whether to use parallel computing. Set it to False when profiling the code.

configuration_markovian.N_JOBS = -1

Number of jobs to use for the parallel computations. -1 means all the CPU cores, 1 means no parallelization.
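As an illustration of how DO_PARALLEL and N_JOBS might combine, the following sketch resolves them into an effective number of workers. The helper name effective_workers is hypothetical and not part of this module; it only encodes the two conventions stated above (-1 means all CPU cores, 1 means no parallelization) and rejects any other non-positive value, which is an assumption of this sketch.

```python
import multiprocessing


def effective_workers(do_parallel, n_jobs, cpu_count=None):
    """Hypothetical helper: worker count implied by DO_PARALLEL and N_JOBS."""
    if cpu_count is None:
        cpu_count = multiprocessing.cpu_count()
    if not do_parallel or n_jobs == 1:
        return 1  # sequential execution, the easiest setting to profile
    if n_jobs == -1:
        return cpu_count  # -1 means "use every CPU core"
    if n_jobs >= 1:
        return n_jobs
    # Other negative values are not described in this module's documentation.
    raise ValueError("unsupported n_jobs value: %r" % (n_jobs,))


print(effective_workers(True, -1, cpu_count=4))   # all cores
print(effective_workers(False, -1, cpu_count=4))  # parallelism disabled
```

With the defaults listed here (DO_PARALLEL = True, N_JOBS = -1, CPU_COUNT = 4), this sketch would use four workers.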

configuration_markovian.VARIANCE = 10

Variance of Gaussian arms

configuration_markovian.TEST_Aggregator = True

Whether the Aggregator policy is included in the experiments.

configuration_markovian.configuration = {'environment': [{'arm_type': 'Markovian', 'params': {'rested': False, 'transitions': [{(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.5, (1, 1): 0.5}, [[0.2, 0.8], [0.6, 0.4]]], 'steadyArm': <class 'Arms.Bernoulli.Bernoulli'>}}], 'horizon': 1000, 'n_jobs': -1, 'policies': [{'archtype': <class 'Policies.UCBalpha.UCBalpha'>, 'params': {'alpha': 1}}, {'archtype': <class 'Policies.Thompson.Thompson'>, 'params': {}}, {'archtype': <class 'Policies.klUCB.klUCB'>, 'params': {'klucb': CPUDispatcher(<function klucbBern>)}}, {'archtype': <class 'Policies.BayesUCB.BayesUCB'>, 'params': {}}], 'repetitions': 100, 'verbosity': 6}

This dictionary configures the experiments.
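The 'transitions' entry above lists two two-state Markov chains, one written as a dictionary keyed by (state, next_state) and one as a nested list. To get a feel for what these chains do in the long run, the sketch below computes their stationary distributions in pure Python. The helper names to_matrix and stationary_distribution are hypothetical and are not taken from the library; how the library itself turns a chain into rewards is not shown here.

```python
def to_matrix(transitions):
    """Accept a {(i, j): p} dict or a nested list and return a nested list."""
    if isinstance(transitions, dict):
        n = 1 + max(max(i, j) for i, j in transitions)
        return [[transitions.get((i, j), 0.0) for j in range(n)] for i in range(n)]
    return [list(row) for row in transitions]


def stationary_distribution(transitions, iterations=1000):
    """Approximate the stationary distribution by iterating pi <- pi P."""
    P = to_matrix(transitions)
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


# The two chains copied from the 'transitions' list in the configuration.
chains = [
    {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.5, (1, 1): 0.5},
    [[0.2, 0.8], [0.6, 0.4]],
]
for chain in chains:
    print([round(p, 4) for p in stationary_distribution(chain)])
```

Solving pi = pi P by hand gives (5/8, 3/8) for the first chain and (3/7, 4/7) for the second, which is what the iteration converges to.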

configuration_markovian.nbArms = 3

Number of arms in the first environment

configuration_markovian.klucb

Warning: when using Exponential or Gaussian arms, give klExp or klGauss to KL-UCB-like policies!
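To see why the divergence must match the arm distribution, the sketch below computes a KL-UCB-style upper confidence bound by bisection with two different divergences. The functions kl_bernoulli, kl_gaussian and kl_ucb_index are hypothetical stand-ins written from the textbook formulas; they are not the library's klucbBern, klExp or klGauss, and their signatures are assumptions.

```python
import math


def kl_bernoulli(p, q, eps=1e-15):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_gaussian(x, y, variance=1.0):
    """KL divergence between Gaussians with means x, y and a shared variance."""
    return (x - y) ** 2 / (2.0 * variance)


def kl_ucb_index(mean, budget, kl, upper, precision=1e-6):
    """Largest q in [mean, upper] with kl(mean, q) <= budget, by bisection."""
    low, high = mean, upper
    while high - low > precision:
        mid = (low + high) / 2.0
        if kl(mean, mid) > budget:
            high = mid  # mid is too far from the empirical mean
        else:
            low = mid   # mid is still plausible, search higher
    return (low + high) / 2.0


# Same empirical mean and exploration budget, two different arm models.
bern_index = kl_ucb_index(0.5, 0.1, kl_bernoulli, upper=1.0)
gauss_index = kl_ucb_index(
    0.5, 0.1, lambda x, y: kl_gaussian(x, y, variance=10), upper=50.0
)
print(bern_index, gauss_index)
```

With a variance of 10 (the VARIANCE value above), the Gaussian index sits well above 1, a value a Bernoulli divergence can never produce; using the wrong divergence therefore gives confidence bounds that do not fit the arms.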