configuration_comparing_aggregation_algorithms module¶

Configuration for the simulations, for the single-player case, for comparing Aggregation algorithms.

configuration_comparing_aggregation_algorithms.HORIZON = 10000¶: HORIZON : number of time steps of the experiments. Warning: Should be >= 10000 for the results to be interesting “asymptotically”.

configuration_comparing_aggregation_algorithms.REPETITIONS = 4¶: REPETITIONS : number of repetitions of the experiments. Warning: Should be >= 10 to be statistically trustworthy.

configuration_comparing_aggregation_algorithms.DO_PARALLEL = True¶: Whether to use parallel computing. Turn it off (set it to False) to profile the code.

configuration_comparing_aggregation_algorithms.N_JOBS = -1¶: Number of jobs to use for the parallel computations. -1 means all the CPU cores, 1 means no parallelization.
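As a rough illustration of how these two settings could combine, the sketch below resolves an effective number of worker processes. The helper name `effective_workers` is hypothetical, and the rule that `DO_PARALLEL = False` forces a single worker is an assumption, not something this page states.

```python
import os

DO_PARALLEL = True
N_JOBS = -1


def effective_workers(do_parallel, n_jobs):
    """Hypothetical helper: resolve how many worker processes to start."""
    # Assumption: disabling parallelism always means a single worker.
    if not do_parallel or n_jobs == 1:
        return 1
    # -1 means "all the CPU cores"; os.cpu_count() may return None.
    if n_jobs == -1:
        return os.cpu_count() or 1
    return max(1, n_jobs)


print(effective_workers(DO_PARALLEL, N_JOBS))
```

With `DO_PARALLEL = False` this always yields 1, which is the setting suggested for profiling.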

configuration_comparing_aggregation_algorithms.NB_ARMS = 9¶: Number of arms for non-hard-coded problems (Bayesian problems)

configuration_comparing_aggregation_algorithms.RANDOM_SHUFFLE = False¶: If True, the arms are shuffled (shuffle(arms)).

configuration_comparing_aggregation_algorithms.RANDOM_INVERT = False¶: If True, the arms are inverted (arms = arms[::-1]).
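The two reordering flags can be pictured with the following minimal sketch. The function `reorder_arms` and its `seed` parameter are hypothetical names introduced for illustration; when and how often the reordering is applied during a simulation is not specified here.

```python
import random


def reorder_arms(arms, random_shuffle=False, random_invert=False, seed=None):
    """Hypothetical helper: return a reordered copy of the list of arms."""
    arms = list(arms)
    if random_shuffle:
        # Same effect as shuffle(arms), with an optional seed for reproducibility.
        random.Random(seed).shuffle(arms)
    if random_invert:
        arms = arms[::-1]
    return arms


means = [0.1, 0.5, 0.9]
print(reorder_arms(means, random_invert=True))
```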

configuration_comparing_aggregation_algorithms.NB_RANDOM_EVENTS = 5¶: Number of random events. They are uniformly spaced in time steps.
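One way to read "uniformly spaced" is sketched below. The helper `event_times` is hypothetical, and the exact placement (for instance whether the first or last time step is included) is an assumption.

```python
HORIZON = 10000
NB_RANDOM_EVENTS = 5


def event_times(horizon, nb_events):
    """Hypothetical helper: evenly spaced time steps in [0, horizon)."""
    step = horizon // nb_events
    return [k * step for k in range(nb_events)]


print(event_times(HORIZON, NB_RANDOM_EVENTS))
```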

configuration_comparing_aggregation_algorithms.CACHE_REWARDS = False¶: Should we cache rewards? If so, the random rewards will be the same for all the REPETITIONS simulations for each algorithm.

configuration_comparing_aggregation_algorithms.UPDATE_ALL_CHILDREN = False¶: Should the Aggregator policy update the trust in every child, or only in the child it trusted for the last decision?

configuration_comparing_aggregation_algorithms.UNBIASED = True¶: Should the Aggregator policy use a biased estimator of the rewards, i.e. just r_t, or an unbiased estimator, r_t / p_t?
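The difference between the two estimators can be checked on a small example. In this sketch, `estimated_reward` is a hypothetical helper, and p_t is taken to be the probability with which the played arm was selected (an assumption about the notation). Since the reward is only observed with probability p_t, dividing by p_t restores the true mean in expectation.

```python
def estimated_reward(r_t, p_t, unbiased=True):
    """Hypothetical helper: reward estimate fed to the trust update."""
    return r_t / p_t if unbiased else r_t


r_t, p_t = 0.8, 0.25

# The arm is played with probability p_t (estimate observed), else the estimate is 0.
expected_unbiased = p_t * estimated_reward(r_t, p_t, unbiased=True)
expected_biased = p_t * estimated_reward(r_t, p_t, unbiased=False)

print(expected_unbiased, expected_biased)
```

The unbiased variant recovers r_t in expectation, at the cost of a larger variance when p_t is small.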

configuration_comparing_aggregation_algorithms.UPDATE_LIKE_EXP4 = False¶: Should we update the trust probabilities like in Exp4, or like in my initial Aggregator proposal?
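For orientation, here is a generic exponential-weights update in the style of Exp4, assuming each child recommends a single arm. It is only a sketch: the function `exp4_style_update` and the parameter `rate` are hypothetical, and this is not claimed to match the Aggregator implementation or the alternative update mentioned above.

```python
import math


def exp4_style_update(trusts, children_choices, chosen_arm, reward, rate):
    """Hypothetical sketch: reweight children that recommended the played arm."""
    pairs = list(zip(trusts, children_choices))
    # Probability that the aggregated policy played chosen_arm.
    p = sum(t for t, c in pairs if c == chosen_arm)
    estimate = reward / p  # importance-weighted (unbiased) reward estimate
    weights = [t * math.exp(rate * estimate) if c == chosen_arm else t
               for t, c in pairs]
    total = sum(weights)
    return [w / total for w in weights]  # renormalize to a probability vector


trusts = [0.25, 0.25, 0.25, 0.25]
new_trusts = exp4_style_update(trusts, [0, 1, 1, 2], chosen_arm=1, reward=1.0, rate=0.1)
print(new_trusts)
```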

configuration_comparing_aggregation_algorithms.TRUNC = 1¶: Truncation parameter, i.e. amplitude, for Exponential arms

configuration_comparing_aggregation_algorithms.VARIANCE = 0.05¶: Variance of Gaussian arms

configuration_comparing_aggregation_algorithms.MINI = 0¶: lower bound on rewards from Gaussian arms

configuration_comparing_aggregation_algorithms.MAXI = 1¶: upper bound on rewards from Gaussian arms, i.e. amplitude = MAXI - MINI = 1
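A minimal sketch of how VARIANCE, MINI and MAXI might shape a Gaussian reward follows. It assumes VARIANCE is a variance (so the standard deviation is its square root) and that out-of-range samples are clipped to [MINI, MAXI]; both are assumptions, and `draw_gaussian_reward` is a hypothetical helper.

```python
import math
import random

VARIANCE = 0.05
MINI, MAXI = 0, 1


def draw_gaussian_reward(mean, rng):
    """Hypothetical helper: Gaussian sample clipped to [MINI, MAXI]."""
    sample = rng.gauss(mean, math.sqrt(VARIANCE))
    return min(MAXI, max(MINI, sample))


rng = random.Random(0)
rewards = [draw_gaussian_reward(0.5, rng) for _ in range(1000)]
print(min(rewards), max(rewards))
```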

configuration_comparing_aggregation_algorithms.SCALE = 1¶: Scale of Gamma arms

configuration_comparing_aggregation_algorithms.ARM_TYPE¶: alias of Arms.Bernoulli.Bernoulli

configuration_comparing_aggregation_algorithms.configuration = {'cache_rewards': False, 'environment': [{'arm_type': <class 'Arms.Bernoulli.Bernoulli'>, 'params': [0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6, 0.7000000000000001, 0.8, 0.9]}], 'horizon': 10000, 'n_jobs': -1, 'nb_random_events': 5, 'policies': [{'archtype': <class 'Policies.Aggregator.Aggregator'>, 'params': {'children': [{'archtype': <class 'Policies.UCBalpha.UCBalpha'>, 'params': {'alpha': 1, 'lower': 0, 'amplitude': 1}}, {'archtype': <class 'Policies.Thompson.Thompson'>, 'params': {'lower': 0, 'amplitude': 1}}, {'archtype': <class 'Policies.klUCB.klUCB'>, 'params': {'lower': 0, 'amplitude': 1, 'klucb': CPUDispatcher(<function klucbBern>)}}, {'archtype': <class 'Policies.klUCB.klUCB'>, 'params': {'lower': 0, 'amplitude': 1, 'klucb': CPUDispatcher(<function klucbExp>)}}, {'archtype': <class 'Policies.klUCB.klUCB'>, 'params': {'lower': 0, 'amplitude': 1, 'klucb': <function klucbGauss>}}, {'archtype': <class 'Policies.BayesUCB.BayesUCB'>, 'params': {'lower': 0, 'amplitude': 1}}], 'unbiased': True, 'update_all_children': False, 'decreaseRate': 'auto', 'update_like_exp4': False}}, {'archtype': <class 'Policies.Aggregator.Aggregator'>, 'params': {'children': [{'archtype': <class 'Policies.UCBalpha.UCBalpha'>, 'params': {'alpha': 1, 'lower': 0, 'amplitude': 1}}, {'archtype': <class 'Policies.Thompson.Thompson'>, 'params': {'lower': 0, 'amplitude': 1}}, {'archtype': <class 'Policies.klUCB.klUCB'>, 'params': {'lower': 0, 'amplitude': 1, 'klucb': CPUDispatcher(<function klucbBern>)}}, {'archtype': <class 'Policies.klUCB.klUCB'>, 'params': {'lower': 0, 'amplitude': 1, 'klucb': CPUDispatcher(<function klucbExp>)}}, {'archtype': <class 'Policies.klUCB.klUCB'>, 'params': {'lower': 0, 'amplitude': 1, 'klucb': <function klucbGauss>}}, {'archtype': <class 'Policies.BayesUCB.BayesUCB'>, 'params': {'lower': 0, 'amplitude': 1}}], 'unbiased': True, 'update_all_children': False, 'decreaseRate': 'auto', 
'update_like_exp4': True}}, {'archtype': <class 'Policies.LearnExp.LearnExp'>, 'params': {'children': [{'archtype': <class 'Policies.UCBalpha.UCBalpha'>, 'params': {'alpha': 1, 'lower': 0, 'amplitude': 1}}, {'archtype': <class 'Policies.Thompson.Thompson'>, 'params': {'lower': 0, 'amplitude': 1}}, {'archtype': <class 'Policies.klUCB.klUCB'>, 'params': {'lower': 0, 'amplitude': 1, 'klucb': CPUDispatcher(<function klucbBern>)}}, {'archtype': <class 'Policies.klUCB.klUCB'>, 'params': {'lower': 0, 'amplitude': 1, 'klucb': CPUDispatcher(<function klucbExp>)}}, {'archtype': <class 'Policies.klUCB.klUCB'>, 'params': {'lower': 0, 'amplitude': 1, 'klucb': <function klucbGauss>}}, {'archtype': <class 'Policies.BayesUCB.BayesUCB'>, 'params': {'lower': 0, 'amplitude': 1}}], 'unbiased': True, 'eta': 0.9}}, {'archtype': <class 'Policies.UCBalpha.UCBalpha'>, 'params': {'alpha': 1, 'lower': 0, 'amplitude': 1}}, {'archtype': <class 'Policies.Thompson.Thompson'>, 'params': {'lower': 0, 'amplitude': 1}}, {'archtype': <class 'Policies.klUCB.klUCB'>, 'params': {'lower': 0, 'amplitude': 1, 'klucb': CPUDispatcher(<function klucbBern>)}}, {'archtype': <class 'Policies.klUCB.klUCB'>, 'params': {'lower': 0, 'amplitude': 1, 'klucb': CPUDispatcher(<function klucbExp>)}}, {'archtype': <class 'Policies.klUCB.klUCB'>, 'params': {'lower': 0, 'amplitude': 1, 'klucb': <function klucbGauss>}}, {'archtype': <class 'Policies.BayesUCB.BayesUCB'>, 'params': {'lower': 0, 'amplitude': 1}}], 'random_invert': False, 'random_shuffle': False, 'repetitions': 4, 'verbosity': 6}¶: This dictionary configures the experiments
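The shape of this dictionary is easier to see when it is rebuilt piece by piece. The sketch below mirrors the structure shown above, with plain strings standing in for the policy classes and the klucb functions (an illustrative substitution), and with the arm means rounded to one decimal. The helper `aggregator` is a hypothetical name.

```python
LOWER, AMPLITUDE = 0, 1
bounds = {"lower": LOWER, "amplitude": AMPLITUDE}

# Strings stand in for the real classes and functions (illustration only).
children = [
    {"archtype": "UCBalpha", "params": {"alpha": 1, **bounds}},
    {"archtype": "Thompson", "params": dict(bounds)},
    {"archtype": "klUCB", "params": {**bounds, "klucb": "klucbBern"}},
    {"archtype": "klUCB", "params": {**bounds, "klucb": "klucbExp"}},
    {"archtype": "klUCB", "params": {**bounds, "klucb": "klucbGauss"}},
    {"archtype": "BayesUCB", "params": dict(bounds)},
]


def aggregator(update_like_exp4):
    """Hypothetical helper: one Aggregator entry over all the children."""
    return {"archtype": "Aggregator", "params": {
        "children": children, "unbiased": True, "update_all_children": False,
        "decreaseRate": "auto", "update_like_exp4": update_like_exp4}}


learn_exp = {"archtype": "LearnExp",
             "params": {"children": children, "unbiased": True, "eta": 0.9}}

configuration = {
    "horizon": 10000, "repetitions": 4, "n_jobs": -1, "verbosity": 6,
    "cache_rewards": False, "random_shuffle": False, "random_invert": False,
    "nb_random_events": 5,
    "environment": [{"arm_type": "Bernoulli",
                     "params": [round(0.1 * k, 1) for k in range(1, 10)]}],
    # Two Aggregator variants, LearnExp, then each child on its own.
    "policies": [aggregator(False), aggregator(True), learn_exp] + children,
}

print(len(configuration["policies"]))
```

Read this way, the experiment compares three aggregation policies against each of the six policies they aggregate, on one Bernoulli problem with nine arms.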

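configuration_comparing_aggregation_algorithms.LOWER = 0¶: LOWER : lower bound on the rewards, passed as 'lower' to the policies in configuration.

configuration_comparing_aggregation_algorithms.AMPLITUDE = 1¶: AMPLITUDE : amplitude of the rewards, passed as 'amplitude' to the policies in configuration.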

configuration_comparing_aggregation_algorithms.klucbGauss(x, d, precision=0.0)¶: Same as klucbGauss(x, d, sig2x), with the variance sig2x fixed to the right value (= 0.05).

configuration_comparing_aggregation_algorithms.klucbGamma(x, d, precision=0.0)¶: Same as klucbGamma(x, d, scale), with the scale fixed to the right value (= 1).
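To illustrate the idea of fixing a parameter in such a wrapper, here is a sketch for the Gaussian case, where the KL-UCB index has a closed form: the largest q with (q - x)^2 / (2 * sig2x) <= d is x + sqrt(2 * sig2x * d). The function names below are hypothetical and this is not claimed to be the module's code; the Gamma case has no such closed form and is not sketched.

```python
import math

VARIANCE = 0.05


def gaussian_kl_ucb(x, d, sig2x):
    """Largest q such that (q - x)^2 / (2 * sig2x) <= d (closed form)."""
    return x + math.sqrt(2.0 * sig2x * d)


def gaussian_kl_ucb_fixed_variance(x, d, precision=0.0):
    """Hypothetical wrapper: same index with sig2x fixed to VARIANCE."""
    # precision is kept only to match the (x, d, precision) signature;
    # the closed form does not need it.
    return gaussian_kl_ucb(x, d, VARIANCE)


q = gaussian_kl_ucb_fixed_variance(0.5, 0.1)
print(q)
```

Fixing the variance this way lets the wrapper be passed wherever a function of (x, d, precision) is expected, such as the 'klucb' entry in the policy parameters.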