# Environment.EvaluatorSparseMultiPlayers module

EvaluatorSparseMultiPlayers class to wrap and run the simulations, for the multi-player case with sparsely activated players. It includes many plotting methods for various visualizations. See documentation.

Warning

FIXME this environment is not as up-to-date as Environment.EvaluatorMultiPlayers.

Environment.EvaluatorSparseMultiPlayers.uniform_in_zero_one()

random() -> x in the interval [0, 1).

Environment.EvaluatorSparseMultiPlayers.REPETITIONS = 1

Default number of repetitions

Environment.EvaluatorSparseMultiPlayers.ACTIVATION = 1

Default probability of activation

Environment.EvaluatorSparseMultiPlayers.DELTA_T_PLOT = 50

Default sampling rate for plotting

Environment.EvaluatorSparseMultiPlayers.MORE_ACCURATE = True

Use the count of selections instead of rewards for a more accurate mean/std reward measure.

Environment.EvaluatorSparseMultiPlayers.FINAL_RANKS_ON_AVERAGE = True

Default value for finalRanksOnAverage

Environment.EvaluatorSparseMultiPlayers.USE_JOBLIB_FOR_POLICIES = False

Default value for useJoblibForPolicies. Using it does not speed up the simulations (too much overhead from using too many threads), so it should really be disabled.

Environment.EvaluatorSparseMultiPlayers.PICKLE_IT = True

Default value for pickleit for saving the figures. If True, then all plt.figure objects are saved (in pickle format).

class Environment.EvaluatorSparseMultiPlayers.EvaluatorSparseMultiPlayers(configuration, moreAccurate=True)

Evaluator class to run the simulations, for the multi-players case.

__init__(configuration, moreAccurate=True)

Initialize self. See help(type(self)) for accurate signature.

activations = None

Activation probabilities of the players

collisionModel = None

Which collision model should be used

full_lost_if_collision = None

Whether all rewards are lost when a collision occurs. Used to compute the correct decomposition of the regret.

startOneEnv(envId, env)

Simulate that env.

getCentralizedRegret_LessAccurate(envId=0)

Compute the empirical centralized regret: the cumulative sum over time of the mean rewards of the M best arms, minus the cumulative sum over time of the empirical rewards obtained by the players, based on accumulated rewards.
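The computation described above can be sketched in a few lines. This is a minimal illustration using only the standard library, not the module's actual code: the function name `centralized_regret_from_rewards` and its arguments are hypothetical, and `rewards_per_step` is assumed to hold, for each time step, the total reward collected by all players.

```python
from itertools import accumulate

def centralized_regret_from_rewards(means, nb_players, rewards_per_step):
    """Regret at each step t: t * (sum of the M best means) - rewards cumulated up to t.

    Hypothetical helper, written only to illustrate the formula.
    """
    # Sum of the means of the M best arms (M = nb_players).
    best_sum = sum(sorted(means, reverse=True)[:nb_players])
    # Running total of the rewards obtained by all players.
    cumulated_rewards = accumulate(rewards_per_step)
    return [best_sum * (t + 1) - r for t, r in enumerate(cumulated_rewards)]

means = [0.1, 0.5, 0.9]
rewards = [1.0, 2.0, 1.0, 0.0]  # made-up total rewards, one value per step
regret = centralized_regret_from_rewards(means, 2, rewards)
```

Because this version relies on the rewards actually drawn, the resulting curve is noisy and can even decrease from one step to the next.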

getFirstRegretTerm(envId=0)

Extract and compute the first term $$(a)$$ in the centralized regret: losses due to pulling suboptimal arms.

getSecondRegretTerm(envId=0)

Extract and compute the second term $$(b)$$ in the centralized regret: losses due to not pulling optimal arms.

getThirdRegretTerm(envId=0)

Extract and compute the third term $$(c)$$ in the centralized regret: losses due to collisions.
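The three terms (a), (b) and (c) can be illustrated with the usual decomposition of the centralized regret from the multi-player bandit literature. The sketch below is an assumption about the formula, not a copy of this module's code: the function `regret_decomposition` and its arguments are hypothetical, `selections[k]` is assumed to count every selection of arm k by any player, and `collisions[k]` the number of colliding players on arm k.

```python
def regret_decomposition(means, nb_players, horizon, selections, collisions):
    """Return the terms (a), (b), (c) of the centralized regret (hypothetical sketch)."""
    # Arms sorted by decreasing mean: the first M are the "best" arms.
    order = sorted(range(len(means)), key=lambda k: means[k], reverse=True)
    best, worst = order[:nb_players], order[nb_players:]
    mu_M = means[best[-1]]  # mean of the M-th best arm

    # (a) losses due to pulling suboptimal arms
    term_a = sum((mu_M - means[k]) * selections[k] for k in worst)
    # (b) losses due to not pulling optimal arms
    term_b = sum((means[k] - mu_M) * (horizon - selections[k]) for k in best)
    # (c) losses due to collisions
    term_c = sum(means[k] * collisions[k] for k in range(len(means)))
    return term_a, term_b, term_c

# Made-up counts for 2 players over 10 steps.
a, b, c = regret_decomposition([0.1, 0.5, 0.9], 2, 10, [2, 8, 10], [0, 2, 2])
```

The sum of the three terms matches the regret exactly only when the selections add up to M times the horizon, that is, when every player plays at every step. With sparse activations this need not hold, so treat the sketch as a description of the non-sparse case.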

getCentralizedRegret_MoreAccurate(envId=0)

Compute the empirical centralized regret, based on counts of selections and not actual rewards.

getCentralizedRegret(envId=0, moreAccurate=None)

Return the centralized regret, computed with either the more accurate or the less accurate method.
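A plausible reading of the `moreAccurate=None` default is that `None` means "use the value given to the constructor". The sketch below illustrates that dispatch pattern under this assumption. The class `RegretSelectorSketch` and its private methods are hypothetical stand-ins, and the returned strings replace the real regret arrays.

```python
class RegretSelectorSketch:
    """Hypothetical stand-in for the evaluator, showing only the dispatch logic."""

    def __init__(self, moreAccurate=True):
        self.moreAccurate = moreAccurate  # instance-wide default

    def _more_accurate(self):
        return "counts"   # placeholder for the regret based on counts of selections

    def _less_accurate(self):
        return "rewards"  # placeholder for the regret based on accumulated rewards

    def regret(self, moreAccurate=None):
        # Assumption: None falls back to the default chosen at construction.
        use_more_accurate = self.moreAccurate if moreAccurate is None else moreAccurate
        return self._more_accurate() if use_more_accurate else self._less_accurate()

default_choice = RegretSelectorSketch().regret()
forced_choice = RegretSelectorSketch().regret(moreAccurate=False)
```

Comparing against `None` rather than testing truthiness lets a caller pass `moreAccurate=False` explicitly and still override an instance default of `True`.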

getLastRegrets_LessAccurate(envId=0)

Extract last regrets, based on accumulated rewards.

getAllLastWeightedSelections(envId=0)

Extract weighted count of selections.

getLastRegrets_MoreAccurate(envId=0)

Extract last regrets, based on counts of selections and not actual rewards.

getLastRegrets(envId=0, moreAccurate=None)

Return the last regrets, computed with either the more accurate or the less accurate method.

strPlayers(short=False, latex=True)

Get a string of the players and their activation probabilities for this environment.

__module__ = 'Environment.EvaluatorSparseMultiPlayers'

Environment.EvaluatorSparseMultiPlayers.delayed_play(env, players, horizon, collisionModel, activations, seed=None, repeatId=0)

Helper function for the parallelization.
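To give an idea of what one repetition with sparsely activated players might look like, here is a toy simulation loop. It is a sketch under strong assumptions, not the body of `delayed_play`: the function `simulate_sparse_play` is hypothetical, players pick arms uniformly at random instead of following a learning policy, rewards are Bernoulli, and colliding players are assumed to receive no reward.

```python
import random

def simulate_sparse_play(means, activations, horizon, seed=None):
    """Toy repetition: each player is activated independently at each step."""
    rng = random.Random(seed)  # local generator, so runs are reproducible
    nb_arms = len(means)
    rewards = [0.0] * len(activations)  # accumulated reward of each player
    collisions = [0] * nb_arms          # number of colliding players per arm

    for _ in range(horizon):
        # Player j is active with probability activations[j];
        # an active player picks an arm uniformly at random (placeholder policy).
        choices = {j: rng.randrange(nb_arms)
                   for j, proba in enumerate(activations)
                   if rng.random() < proba}
        for j, arm in choices.items():
            nb_on_arm = sum(1 for other in choices.values() if other == arm)
            if nb_on_arm > 1:
                collisions[arm] += 1   # assumed model: no reward on collision
            elif rng.random() < means[arm]:
                rewards[j] += 1.0      # Bernoulli reward of mean means[arm]
    return rewards, collisions

rewards, collisions = simulate_sparse_play([0.2, 0.8], [0.5, 0.5], horizon=100, seed=0)
```

Passing a `seed` per repetition, as the real signature does, keeps parallel runs independent and reproducible.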

Environment.EvaluatorSparseMultiPlayers.with_proba(proba)

True with probability = proba, False with probability = 1 - proba.

Examples:

>>> import random; random.seed(0)
>>> tosses = [with_proba(0.6) for _ in range(10000)]; sum(tosses)
5977
>>> tosses = [with_proba(0.111) for _ in range(100000)]; sum(tosses)
11158
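A function with this behaviour can be written by comparing a uniform draw in [0, 1) with the requested probability. The sketch below is one possible implementation, not necessarily the one used by this module. The name `with_proba_sketch` and the `ValueError` check are assumptions added for illustration.

```python
import random

def with_proba_sketch(proba):
    """Return True with probability proba, False with probability 1 - proba."""
    if not 0.0 <= proba <= 1.0:
        raise ValueError("proba must be in [0, 1]")
    # random.random() is uniform in [0, 1), so the test succeeds
    # with probability proba (never for 0.0, always for 1.0).
    return random.random() < proba

random.seed(0)
frequency = sum(with_proba_sketch(0.6) for _ in range(10000)) / 10000
```

Over many tosses the empirical frequency should be close to `proba`, as in the examples above.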