Environment.EvaluatorSparseMultiPlayers module¶
EvaluatorSparseMultiPlayers class to wrap and run the simulations, for the multi-players case with sparsely activated players. It provides many plotting methods for various visualizations. See the documentation.
Warning
FIXME this environment is not as up-to-date as Environment.EvaluatorMultiPlayers.
-
Environment.EvaluatorSparseMultiPlayers.uniform_in_zero_one()¶ Return a random float x in the interval [0, 1).
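Example (a quick check of the documented range; nothing beyond the docstring above is assumed):
>>> x = uniform_in_zero_one()
>>> 0.0 <= x < 1.0
True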
-
Environment.EvaluatorSparseMultiPlayers.REPETITIONS= 1¶ Default number of repetitions.
-
Environment.EvaluatorSparseMultiPlayers.ACTIVATION= 1¶ Default probability of activation
-
Environment.EvaluatorSparseMultiPlayers.DELTA_T_PLOT= 50¶ Default sampling rate for plotting
-
Environment.EvaluatorSparseMultiPlayers.MORE_ACCURATE= True¶ Use the count of selections instead of rewards for a more accurate mean/std reward measure.
-
Environment.EvaluatorSparseMultiPlayers.FINAL_RANKS_ON_AVERAGE= True¶ Default value for finalRanksOnAverage.
-
Environment.EvaluatorSparseMultiPlayers.USE_JOBLIB_FOR_POLICIES= False¶ Default value for useJoblibForPolicies. Using it does not speed things up (too much overhead from using too many threads), so it should stay disabled.
-
Environment.EvaluatorSparseMultiPlayers.PICKLE_IT= True¶ Default value for pickleit, for saving the figures. If True, all plt.figure objects are saved (in pickle format).
-
class
Environment.EvaluatorSparseMultiPlayers.EvaluatorSparseMultiPlayers(configuration, moreAccurate=True)[source]¶ Bases:
Environment.EvaluatorMultiPlayers.EvaluatorMultiPlayers
Evaluator class to run the simulations, for the multi-players case with sparsely activated players.
-
__init__(configuration, moreAccurate=True)[source]¶ Initialize self. See help(type(self)) for accurate signature.
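A minimal instantiation sketch. Every configuration key below is an assumption, inferred from the defaults and attributes documented on this page and from the usual SMPyBandits configuration files, so adapt it to your own setup:
# All key names below are assumptions, not the documented API of this class.
from Arms import Bernoulli  # assumed import path for the arm classes
from Environment.EvaluatorSparseMultiPlayers import EvaluatorSparseMultiPlayers

configuration = {
    "horizon": 1000,     # assumed key: time horizon of each simulation
    "repetitions": 4,    # assumed key: overrides the REPETITIONS default
    "activations": 0.5,  # assumed key: probability of activation (cf. ACTIVATION)
    "environment": [{    # assumed key: list of problems to evaluate
        "arm_type": Bernoulli, "params": [0.1, 0.5, 0.9],
    }],
    # "successive_players", "collisionModel", etc. should also be provided.
}
evaluator = EvaluatorSparseMultiPlayers(configuration, moreAccurate=True)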
-
activations= None¶ Probabilities of activation of the players.
-
collisionModel= None¶ Which collision model should be used
-
full_lost_if_collision= None¶ Whether a collision implies a full loss of reward; used to compute the correct decomposition of the regret.
-
getCentralizedRegret_LessAccurate(envId=0)[source]¶ Compute the empirical centralized regret: the cumulative sum over time of the mean rewards of the M best arms, minus the cumulative sum over time of the empirical rewards obtained by the players (based on accumulated rewards).
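In symbols, a sketch directly formalizing this docstring (for sparsely activated players the first sum is the full-activation benchmark, so read it as an approximation), with \(\mu_k\) the mean of arm k and \(r_j(s)\) the reward obtained by player j at time s:
\[ \widehat{R}_t = t \sum_{k \in \text{M-best}} \mu_k - \sum_{s=1}^{t} \sum_{j=1}^{M} r_j(s) \]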
-
getFirstRegretTerm(envId=0)[source]¶ Extract and compute the first term \((a)\) in the centralized regret: losses due to pulling suboptimal arms.
-
getSecondRegretTerm(envId=0)[source]¶ Extract and compute the second term \((b)\) in the centralized regret: losses due to not pulling optimal arms.
-
getThirdRegretTerm(envId=0)[source]¶ Extract and compute the third term \((c)\) in the centralized regret: losses due to collisions.
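For reference, in the standard multi-player setting where every player is activated at every step, these three terms correspond to the decomposition below (cf. Besson & Kaufmann, "Multi-Player Bandits Revisited"); for sparsely activated players the centralized benchmark changes, so this formula is indicative only. Here \(\mu_k\) is the mean of arm k, \(\mu_M^*\) the M-th largest mean, \(T_k(T)\) the total number of selections of arm k by all players, and \(\mathcal{C}_k(T)\) the number of colliding selections on arm k:
\[ R_T = \underbrace{\sum_{k : \mu_k < \mu_M^*} (\mu_M^* - \mu_k)\, \mathbb{E}[T_k(T)]}_{(a)} + \underbrace{\sum_{k : \mu_k \geq \mu_M^*} (\mu_k - \mu_M^*)\, \big(T - \mathbb{E}[T_k(T)]\big)}_{(b)} + \underbrace{\sum_{k=1}^{K} \mu_k\, \mathbb{E}[\mathcal{C}_k(T)]}_{(c)} \]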
-
getCentralizedRegret_MoreAccurate(envId=0)[source]¶ Compute the empirical centralized regret, based on counts of selections and not actual rewards.
-
getCentralizedRegret(envId=0, moreAccurate=None)[source]¶ Use either the more accurate or the less accurate regret estimate.
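A quick comparison sketch, assuming the simulations for environment 0 have already been run on evaluator:
r_more = evaluator.getCentralizedRegret(envId=0, moreAccurate=True)   # from counts of selections
r_less = evaluator.getCentralizedRegret(envId=0, moreAccurate=False)  # from accumulated rewards
# Both are regret curves over time; the count-based estimate is usually less noisy.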
-
getLastRegrets_MoreAccurate(envId=0)[source]¶ Extract last regrets, based on counts of selections and not actual rewards.
-
getLastRegrets(envId=0, moreAccurate=None)[source]¶ Use either the more accurate or the less accurate regret estimate.
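For instance, to summarize the spread of the final regrets over the repetitions (a sketch; the assumption here is that one value is returned per repetition):
import numpy as np
last_regrets = evaluator.getLastRegrets(envId=0)  # assumed shape: (repetitions,)
print("mean = {:.3g}, std = {:.3g}".format(np.mean(last_regrets), np.std(last_regrets)))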
-
strPlayers(short=False, latex=True)[source]¶ Get a string describing the players and their activation probabilities for this environment.
-
__module__= 'Environment.EvaluatorSparseMultiPlayers'¶
-
Environment.EvaluatorSparseMultiPlayers.delayed_play(env, players, horizon, collisionModel, activations, seed=None, repeatId=0)[source]¶ Helper function for the parallelization.
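A hedged sketch of how such a helper is typically dispatched with joblib (the exact call made inside the evaluator may differ; env, players, horizon, collisionModel and activations are taken from the configuration):
from joblib import Parallel, delayed
results = Parallel(n_jobs=-1, verbose=5)(
    delayed(delayed_play)(env, players, horizon, collisionModel, activations,
                          seed=None, repeatId=repeatId)
    for repeatId in range(REPETITIONS)
)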
-
Environment.EvaluatorSparseMultiPlayers.with_proba(proba)[source]¶ True with probability = proba, False with probability = 1 - proba.
Examples:
>>> import random; random.seed(0)
>>> tosses = [with_proba(0.6) for _ in range(10000)]; sum(tosses)
5977
>>> tosses = [with_proba(0.111) for _ in range(100000)]; sum(tosses)
11158