Arms package
Arms: contains different types of bandit arms:
Constant, UniformArm, Bernoulli, Binomial, Poisson, Gaussian, Exponential, Gamma, DiscreteArm.
Each arm class follows the same interface:
> my_arm = Arm(params)
> my_arm.mean
0.5
> my_arm.draw() # one random draw
0.0
> my_arm.draw_nparray(20) # or ((3, 10)), many draws
array([ 0., 1., 0., 0., 0., 0., 0., 1., 1., 0., 1., 0., 0.,
1., 0., 0., 0., 1., 1., 1.])
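The shared interface above can be illustrated with a minimal sketch. The class name `BernoulliArmSketch` and its `seed` argument are hypothetical and are not part of the package; only the attribute `mean` and the methods `draw()` and `draw_nparray()` come from the interface described above.

```python
import numpy as np


class BernoulliArmSketch:
    """Hypothetical stand-in for an arm class; not the package's own code."""

    def __init__(self, probability, seed=None):
        self.probability = probability
        self.mean = probability  # the mean of a Bernoulli(p) reward is p
        self._rng = np.random.default_rng(seed)  # assumption: a private generator

    def draw(self):
        """One random draw, returned as 0.0 or 1.0."""
        return float(self._rng.random() < self.probability)

    def draw_nparray(self, shape=(1,)):
        """Many draws at once, as an array of the requested shape."""
        return (self._rng.random(shape) < self.probability).astype(float)


my_arm = BernoulliArmSketch(0.5, seed=0)
single = my_arm.draw()
many = my_arm.draw_nparray(20)
grid = my_arm.draw_nparray((3, 10))
print(my_arm.mean, single, many.shape, grid.shape)
```

Keeping the same three members on every arm class is what lets a bandit environment treat all reward distributions interchangeably.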
Also contains:
- uniformMeans(), to generate uniformly spaced means of arms.
- uniformMeansWithSparsity(), to generate uniformly spaced means of arms, with sparsity constraints.
- randomMeans(), to generate randomly spaced means of arms.
- randomMeansWithGapBetweenMbestMworst(), to generate randomly spaced means of arms, with a constraint on the gap between the M-best arms and the (K-M)-worst arms.
- randomMeansWithSparsity(), to generate randomly spaced means of arms with sparsity constraint.
- shuffled(), to return a shuffled version of a list.
Utility functions:
- array_from_str(), list_from_str() and tuple_from_str(), to obtain a numpy.ndarray, a list or a tuple from a string (used for the CLI env variables interface).
- geometricChangePoints(), to obtain randomly spaced change points.
- continuouslyVaryingMeans() and randomContinuouslyVaryingMeans(), to get new random means for continuously varying non-stationary MAB problems.
-
Arms.shuffled(mylist)
Returns a shuffled version of the input 1D list. Python provides sorted() as a non-mutating alternative to list.sort(), but it has no shuffled() alternative to random.shuffle().
>>> from random import seed; seed(1234) # reproducible results
>>> mylist = [ 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
>>> shuffled(mylist)
[0.9, 0.4, 0.3, 0.6, 0.5, 0.7, 0.1, 0.2, 0.8]
>>> shuffled(mylist)
[0.4, 0.3, 0.7, 0.5, 0.8, 0.1, 0.9, 0.6, 0.2]
>>> shuffled(mylist)
[0.4, 0.6, 0.9, 0.5, 0.7, 0.2, 0.1, 0.3, 0.8]
>>> shuffled(mylist)
[0.8, 0.7, 0.3, 0.1, 0.9, 0.5, 0.6, 0.2, 0.4]
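The copy-then-shuffle idea behind shuffled() can be sketched as follows. The name `shuffled_sketch` is hypothetical, and this is only one plausible way to obtain the described behavior, not necessarily the package's implementation.

```python
import random


def shuffled_sketch(items):
    """Return a shuffled copy, leaving the input list untouched."""
    copy = list(items)    # copy first, because random.shuffle works in place
    random.shuffle(copy)
    return copy


random.seed(1234)  # reproducible results
original = [0.1, 0.2, 0.3, 0.4, 0.5]
result = shuffled_sketch(original)
print(result)    # some permutation of the input
print(original)  # unchanged: [0.1, 0.2, 0.3, 0.4, 0.5]
```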
-
Arms.uniformMeans(nbArms=3, delta=0.05, lower=0.0, amplitude=1.0, isSorted=True)
Return a list of means of arms, well spaced:
- in [lower, lower + amplitude],
- sorted in increasing order,
- starting from lower + amplitude * delta, up to lower + amplitude * (1 - delta),
- and there are nbArms arms.
>>> np.array(uniformMeans(2, 0.1))
array([0.1, 0.9])
>>> np.array(uniformMeans(3, 0.1))
array([0.1, 0.5, 0.9])
>>> np.array(uniformMeans(9, 1 / (1. + 9)))
array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
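The spacing rule listed above amounts to an evenly spaced grid between lower + amplitude * delta and lower + amplitude * (1 - delta). A minimal sketch, assuming at least two arms and ignoring isSorted (the name `uniform_means_sketch` is hypothetical):

```python
def uniform_means_sketch(nb_arms=3, delta=0.05, lower=0.0, amplitude=1.0):
    """Evenly spaced means from lower + amplitude*delta to lower + amplitude*(1 - delta)."""
    if nb_arms < 2:
        # Assumption of this sketch only: the single-arm case is not handled.
        raise ValueError("this sketch needs at least two arms")
    first = lower + amplitude * delta
    last = lower + amplitude * (1 - delta)
    step = (last - first) / (nb_arms - 1)
    return [first + i * step for i in range(nb_arms)]


# Rounded for display, since floating-point steps are not exact.
print([round(m, 3) for m in uniform_means_sketch(3, 0.1)])  # [0.1, 0.5, 0.9]
```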
-
Arms.uniformMeansWithSparsity(nbArms=10, sparsity=3, delta=0.05, lower=0.0, lowerNonZero=0.5, amplitude=1.0, isSorted=True)
Return a list of means of arms, well spaced, in [lower, lower + amplitude].
Exactly nbArms - sparsity arms will have a mean equal to lower, and the others are uniformly spaced in [lowerNonZero, lower + amplitude]. These other means are all different.
>>> import numpy as np; np.random.seed(1234) # reproducible results
>>> np.array(uniformMeansWithSparsity(nbArms=6, sparsity=2))
array([ 0. , 0. , 0. , 0. , 0.55, 0.95])
>>> np.array(uniformMeansWithSparsity(nbArms=6, sparsity=2, lowerNonZero=0.8, delta=0.03))
array([ 0. , 0. , 0. , 0. , 0.806, 0.994])
>>> np.array(uniformMeansWithSparsity(nbArms=10, sparsity=2))
array([ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.55, 0.95])
>>> np.array(uniformMeansWithSparsity(nbArms=6, sparsity=2, delta=0.05))
array([ 0. , 0. , 0. , 0. , 0.525, 0.975])
>>> np.array(uniformMeansWithSparsity(nbArms=10, sparsity=4, delta=0.05))
array([ 0. , 0. , 0. , 0. , 0. , 0. , 0.525, 0.675, 0.825, 0.975])
-
Arms.randomMeans(nbArms=3, mingap=None, lower=0.0, amplitude=1.0, isSorted=True)
Return a list of means of arms, randomly sampled uniformly in [lower, lower + amplitude], with a min gap >= mingap.
All means will be different, with a min gap > 0, except if mingap=None.
>>> import numpy as np; np.random.seed(1234) # reproducible results
>>> randomMeans(nbArms=3, mingap=0.05)
[0.191..., 0.437..., 0.622...]
>>> randomMeans(nbArms=3, mingap=0.01)
[0.276..., 0.801..., 0.958...]
Means are sorted, except if isSorted=False.
>>> import random; random.seed(1234) # reproducible results
>>> randomMeans(nbArms=5, mingap=0.01, isSorted=True)
[0.006..., 0.229..., 0.416..., 0.535..., 0.899...]
>>> randomMeans(nbArms=5, mingap=0.01, isSorted=False)
[0.419..., 0.932..., 0.072..., 0.755..., 0.650...]
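One simple way to enforce a minimum gap is rejection sampling: draw the means, check the smallest gap, and retry if it is too small. The sketch below assumes that strategy; the source does not say how randomMeans() enforces the gap, and the names `random_means_sketch` and `max_tries` are hypothetical.

```python
import random


def random_means_sketch(nb_arms=3, mingap=None, lower=0.0, amplitude=1.0,
                        is_sorted=True, max_tries=100000):
    """Sample nb_arms means uniformly, retrying until every gap is >= mingap."""
    for _attempt in range(max_tries):
        means = sorted(random.uniform(lower, lower + amplitude)
                       for _ in range(nb_arms))
        gaps = [b - a for a, b in zip(means, means[1:])]
        if mingap is None or not gaps or min(gaps) >= mingap:
            if not is_sorted:
                random.shuffle(means)  # same values, arbitrary order
            return means
    raise RuntimeError("could not satisfy mingap; try a smaller value")


random.seed(1234)  # reproducible results
means = random_means_sketch(nb_arms=5, mingap=0.05)
print(means)
```

Rejection sampling keeps the accepted means uniformly distributed over the valid configurations, at the cost of more retries as mingap grows.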
-
Arms.randomMeansWithGapBetweenMbestMworst(nbArms=3, mingap=None, nbPlayers=2, lower=0.0, amplitude=1.0, isSorted=True)
Return a list of means of arms, randomly sampled uniformly in [lower, lower + amplitude], with a min gap >= mingap between the set Mbest (the M best arms) and the set Mworst (the K-M worst arms), where M = nbPlayers and K = nbArms.
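The gap between the two sets is the difference between the worst of the M best means and the best of the remaining ones. A minimal sketch of that quantity, combined with a retry loop, is given below; the function names are hypothetical, the retry strategy is an assumption, and the sketch requires nb_players < nb_arms.

```python
import random


def gap_mbest_mworst(means, nb_players):
    """Gap between the worst of the M best arms and the best of the other arms."""
    ordered = sorted(means, reverse=True)
    return ordered[nb_players - 1] - ordered[nb_players]


def random_means_with_gap_sketch(nb_arms=3, mingap=None, nb_players=2,
                                 lower=0.0, amplitude=1.0, max_tries=100000):
    """Resample uniformly until the Mbest/Mworst gap is at least mingap."""
    for _attempt in range(max_tries):
        means = sorted(random.uniform(lower, lower + amplitude)
                       for _ in range(nb_arms))
        if mingap is None or gap_mbest_mworst(means, nb_players) >= mingap:
            return means
    raise RuntimeError("could not satisfy mingap; try a smaller value")


random.seed(1234)  # reproducible results
means = random_means_with_gap_sketch(nb_arms=4, mingap=0.2, nb_players=2)
print(means, gap_mbest_mworst(means, 2))
```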
-
Arms.randomMeansWithSparsity(nbArms=10, sparsity=3, mingap=0.01, delta=0.05, lower=0.0, lowerNonZero=0.5, amplitude=1.0, isSorted=True)
Return a list of means of arms, in [lower, lower + amplitude], with a min gap >= mingap.
Exactly nbArms - sparsity arms will have a mean equal to lower, and the others are randomly sampled uniformly in [lowerNonZero, lower + amplitude].
The sampled means will all be different, with a min gap > 0, except if mingap=None.
>>> import numpy as np; np.random.seed(1234) # reproducible results
>>> randomMeansWithSparsity(nbArms=6, sparsity=2, mingap=0.05)
[0.0, 0.0, 0.0, 0.0, 0.595..., 0.811...]
>>> randomMeansWithSparsity(nbArms=6, sparsity=2, mingap=0.01)
[0.0, 0.0, 0.0, 0.0, 0.718..., 0.892...]
Means are sorted, except if isSorted=False.
>>> import random; random.seed(1234) # reproducible results
>>> randomMeansWithSparsity(nbArms=6, sparsity=2, mingap=0.01, isSorted=True)
[0.0, 0.0, 0.0, 0.0, 0.636..., 0.889...]
>>> randomMeansWithSparsity(nbArms=6, sparsity=2, mingap=0.01, isSorted=False)
[0.0, 0.0, 0.900..., 0.638..., 0.0, 0.0]
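The sparsity constraint can be sketched as follows: nbArms - sparsity means are pinned to lower, and the remaining ones are drawn from the upper interval. This simplified sketch deliberately ignores mingap, delta and isSorted, and the name `random_sparse_means_sketch` is hypothetical.

```python
import random


def random_sparse_means_sketch(nb_arms=10, sparsity=3, lower=0.0,
                               lower_non_zero=0.5, amplitude=1.0):
    """nb_arms - sparsity means equal to lower, the rest sampled uniformly above."""
    upper = lower + amplitude
    sampled = sorted(random.uniform(lower_non_zero, upper)
                     for _ in range(sparsity))
    return [lower] * (nb_arms - sparsity) + sampled


random.seed(1234)  # reproducible results
means = random_sparse_means_sketch(nb_arms=6, sparsity=2)
print(means)  # four means equal to 0.0, then two sampled means in [0.5, 1.0]
```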
-
Arms.randomMeansWithSparsity2(nbArms=10, sparsity=3, mingap=0.01, lower=-1.0, lowerNonZero=0.0, amplitude=2.0, isSorted=True)
Return a list of means of arms, in [lower, lower + amplitude], with a min gap >= mingap.
Exactly nbArms - sparsity arms will have a mean sampled uniformly in [lower, lowerNonZero], and the others are randomly sampled uniformly in [lowerNonZero, lower + amplitude].
All means will be different, with a min gap > 0, except if mingap=None.
>>> import numpy as np; np.random.seed(1234) # reproducible results
>>> randomMeansWithSparsity2(nbArms=6, sparsity=2, mingap=0.05)
[0.0, 0.0, 0.0, 0.0, 0.595..., 0.811...]
>>> randomMeansWithSparsity2(nbArms=6, sparsity=2, mingap=0.01)
[0.0, 0.0, 0.0, 0.0, 0.718..., 0.892...]
Means are sorted, except if isSorted=False.
>>> import random; random.seed(1234) # reproducible results
>>> randomMeansWithSparsity2(nbArms=6, sparsity=2, mingap=0.01, isSorted=True)
[0.0, 0.0, 0.0, 0.0, 0.636..., 0.889...]
>>> randomMeansWithSparsity2(nbArms=6, sparsity=2, mingap=0.01, isSorted=False)
[0.0, 0.0, 0.900..., 0.638..., 0.0, 0.0]
-
Arms.array_from_str(my_str)
Convert a string like "[0.1, 0.2, 0.3]" to a numpy array [0.1, 0.2, 0.3], using safe json.loads instead of exec.
>>> array_from_str("[0.1, 0.2, 0.3]")
array([0.1, 0.2, 0.3])
>>> array_from_str("0.1, 0.2, 0.3")
array([0.1, 0.2, 0.3])
>>> array_from_str("0.9")
array([0.9])
-
Arms.list_from_str(my_str)
Convert a string like "[0.1, 0.2, 0.3]" to a list [0.1, 0.2, 0.3], using safe json.loads instead of exec.
>>> list_from_str("[0.1, 0.2, 0.3]")
[0.1, 0.2, 0.3]
>>> list_from_str("0.1, 0.2, 0.3")
[0.1, 0.2, 0.3]
>>> list_from_str("0.9")
[0.9]
-
Arms.tuple_from_str(my_str)
Convert a string like "[0.1, 0.2, 0.3]" to a tuple (0.1, 0.2, 0.3), using safe json.loads instead of exec.
>>> tuple_from_str("[0.1, 0.2, 0.3]")
(0.1, 0.2, 0.3)
>>> tuple_from_str("0.1, 0.2, 0.3")
(0.1, 0.2, 0.3)
>>> tuple_from_str("0.9")
(0.9,)
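The json.loads approach mentioned above can be sketched as follows. Wrapping bare values in brackets is an assumption about how inputs such as "0.1, 0.2, 0.3" or "0.9" could be accepted, and the `*_sketch` names are hypothetical.

```python
import json


def list_from_str_sketch(my_str):
    """Parse '[0.1, 0.2]', '0.1, 0.2' or '0.9' into a list with json.loads."""
    text = my_str.strip()
    if not text.startswith("["):
        text = "[" + text + "]"  # wrap bare values so JSON sees an array
    return json.loads(text)


def tuple_from_str_sketch(my_str):
    """Same parsing, converted to a tuple."""
    return tuple(list_from_str_sketch(my_str))


print(list_from_str_sketch("[0.1, 0.2, 0.3]"))  # [0.1, 0.2, 0.3]
print(list_from_str_sketch("0.1, 0.2, 0.3"))    # [0.1, 0.2, 0.3]
print(tuple_from_str_sketch("0.9"))             # (0.9,)
```

Unlike exec or eval, json.loads only builds plain data (numbers, strings, lists, dicts), so a malformed or malicious string raises an error instead of running code.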
-
Arms.optimal_selection_probabilities(M, mu)
Compute the optimal selection probabilities of K arms of means \(\mu_i\) by \(1 \leq M \leq K\) players, if they all observe each other's pulls and rewards, as derived in equation (15), page 3 of [[The Effect of Communication on Noncooperative Multiplayer Multi-Armed Bandit Problems, by Noyan Evirgen, Alper Kose, IEEE ICMLA 2017]](https://arxiv.org/abs/1711.01628v1).
Warning
They consider a different collision model than I usually do. When two (or more) players ask for the same resource at the same time t, I usually consider that all the colliding players receive a zero reward (see Environment.CollisionModels.onlyUniqUserGetsReward()), but they consider that exactly one of the colliding players gets the reward, and all the others get a zero reward (see Environment.CollisionModels.rewardIsSharedUniformly()).
Example:
>>> optimal_selection_probabilities(3, [0.1,0.1,0.1])
array([0.33333333, 0.33333333, 0.33333333])
>>> optimal_selection_probabilities(3, [0.1,0.2,0.3]) # weird ? not really...
array([0. , 0.43055556, 0.56944444])
>>> optimal_selection_probabilities(3, [0.1,0.3,0.9]) # weird ? not really...
array([0. , 0.45061728, 0.54938272])
>>> optimal_selection_probabilities(3, [0.7,0.8,0.9])
array([0.15631866, 0.35405647, 0.48962487])
Note
These results may sound counter-intuitive, but again they use a different collision model: in my usual collision model, it makes no sense to completely drop an arm when K=M=3, no matter the means \(\mu_i\), but in their collision model, a player wins more (on average) if she has a \(50\%\) chance of being alone on an arm with mean \(0.3\) than if she is sure to be alone on an arm with mean \(0.1\) (see examples 2 and 3).
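The comparison in the note can be checked with a one-line computation of expected rewards. This only restates the arithmetic of the note; the symbols \(r_{0.3}\) and \(r_{0.1}\) are introduced here for illustration, and the inequality for \(r_{0.3}\) is a lower bound because, in their collision model, a colliding player may still be the one who receives the reward.

```latex
\[
\mathbb{E}[r_{0.3}] \;\geq\; 0.5 \times 0.3 = 0.15
\;>\;
0.1 = 1 \times 0.1 = \mathbb{E}[r_{0.1}]
\]
```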
-
Arms.geometricChangePoints(horizon=10000, proba=0.001)
Change points following a geometric distribution: at each time, the probability of having a change point at the next step is proba.
>>> np.random.seed(0)
>>> geometricChangePoints(100, 0.1)
array([ 8, 20, 29, 37, 43, 53, 59, 81])
>>> geometricChangePoints(100, 0.2)
array([ 6, 8, 14, 29, 31, 35, 40, 44, 46, 60, 63, 72, 78, 80, 88, 91])
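Drawing an independent coin flip of probability proba at every time step yields gaps between consecutive change points that follow a geometric distribution. A minimal sketch of that idea is below; whether time 0 can be a change point is an assumption (it is excluded here), the sketch returns a plain list rather than a numpy array, and the name `geometric_change_points_sketch` is hypothetical.

```python
import random


def geometric_change_points_sketch(horizon=10000, proba=0.001):
    """Each step 1..horizon-1 is a change point independently with probability proba."""
    return [t for t in range(1, horizon) if random.random() < proba]


random.seed(0)  # reproducible results
points = geometric_change_points_sketch(100, 0.1)
print(points)  # on average about 10 change points for horizon=100, proba=0.1
```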
-
Arms.continuouslyVaryingMeans(means, sign=1, maxSlowChange=0.1, horizon=None, lower=0.0, amplitude=1.0, isSorted=True)
New means, slightly modified from the previous ones.
The change and the sign of change are constant.
-
Arms.randomContinuouslyVaryingMeans(means, maxSlowChange=0.1, horizon=None, lower=0.0, amplitude=1.0, isSorted=True)
New means, slightly modified from the previous ones.
The amplitude c of the change is constant, but the change itself is randomly sampled in \(\mathcal{U}([-c,c])\).
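A rough sketch of one such random update is given below, under several assumptions that the source does not confirm: a single change is drawn from \(\mathcal{U}([-c,c])\) with c = maxSlowChange and applied to every arm, the result is clipped to [lower, lower + amplitude], and horizon and isSorted are ignored. The name `random_varying_means_sketch` is hypothetical.

```python
import random


def random_varying_means_sketch(means, max_slow_change=0.1,
                                lower=0.0, amplitude=1.0):
    """Shift every mean by one change drawn from U([-c, c]), then clip to the range."""
    change = random.uniform(-max_slow_change, max_slow_change)
    upper = lower + amplitude
    # Clipping is an assumption of this sketch, to keep means inside the interval.
    return [min(upper, max(lower, mean + change)) for mean in means]


random.seed(1234)  # reproducible results
old_means = [0.1, 0.5, 0.9]
new_means = random_varying_means_sketch(old_means, max_slow_change=0.1)
print(new_means)
```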
Submodules
- Arms.Arm module
- Arms.Bernoulli module
- Arms.Binomial module
- Arms.Constant module
- Arms.DiscreteArm module
- Arms.Exponential module
- Arms.Gamma module
- Arms.Gaussian module
- Arms.Poisson module
- Arms.RestedRottingArm module
- Arms.RestlessArm module
- Arms.UniformArm module
- Arms.kullback module
- Arms.usenumba module