Arms package

The Arms package contains several types of bandit arms: Constant, UniformArm, Bernoulli, Binomial, Poisson, Gaussian, Exponential, Gamma, DiscreteArm.

Each arm class follows the same interface:

>>> my_arm = Arm(params)
>>> my_arm.mean
0.5
>>> my_arm.draw()  # one random draw
0.0
>>> my_arm.draw_nparray(20)  # or ((3, 10)), many draws
array([ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  1.,  1.,  0.,  1.,  0.,  0.,
        1.,  0.,  0.,  0.,  1.,  1.,  1.])
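
As an illustration of this interface, here is a minimal sketch of a Bernoulli-like arm. The class name `SketchBernoulliArm`, its constructor arguments and its use of a NumPy generator are hypothetical and are not the package's own implementation; only the `mean`, `draw()` and `draw_nparray()` members mirror the interface shown above.

```python
import numpy as np


class SketchBernoulliArm:
    """Hypothetical arm with 0/1 rewards; illustrates the shared interface only."""

    def __init__(self, probability, seed=None):
        self.probability = probability
        self.mean = probability  # the mean of a Bernoulli(p) reward is p
        self._rng = np.random.default_rng(seed)

    def draw(self):
        """One random draw, returned as a float (0.0 or 1.0)."""
        return float(self._rng.random() < self.probability)

    def draw_nparray(self, shape):
        """Many draws at once; shape can be an int or a tuple such as (3, 10)."""
        return (self._rng.random(shape) < self.probability).astype(float)


my_arm = SketchBernoulliArm(0.5, seed=0)
print(my_arm.mean)                         # 0.5
print(my_arm.draw_nparray((3, 10)).shape)  # (3, 10)
```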

Also contains:

uniformMeans(), to generate uniformly spaced means of arms.
uniformMeansWithSparsity(), to generate uniformly spaced means of arms, with sparsity constraints.
randomMeans(), to generate randomly spaced means of arms.
randomMeansWithGapBetweenMbestMworst(), to generate randomly spaced means of arms, with a constraint on the gap between the M-best arms and the (K-M)-worst arms.
randomMeansWithSparsity(), to generate randomly spaced means of arms with sparsity constraint.
shuffled(), to return a shuffled version of a list.
Utility functions array_from_str(), list_from_str() and tuple_from_str(), to obtain a numpy.ndarray, a list or a tuple from a string (used for the CLI env variables interface).
optimal_selection_probabilities(), to compute the optimal selection probabilities of K arms by M players.
geometricChangePoints(), to obtain randomly spaced change points.
continuouslyVaryingMeans() and randomContinuouslyVaryingMeans(), to get new random means for continuously varying non-stationary MAB problems.

Arms.shuffled(mylist)

Returns a shuffled version of the input 1D list. Python provides sorted() as a non-mutating counterpart of list.sort(), but it has no shuffled() counterpart of random.shuffle().

>>> from random import seed; seed(1234)  # reproducible results
>>> mylist = [ 0.1,  0.2,  0.3,  0.4,  0.5,  0.6,  0.7,  0.8,  0.9]
>>> shuffled(mylist)
[0.9, 0.4, 0.3, 0.6, 0.5, 0.7, 0.1, 0.2, 0.8]
>>> shuffled(mylist)
[0.4, 0.3, 0.7, 0.5, 0.8, 0.1, 0.9, 0.6, 0.2]
>>> shuffled(mylist)
[0.4, 0.6, 0.9, 0.5, 0.7, 0.2, 0.1, 0.3, 0.8]
>>> shuffled(mylist)
[0.8, 0.7, 0.3, 0.1, 0.9, 0.5, 0.6, 0.2, 0.4]
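
A function with this behaviour can be sketched in a few lines: copy the list, then shuffle the copy in place. The name `shuffled_sketch` is hypothetical, and this is not necessarily how the package implements it, so it is not expected to reproduce the exact outputs above.

```python
import random


def shuffled_sketch(mylist):
    """Return a shuffled copy of mylist, leaving the input untouched."""
    copy = list(mylist)   # shallow copy, so the caller's list is not modified
    random.shuffle(copy)  # random.shuffle works in place and returns None
    return copy


random.seed(1234)
original = [0.1, 0.2, 0.3, 0.4, 0.5]
result = shuffled_sketch(original)
print(sorted(result) == original)  # True: same elements, possibly in a new order
```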

Arms.uniformMeans(nbArms=3, delta=0.05, lower=0.0, amplitude=1.0, isSorted=True)

Return a list of means of arms, well spaced:

in [lower, lower + amplitude],
sorted in increasing order,
starting from lower + amplitude * delta, up to lower + amplitude * (1 - delta),
with nbArms arms in total.

>>> np.array(uniformMeans(2, 0.1))
array([0.1, 0.9])
>>> np.array(uniformMeans(3, 0.1))
array([0.1, 0.5, 0.9])
>>> np.array(uniformMeans(9, 1 / (1. + 9)))
array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
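
The spacing rule described above matches what `numpy.linspace` produces between the two endpoints. The following is a minimal sketch under that assumption; `uniform_means_sketch` is a hypothetical name, and the `isSorted` option is left out.

```python
import numpy as np


def uniform_means_sketch(nbArms=3, delta=0.05, lower=0.0, amplitude=1.0):
    """Evenly spaced means from lower + amplitude*delta to lower + amplitude*(1 - delta)."""
    start = lower + amplitude * delta
    stop = lower + amplitude * (1 - delta)
    return list(np.linspace(start, stop, nbArms))


print(np.round(uniform_means_sketch(3, 0.1), 3))  # [0.1 0.5 0.9]
```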

Arms.uniformMeansWithSparsity(nbArms=10, sparsity=3, delta=0.05, lower=0.0, lowerNonZero=0.5, amplitude=1.0, isSorted=True)

Return a list of means of arms, well spaced, in [lower, lower + amplitude].

Exactly nbArms-sparsity arms will have a mean = lower, and the other sparsity arms have means uniformly spaced in [lowerNonZero, lower + amplitude].
All the means above lower are different.

>>> import numpy as np; np.random.seed(1234)  # reproducible results
>>> np.array(uniformMeansWithSparsity(nbArms=6, sparsity=2))  
array([ 0.  ,  0.  ,  0.  ,  0.  ,  0.55,  0.95])
>>> np.array(uniformMeansWithSparsity(nbArms=6, sparsity=2, lowerNonZero=0.8, delta=0.03))  
array([ 0.   ,  0.   ,  0.   ,  0.   ,  0.806,  0.994])
>>> np.array(uniformMeansWithSparsity(nbArms=10, sparsity=2))  
array([ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.55,  0.95])
>>> np.array(uniformMeansWithSparsity(nbArms=6, sparsity=2, delta=0.05))  
array([ 0.   ,  0.   ,  0.   ,  0.   ,  0.525,  0.975])
>>> np.array(uniformMeansWithSparsity(nbArms=10, sparsity=4, delta=0.05))  
array([ 0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.525,  0.675,
        0.825,  0.975])
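
The examples that pass delta explicitly are consistent with the following rule: the non-zero means are evenly spaced inside [lowerNonZero, lower + amplitude], after shrinking that interval by a fraction delta of its width at both ends. The sketch below encodes this rule. It is inferred from those examples only, the name `uniform_means_with_sparsity_sketch` is hypothetical, and the real implementation may differ.

```python
import numpy as np


def uniform_means_with_sparsity_sketch(nbArms=10, sparsity=3, delta=0.05,
                                       lower=0.0, lowerNonZero=0.5, amplitude=1.0):
    """nbArms - sparsity means equal to lower, then sparsity evenly spaced means."""
    upper = lower + amplitude
    width = upper - lowerNonZero  # width of the interval holding the non-zero means
    nonzero = np.linspace(lowerNonZero + width * delta, upper - width * delta, sparsity)
    return [lower] * (nbArms - sparsity) + list(nonzero)


print(np.round(uniform_means_with_sparsity_sketch(nbArms=6, sparsity=2, delta=0.05), 3))
```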

Arms.randomMeans(nbArms=3, mingap=None, lower=0.0, amplitude=1.0, isSorted=True)

Return a list of means of arms, randomly sampled uniformly in [lower, lower + amplitude], with a min gap >= mingap.

Unless mingap=None, all means will be different, with a min gap > 0.

>>> import numpy as np; np.random.seed(1234)  # reproducible results
>>> randomMeans(nbArms=3, mingap=0.05)  
[0.191..., 0.437..., 0.622...]
>>> randomMeans(nbArms=3, mingap=0.01)  
[0.276..., 0.801..., 0.958...]

Means are sorted, except if isSorted=False.

>>> import random; random.seed(1234)  # reproducible results
>>> randomMeans(nbArms=5, mingap=0.01, isSorted=True)  
[0.006..., 0.229..., 0.416..., 0.535..., 0.899...]
>>> randomMeans(nbArms=5, mingap=0.01, isSorted=False)  
[0.419..., 0.932..., 0.072..., 0.755..., 0.650...]
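
One simple way to satisfy such a gap constraint is rejection sampling: draw all the means uniformly, and redraw until the smallest gap between neighbouring sorted means is large enough. The sketch below assumes this strategy; the package may use a different one. The name `random_means_sketch` and the `rng` and `max_tries` parameters are hypothetical.

```python
import numpy as np


def random_means_sketch(nbArms=3, mingap=None, lower=0.0, amplitude=1.0,
                        isSorted=True, rng=None, max_tries=100_000):
    """Uniform means in [lower, lower + amplitude], redrawn until the min gap >= mingap."""
    rng = np.random.default_rng() if rng is None else rng
    for _ in range(max_tries):
        means = rng.uniform(lower, lower + amplitude, nbArms)
        gaps = np.diff(np.sort(means))  # gaps between neighbouring sorted means
        if mingap is None or nbArms < 2 or gaps.min() >= mingap:
            return sorted(means.tolist()) if isSorted else means.tolist()
    raise ValueError("could not satisfy mingap; try a smaller value")


means = random_means_sketch(nbArms=5, mingap=0.01, rng=np.random.default_rng(0))
print(means)
```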

Arms.randomMeansWithGapBetweenMbestMworst(nbArms=3, mingap=None, nbPlayers=2, lower=0.0, amplitude=1.0, isSorted=True)

Return a list of means of arms, randomly sampled uniformly in [lower, lower + amplitude], with a min gap >= mingap between the set of the M best arms and the set of the (K-M) worst arms, where K = nbArms and M = nbPlayers.
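
The constraint only concerns one gap: the distance between the worst of the M best arms and the best of the (K-M) worst arms. A rejection-sampling sketch of that idea follows. The name `random_means_with_gap_sketch` and the `rng` and `max_tries` parameters are hypothetical, and the package may proceed differently.

```python
import numpy as np


def random_means_with_gap_sketch(nbArms=3, mingap=None, nbPlayers=2, lower=0.0,
                                 amplitude=1.0, isSorted=True, rng=None,
                                 max_tries=100_000):
    """Uniform means whose M best and (K - M) worst arms are at least mingap apart."""
    if not 0 < nbPlayers < nbArms:
        raise ValueError("this sketch needs 0 < nbPlayers < nbArms")
    rng = np.random.default_rng() if rng is None else rng
    for _ in range(max_tries):
        means = rng.uniform(lower, lower + amplitude, nbArms)
        ordered = np.sort(means)
        # ordered[-nbPlayers] is the worst of the M best arms,
        # ordered[-nbPlayers - 1] is the best of the (K - M) worst arms.
        gap = ordered[-nbPlayers] - ordered[-nbPlayers - 1]
        if mingap is None or gap >= mingap:
            return ordered.tolist() if isSorted else means.tolist()
    raise ValueError("could not satisfy mingap; try a smaller value")


gap_means = random_means_with_gap_sketch(nbArms=3, mingap=0.2, nbPlayers=2,
                                         rng=np.random.default_rng(0))
print(gap_means)
```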

Arms.randomMeansWithSparsity(nbArms=10, sparsity=3, mingap=0.01, delta=0.05, lower=0.0, lowerNonZero=0.5, amplitude=1.0, isSorted=True)

Return a list of means of arms, in [lower, lower + amplitude], with a min gap >= mingap.

Exactly nbArms-sparsity arms will have a mean = lower and the others are randomly sampled uniformly in [lowerNonZero, lower + amplitude].
Unless mingap=None, all the randomly sampled means will be different, with a min gap > 0.

>>> import numpy as np; np.random.seed(1234)  # reproducible results
>>> randomMeansWithSparsity(nbArms=6, sparsity=2, mingap=0.05)  
[0.0, 0.0, 0.0, 0.0, 0.595..., 0.811...]
>>> randomMeansWithSparsity(nbArms=6, sparsity=2, mingap=0.01)  
[0.0, 0.0, 0.0, 0.0, 0.718..., 0.892...]

Means are sorted, except if isSorted=False.

>>> import random; random.seed(1234)  # reproducible results
>>> randomMeansWithSparsity(nbArms=6, sparsity=2, mingap=0.01, isSorted=True)  
[0.0, 0.0, 0.0, 0.0, 0.636..., 0.889...]
>>> randomMeansWithSparsity(nbArms=6, sparsity=2, mingap=0.01, isSorted=False)  
[0.0, 0.0, 0.900..., 0.638..., 0.0, 0.0]
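
The description above can be sketched as follows: nbArms - sparsity means are set to lower, and the remaining sparsity means are drawn uniformly in [lowerNonZero, lower + amplitude] until their smallest gap reaches mingap. This is an assumption-laden sketch, not the package's code: the name `random_means_with_sparsity_sketch` and the `rng` and `max_tries` parameters are hypothetical, the `delta` argument is ignored because its role is not documented here, and `lower <= lowerNonZero` is assumed.

```python
import numpy as np


def random_means_with_sparsity_sketch(nbArms=10, sparsity=3, mingap=0.01, lower=0.0,
                                      lowerNonZero=0.5, amplitude=1.0, isSorted=True,
                                      rng=None, max_tries=100_000):
    """nbArms - sparsity means at lower, sparsity random means above lowerNonZero."""
    rng = np.random.default_rng() if rng is None else rng
    for _ in range(max_tries):
        nonzero = rng.uniform(lowerNonZero, lower + amplitude, sparsity)
        gaps = np.diff(np.sort(nonzero))
        if mingap is None or sparsity < 2 or gaps.min() >= mingap:
            break  # the gap constraint holds, keep this draw
    else:
        raise ValueError("could not satisfy mingap; try a smaller value")
    means = [lower] * (nbArms - sparsity) + sorted(nonzero.tolist())
    if not isSorted:
        means = rng.permutation(means).tolist()  # shuffle a copy of the sorted means
    return means


sparse_means = random_means_with_sparsity_sketch(nbArms=6, sparsity=2, mingap=0.05,
                                                 rng=np.random.default_rng(0))
print(sparse_means)
```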

Arms.randomMeansWithSparsity2(nbArms=10, sparsity=3, mingap=0.01, lower=-1.0, lowerNonZero=0.0, amplitude=2.0, isSorted=True)

Return a list of means of arms, in [lower, lower + amplitude], with a min gap >= mingap.

Exactly nbArms-sparsity arms will have a mean sampled uniformly in [lower, lowerNonZero] and the others are randomly sampled uniformly in [lowerNonZero, lower + amplitude].
Unless mingap=None, all means will be different, with a min gap > 0.

>>> import numpy as np; np.random.seed(1234)  # reproducible results
>>> randomMeansWithSparsity2(nbArms=6, sparsity=2, mingap=0.05)  
[0.0, 0.0, 0.0, 0.0, 0.595..., 0.811...]
>>> randomMeansWithSparsity2(nbArms=6, sparsity=2, mingap=0.01)  
[0.0, 0.0, 0.0, 0.0, 0.718..., 0.892...]

Means are sorted, except if isSorted=False.

>>> import random; random.seed(1234)  # reproducible results
>>> randomMeansWithSparsity2(nbArms=6, sparsity=2, mingap=0.01, isSorted=True)  
[0.0, 0.0, 0.0, 0.0, 0.636..., 0.889...]
>>> randomMeansWithSparsity2(nbArms=6, sparsity=2, mingap=0.01, isSorted=False)  
[0.0, 0.0, 0.900..., 0.638..., 0.0, 0.0]

Arms.array_from_str(my_str)

Convert a string like “[0.1, 0.2, 0.3]” to a numpy array [0.1, 0.2, 0.3], using safe json.loads instead of exec.

>>> array_from_str("[0.1, 0.2, 0.3]")
array([0.1,  0.2,  0.3])
>>> array_from_str("0.1, 0.2, 0.3")
array([0.1,  0.2,  0.3])
>>> array_from_str("0.9")
array([0.9])

Arms.list_from_str(my_str)

Convert a string like “[0.1, 0.2, 0.3]” to a list [0.1, 0.2, 0.3], using safe json.loads instead of exec.

>>> list_from_str("[0.1, 0.2, 0.3]")
[0.1, 0.2, 0.3]
>>> list_from_str("0.1, 0.2, 0.3")
[0.1, 0.2, 0.3]
>>> list_from_str("0.9")
[0.9]

Arms.tuple_from_str(my_str)

Convert a string like “[0.1, 0.2, 0.3]” to a tuple (0.1, 0.2, 0.3), using safe json.loads instead of exec.

>>> tuple_from_str("[0.1, 0.2, 0.3]")
(0.1, 0.2, 0.3)
>>> tuple_from_str("0.1, 0.2, 0.3")
(0.1, 0.2, 0.3)
>>> tuple_from_str("0.9")
(0.9,)
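
The three converters accept strings with or without surrounding brackets. A minimal sketch of that behaviour with `json.loads` is given below. The `*_sketch` names are hypothetical, and the bracket handling is an assumption inferred from the examples above rather than the package's actual code.

```python
import json

import numpy as np


def list_from_str_sketch(my_str):
    """Parse "[0.1, 0.2]", "0.1, 0.2" or "0.9" into a list of numbers."""
    text = my_str.strip()
    if not text.startswith("["):
        text = "[" + text + "]"  # wrap bare values so that JSON sees an array
    return json.loads(text)      # json.loads only parses data, it never runs code


def tuple_from_str_sketch(my_str):
    return tuple(list_from_str_sketch(my_str))


def array_from_str_sketch(my_str):
    return np.array(list_from_str_sketch(my_str))


print(list_from_str_sketch("0.1, 0.2, 0.3"))  # [0.1, 0.2, 0.3]
print(tuple_from_str_sketch("0.9"))           # (0.9,)
```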

Arms.optimal_selection_probabilities(M, mu)

Compute the optimal selection probabilities of K arms of means \(\mu_i\) by \(1 \leq M \leq K\) players, if they all observe each other's pulls and rewards, as derived in equation (15), page 3 of [The Effect of Communication on Noncooperative Multiplayer Multi-Armed Bandit Problems, by Noyan Evirgen and Alper Kose, IEEE ICMLA 2017](https://arxiv.org/abs/1711.01628v1).

Warning

They consider a different collision model than the one I usually use. When two (or more) players ask for the same resource at the same time t, I usually consider that all the colliding players receive a zero reward (see Environment.CollisionModels.onlyUniqUserGetsReward()), whereas they consider that exactly one of the colliding players gets the reward and all the others get a zero reward (see Environment.CollisionModels.rewardIsSharedUniformly()).

Example:

>>> optimal_selection_probabilities(3, [0.1,0.1,0.1])
array([0.33333333,  0.33333333,  0.33333333])

>>> optimal_selection_probabilities(3, [0.1,0.2,0.3])  # weird ? not really...
array([0.        ,  0.43055556,  0.56944444])

>>> optimal_selection_probabilities(3, [0.1,0.3,0.9])  # weird ? not really...
array([0.        ,  0.45061728,  0.54938272])

>>> optimal_selection_probabilities(3, [0.7,0.8,0.9])
array([0.15631866,  0.35405647,  0.48962487])

Note

These results may sound counter-intuitive, but again they use a different collision model. In my usual collision model, it makes no sense to completely drop an arm when K=M=3, no matter the means \(\mu_i\). In their collision model, however, a player wins more (on average) if she has a \(50\%\) chance of being alone on an arm with mean \(0.3\) than if she is sure to be alone on an arm with mean \(0.1\) (see the two examples above in which the first arm gets probability 0).
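
To make the difference between the two collision models concrete, here is a small sketch of how rewards are attributed for one time step. The function names are hypothetical and are not the functions of Environment.CollisionModels; they only follow the verbal description in the warning above.

```python
import random


def rewards_all_colliders_get_zero(choices, arm_rewards):
    """A lone player gets the arm's reward; every player on a contested arm gets 0."""
    counts = {}
    for arm in choices:
        counts[arm] = counts.get(arm, 0) + 1
    return [arm_rewards[arm] if counts[arm] == 1 else 0.0 for arm in choices]


def rewards_one_collider_wins(choices, arm_rewards, rng=random):
    """On each arm, exactly one of the players present (picked uniformly) gets the reward."""
    players_on_arm = {}
    for player, arm in enumerate(choices):
        players_on_arm.setdefault(arm, []).append(player)
    rewards = [0.0] * len(choices)
    for arm, players in players_on_arm.items():
        winner = rng.choice(players)
        rewards[winner] = arm_rewards[arm]
    return rewards


choices = [0, 0, 1]            # players 0 and 1 collide on arm 0, player 2 is alone
arm_rewards = [1.0, 1.0, 1.0]  # rewards produced by each arm at this time step
zero_model = rewards_all_colliders_get_zero(choices, arm_rewards)
one_wins_model = rewards_one_collider_wins(choices, arm_rewards)
print(zero_model)           # [0.0, 0.0, 1.0]
print(sum(one_wins_model))  # 2.0
```

The comparison made in the note is then simple arithmetic: a 50% chance of being alone on an arm with mean 0.3 is worth at least 0.5 × 0.3 = 0.15 on average, which exceeds the 0.1 obtained by being alone for sure on an arm with mean 0.1.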

Arms.geometricChangePoints(horizon=10000, proba=0.001)

Change points following a geometric distribution: at each time, the probability of having a change point at the next step is proba.

>>> np.random.seed(0)
>>> geometricChangePoints(100, 0.1)
array([ 8, 20, 29, 37, 43, 53, 59, 81])
>>> geometricChangePoints(100, 0.2)
array([ 6,  8, 14, 29, 31, 35, 40, 44, 46, 60, 63, 72, 78, 80, 88, 91])
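
Drawing an independent Bernoulli(proba) variable at each time step yields change points whose spacings follow a geometric distribution. The sketch below uses this construction. The name `geometric_change_points_sketch` and the `rng` parameter are hypothetical, and since the package's sampling scheme may differ, the outputs above are not expected to be reproduced.

```python
import numpy as np


def geometric_change_points_sketch(horizon=10000, proba=0.001, rng=None):
    """Indices t in [0, horizon) at which an independent coin of bias proba lands heads."""
    rng = np.random.default_rng() if rng is None else rng
    is_change_point = rng.random(horizon) < proba  # one Bernoulli(proba) draw per step
    return np.flatnonzero(is_change_point)         # sorted indices of the True entries


points = geometric_change_points_sketch(100, 0.1, rng=np.random.default_rng(0))
print(points)
```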

Arms.continuouslyVaryingMeans(means, sign=1, maxSlowChange=0.1, horizon=None, lower=0.0, amplitude=1.0, isSorted=True)

New means, slightly modified from the previous ones.

Both the size of the change and its sign are constant.

Arms.randomContinuouslyVaryingMeans(means, maxSlowChange=0.1, horizon=None, lower=0.0, amplitude=1.0, isSorted=True)

New means, slightly modified from the previous ones.

The maximal amplitude c of the change is constant, but the change itself is randomly sampled from \(\mathcal{U}([-c,c])\).
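
The two update rules can be sketched as below. Several choices are assumptions and are not taken from the package: the names `constant_drift_sketch` and `random_drift_sketch` are hypothetical, the `horizon` and `isSorted` arguments are ignored, one random step is drawn per arm, and the new means are clipped to [lower, lower + amplitude].

```python
import numpy as np


def constant_drift_sketch(means, sign=1, maxSlowChange=0.1, lower=0.0, amplitude=1.0):
    """Shift every mean by the same signed step, then clip to the allowed interval."""
    shifted = np.asarray(means, dtype=float) + sign * maxSlowChange
    return np.clip(shifted, lower, lower + amplitude).tolist()


def random_drift_sketch(means, maxSlowChange=0.1, lower=0.0, amplitude=1.0, rng=None):
    """Shift each mean by a step drawn from U([-c, c]) with c = maxSlowChange, then clip."""
    rng = np.random.default_rng() if rng is None else rng
    steps = rng.uniform(-maxSlowChange, maxSlowChange, len(means))
    shifted = np.asarray(means, dtype=float) + steps
    return np.clip(shifted, lower, lower + amplitude).tolist()


old_means = [0.1, 0.5, 0.95]
drifted = constant_drift_sketch(old_means)
randomly_drifted = random_drift_sketch(old_means, rng=np.random.default_rng(0))
print(np.round(drifted, 3))  # [0.2 0.6 1. ]
```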
