# **Structure and Sparsity of Stochastic Multi-Armed Bandits**

This page explains shortly what I studied on sparse stochastic multi-armed bandits.
Assume a MAB problem with `$K$` arms, each parametrized by its *mean* `$\mu_k\in\mathbb{R}$`.
If you know in advance that only a small subset (of size `$s$`) of the arms have a positive arm, it sounds reasonable to hope to be more efficient in playing the bandit game compared to an approach which is non aware of the sparsity.

The [`SparseUCB`](docs/Policies.SparseUCB.html#Policies.SparseUCB.SparseUCB) is an extension of the well-known [`UCB`](docs/Policies.UCB.html), and requires to known **exactly** the value of `$s$`.
It works by identifying as fast as possible (actually, in a sub-logarithmic number of samples) the arms with non-positive means.
Then it only plays in the "good" arms with positive means, with a regular UCB policy.

I studied extensions of this idea, first of all the [`SparseklUCB`](docs/Policies.SparseklUCB.html#Policies.SparseklUCB.SparseklUCB) policy as it was suggested in the original research paper, but mainly a generic "wrapper" black-box approach.
For more details, see [`SparseWrapper`](docs/Policies.SparseWrapper.html#Policies.SparseWrapper.SparseWrapper).

- Reference: [["Sparse Stochastic Bandits", by J. Kwon, V. Perchet & C. Vernade, COLT 2017](https://arxiv.org/abs/1706.01383)]. Note that this algorithm only works for sparse [Gaussian]((docs/Arms.Gaussian.html)) (or sub-Gaussian) stochastic bandits, and it includes [Bernoulli arms](docs/Arms.Bernoulli.html).

----

## Article

> TODO finish! I am writing a small research article on that topic, it is a better introduction as a self-contained document to explain this idea and the algorithms. Reference: [[Structure and Sparsity of Stochastic Multi-Arm Bandits, Lilian Besson and Emilie Kaufmann, 2018]](https://hal.inria.fr/hal-XXX).

----

## Example of simulation configuration

A simple python file, [`configuration_sparse.py`](https://smpybandits.github.io/docs/configuration_sparse.html), is used to import the [arm classes](Arms/), the [policy classes](Policies/) and define the problems and the experiments.

For example, we can compare the standard [`UCB`](docs/Policies.UCB.html) and  [`BayesUCB`](docs/Policies.BayesUCB.html) algorithms, non aware of the sparsity, against the sparsity-aware [`SparseUCB`](docs/Policies.SparseUCB.html) algorithm, as well as 4 versions of [`SparseWrapper`](docs/Policies.SparseWrapper.html) applied to [`BayesUCB`](docs/Policies.BayesUCB.html).

```python
configuration = {
    "horizon": 10000,    # Finite horizon of the simulation
    "repetitions": 100,  # number of repetitions
    "n_jobs": -1,        # Maximum number of cores for parallelization: use ALL your CPU
    "verbosity": 5,      # Verbosity for the joblib calls
    # Environment configuration, you can set up more than one.
    "environment": [
        {   # sparsity = nb of >= 0 mean, = 3 here
            "arm_type": Bernoulli,
            "params": 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.2, 0.3
        }
    ],
    # Policies that should be simulated, and their parameters.
    "policies": [
        {"archtype": UCB, "params": {} },
        {"archtype": SparseUCB, "params": { "sparsity": 3 } },
        {"archtype": BayesUCB, "params": { } },
    ]
}
```

Then add a [Sparse-Wrapper](docs/Policies.SparseWrapper.html) bandit algorithm ([`SparseWrapper` class](docs/Policies.SparseWrapper.html)), you can use this piece of code:

```python
configuration["policies"] += [
    {
        "archtype": SparseWrapper,
        "params": {
            "policy": BayesUCB,
            "use_ucb_for_set_J": use_ucb_for_set_J,
            "use_ucb_for_set_K": use_ucb_for_set_K,
        }
    }
    for use_ucb_for_set_J in [ True, False ]
    for use_ucb_for_set_K in [ True, False ]
]
```

----

## [How to run the experiments ?](How_to_run_the_code.md)

You should use the provided [`Makefile`](Makefile) file to do this simply:
```bash
make install  # install the requirements ONLY ONCE
make sparse   # run and log the main.py script
```

----

## Some illustrations

Here are some plots illustrating the performances of the different [policies](docs/Policies.) implemented in this project, against various sparse problems (with [`Bernoulli`](Arms/Bernoulli.html) or [`UnboundedGaussian`](https://smpybandits.github.io/docs/Arms.Gaussian.html) arms only):

### 3 variants of [Sparse-Wrapper](docs/Policies.SparseWrapper.html) for UCB, on a "simple" sparse Bernoulli problem
![3 variants of Sparse-Wrapper for UCB, on a "simple" sparse Bernoulli problem](plots/main____env1-1_XXX.png)

FIXME run some simulations and explain them!

> These illustrations come from my (work in progress) article, [[Structure and Sparsity of Stochastic Multi-Arm Bandits, Lilian Besson and Emilie Kaufmann, 2018]](https://hal.inria.fr/hal-XXX).


----

### :scroll: License ? [![GitHub license](https://img.shields.io/github/license/SMPyBandits/SMPyBandits.svg)](https://github.com/SMPyBandits/SMPyBandits/blob/master/LICENSE)
[MIT Licensed](https://lbesson.mit-license.org/) (file [LICENSE](LICENSE)).

© 2016-2018 [Lilian Besson](https://GitHub.com/Naereen).

[![Open Source? Yes!](https://badgen.net/badge/Open%20Source%20%3F/Yes%21/blue?icon=github)](https://github.com/SMPyBandits/SMPyBandits/)
[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](https://GitHub.com/SMPyBandits/SMPyBandits/graphs/commit-activity)
[![Ask Me Anything !](https://img.shields.io/badge/Ask%20me-anything-1abc9c.svg)](https://GitHub.com/Naereen/ama)
[![Analytics](https://ga-beacon.appspot.com/UA-38514290-17/github.com/SMPyBandits/SMPyBandits/README.md?pixel)](https://GitHub.com/SMPyBandits/SMPyBandits/)
![![PyPI version](https://img.shields.io/pypi/v/smpybandits.svg)](https://pypi.org/project/SMPyBandits)
![![PyPI implementation](https://img.shields.io/pypi/implementation/smpybandits.svg)](https://pypi.org/project/SMPyBandits)
[![![PyPI pyversions](https://img.shields.io/pypi/pyversions/smpybandits.svg?logo=python)](https://pypi.org/project/SMPyBandits)](https://pypi.org/project/SMPyBandits)
[![![PyPI download](https://img.shields.io/pypi/dm/smpybandits.svg)](https://pypi.org/project/SMPyBandits)](https://pypi.org/project/SMPyBandits)
[![![PyPI status](https://img.shields.io/pypi/status/smpybandits.svg)](https://pypi.org/project/SMPyBandits)](https://pypi.org/project/SMPyBandits)
[![Documentation Status](https://readthedocs.org/projects/smpybandits/badge/?version=latest)](https://SMPyBandits.ReadTheDocs.io/en/latest/?badge=latest)
[![Build Status](https://travis-ci.org/SMPyBandits/SMPyBandits.svg?branch=master)](https://travis-ci.org/SMPyBandits/SMPyBandits)
[![Stars of https://github.com/SMPyBandits/SMPyBandits/](https://badgen.net/github/stars/SMPyBandits/SMPyBandits)](https://GitHub.com/SMPyBandits/SMPyBandits/stargazers)
[![Releases of https://github.com/SMPyBandits/SMPyBandits/](https://badgen.net/github/release/SMPyBandits/SMPyBandits)](https://github.com/SMPyBandits/SMPyBandits/releases)