Policies.kullback_cython module

Kullback-Leibler divergence functions and klUCB utilities. Optimized version that should be compiled using Cython (http://docs.cython.org/).

Note

The C version is still faster, but not by much. The Cython version has the advantage of providing docstrings and optional arguments, and it is only 2 to 3 times slower than the optimal performance! Here are some comparisons of the two, when computing KL-UCB indexes:

>>> r = np.random.random
>>> import kullback_c; import kullback_cython  # fake, just for illustration
>>> %timeit kullback_c.klucbBern(r(), r(), 1e-6)
2.26 µs ± 21.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> %timeit kullback_cython.klucbBern(r(), r())
6.04 µs ± 46.7 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit kullback_c.klucbGamma(r(), r(), 1e-6)
2.56 µs ± 225 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> %timeit kullback_cython.klucbGamma(r(), r())
6.47 µs ± 513 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit kullback_c.klucbExp(r(), r(), 1e-6)
2.03 µs ± 92.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> %timeit kullback_cython.klucbExp(r(), r())
4.75 µs ± 444 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit kullback_c.klucbGauss(r(), r(), 1e-6)  # so easy computation: no difference!
800 ns ± 14.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> %timeit kullback_cython.klucbGauss(r(), r())
732 ns ± 71.5 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit kullback_c.klucbPoisson(r(), r(), 1e-6)
2.33 µs ± 167 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> %timeit kullback_cython.klucbPoisson(r(), r())
5.13 µs ± 521 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Warning

This extension should be used with the setup.py script, by running:

$ python setup.py build_ext --inplace
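
For reference, a minimal setup.py along these lines might look like the following sketch (the .pyx file name and the package name are assumptions, not taken from this repository):

# setup.py -- minimal sketch, assuming the Cython source is named kullback_cython.pyx
from setuptools import setup
from Cython.Build import cythonize

setup(
    name="kullback_cython",
    ext_modules=cythonize("kullback_cython.pyx"),
)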

You can also use pyximport (see http://docs.cython.org/en/latest/src/tutorial/cython_tutorial.html#pyximport-cython-compilation-for-developers) to import the kullback_cython module transparently:

>>> import pyximport; pyximport.install()  # instantaneous  
(None, <pyximport.pyximport.PyxImporter at 0x...>)
>>> import kullback_cython as kullback     # takes about two seconds
>>> # then use kullback.klucbBern or others, as if they came from the pure Python version!

Warning

These functions are not vectorized and assume a single value for each argument. If you want a vectorized function, use the numpy.vectorize wrapper:

>>> import numpy as np
>>> klBern_vect = np.vectorize(klBern)
>>> klBern_vect([0.1, 0.5, 0.9], 0.2)  
array([0.036..., 0.223..., 1.145...])
>>> klBern_vect(0.4, [0.2, 0.3, 0.4])  
array([0.104..., 0.022..., 0...])
>>> klBern_vect([0.1, 0.5, 0.9], [0.2, 0.3, 0.4])  
array([0.036..., 0.087..., 0.550...])

For some functions, you are better off writing a vectorized version manually, for instance if you want to fix the value of some optional parameters:

>>> # WARNING using np.vectorize gave weird result on klGauss
>>> # klGauss_vect = np.vectorize(klGauss, excluded="y")
>>> def klGauss_vect(xs, y, sig2x=0.25):  # vectorized for the first input only
...    return np.array([klGauss(x, y, sig2x) for x in xs])
>>> klGauss_vect([-1, 0, 1], 0.1)  
array([2.42, 0.02, 1.62])
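
In a bandit loop you typically need one index per arm rather than a fully vectorized function, so a plain Python loop is often enough. A hedged sketch (the helper name, the exploration term c * log(t) / n and the handling of unpulled arms are illustrative assumptions, not part of this module):

>>> import numpy as np
>>> def klucb_indexes(means, pulls, t, c=1.0):
...    # Hypothetical helper (not part of this module): one kl-UCB index per arm,
...    # looping in Python because klucbBern only accepts scalar arguments.
...    return np.array([
...        klucbBern(mu, c * np.log(t) / n) if n > 0 else float('+inf')
...        for mu, n in zip(means, pulls)
...    ])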
Policies.kullback_cython.klBern()

Kullback-Leibler divergence for Bernoulli distributions. https://en.wikipedia.org/wiki/Bernoulli_distribution#Kullback.E2.80.93Leibler_divergence

\[\mathrm{KL}(\mathcal{B}(x), \mathcal{B}(y)) = x \log(\frac{x}{y}) + (1-x) \log(\frac{1-x}{1-y}).\]
>>> klBern(0.5, 0.5)
0.0
>>> klBern(0.1, 0.9)  
1.757779...
>>> klBern(0.9, 0.1)  # And this KL is symmetric  
1.757779...
>>> klBern(0.4, 0.5)  
0.020135...
>>> klBern(0.01, 0.99)  
4.503217...
  • Special values:

>>> klBern(0, 1)  # Should be +inf, but 0 --> eps, 1 --> 1 - eps  
34.539575...
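
For reference, here is a pure-Python sketch of this formula (the clipping constant eps is an assumption; the compiled module may use a slightly different value, as the special value above suggests):

>>> from math import log
>>> def klBern_sketch(x, y, eps=1e-15):
...    # Clip x and y away from 0 and 1 ("0 --> eps, 1 --> 1 - eps"), then apply the formula.
...    x = min(max(x, eps), 1 - eps)
...    y = min(max(y, eps), 1 - eps)
...    return x * log(x / y) + (1 - x) * log((1 - x) / (1 - y))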
Policies.kullback_cython.klBin()

Kullback-Leibler divergence for Binomial distributions. https://math.stackexchange.com/questions/320399/kullback-leibner-divergence-of-binomial-distributions

  • It is simply n times klBern(x, y) (a sketch is given after the examples below).

\[\mathrm{KL}(\mathrm{Bin}(x, n), \mathrm{Bin}(y, n)) = n \times \left(x \log(\frac{x}{y}) + (1-x) \log(\frac{1-x}{1-y}) \right).\]

Warning

The two distributions must have the same parameter n; here x and y are the success probabilities p, q in (0, 1).

>>> klBin(0.5, 0.5, 10)
0.0
>>> klBin(0.1, 0.9, 10)  
17.57779...
>>> klBin(0.9, 0.1, 10)  # And this KL is symmetric  
17.57779...
>>> klBin(0.4, 0.5, 10)  
0.20135...
>>> klBin(0.01, 0.99, 10)  
45.03217...
  • Special values:

>>> klBin(0, 1, 10)  # Should be +inf, but 0 --> eps, 1 --> 1 - eps  
345.39575...
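
As stated above, this is just n times the Bernoulli divergence; a minimal sketch (the clipping constant eps is an assumption):

>>> from math import log
>>> def klBin_sketch(x, y, n, eps=1e-15):
...    # n times the Bernoulli KL of the clipped probabilities x and y.
...    x = min(max(x, eps), 1 - eps)
...    y = min(max(y, eps), 1 - eps)
...    return n * (x * log(x / y) + (1 - x) * log((1 - x) / (1 - y)))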
Policies.kullback_cython.klExp()

Kullback-Leibler divergence for exponential distributions. https://en.wikipedia.org/wiki/Exponential_distribution#Kullback.E2.80.93Leibler_divergence

\[\begin{split}\mathrm{KL}(\mathrm{Exp}(x), \mathrm{Exp}(y)) = \begin{cases} \frac{x}{y} - 1 - \log(\frac{x}{y}) & \text{if } x > 0, y > 0\\ +\infty & \text{otherwise} \end{cases}\end{split}\]
>>> klExp(3, 3)
0.0
>>> klExp(3, 6)  
0.193147...
>>> klExp(1, 2)  # Only the proportion between x and y is used  
0.193147...
>>> klExp(2, 1)  # And this KL is non-symmetric  
0.306852...
>>> klExp(4, 2)  # Only the proportion between x and y is used  
0.306852...
>>> klExp(6, 8)  
0.037682...
  • x, y have to be positive:

>>> klExp(-3, 2)
inf
>>> klExp(3, -2)
inf
>>> klExp(-3, -2)
inf
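
For reference, a pure-Python sketch of this piecewise formula:

>>> from math import log
>>> def klExp_sketch(x, y):
...    # Return +inf outside the domain, otherwise x/y - 1 - log(x/y).
...    if x <= 0 or y <= 0:
...        return float('+inf')
...    return x / y - 1 - log(x / y)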
Policies.kullback_cython.klGamma()

Kullback-Leibler divergence for gamma distributions. https://en.wikipedia.org/wiki/Gamma_distribution#Kullback.E2.80.93Leibler_divergence

  • It is simply a times klExp(x, y) (a sketch is given after the examples below).

\[\begin{split}\mathrm{KL}(\Gamma(x, a), \Gamma(y, a)) = \begin{cases} a \times \left( \frac{x}{y} - 1 - \log(\frac{x}{y}) \right) & \text{if } x > 0, y > 0\\ +\infty & \text{otherwise} \end{cases}\end{split}\]

Warning

The two distributions must have the same parameter a.

>>> klGamma(3, 3)
0.0
>>> klGamma(3, 6)  
0.193147...
>>> klGamma(1, 2)  # Only the proportion between x and y is used  
0.193147...
>>> klGamma(2, 1)  # And this KL is non-symmetric  
0.306852...
>>> klGamma(4, 2)  # Only the proportion between x and y is used  
0.306852...
>>> klGamma(6, 8)  
0.037682...
  • x, y have to be positive:

>>> klGamma(-3, 2)
inf
>>> klGamma(3, -2)
inf
>>> klGamma(-3, -2)
inf
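
As stated above, this is just a times the exponential divergence; a minimal sketch (the default a=1 is an assumption about the signature):

>>> from math import log
>>> def klGamma_sketch(x, y, a=1):
...    # a times the exponential KL; +inf outside the domain.
...    if x <= 0 or y <= 0:
...        return float('+inf')
...    return a * (x / y - 1 - log(x / y))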
Policies.kullback_cython.klGauss()

Kullback-Leibler divergence for Gaussian distributions of means x and y and variances sig2x and sig2y, \(\nu_1 = \mathcal{N}(x, \sigma_x^2)\) and \(\nu_2 = \mathcal{N}(y, \sigma_y^2)\):

\[\mathrm{KL}(\nu_1, \nu_2) = \frac{(x - y)^2}{2 \sigma_y^2} + \frac{1}{2}\left( \frac{\sigma_x^2}{\sigma_y^2} - 1 - \log\left(\frac{\sigma_x^2}{\sigma_y^2}\right) \right).\]

See https://en.wikipedia.org/wiki/Normal_distribution#Other_properties

  • By default, sig2y is assumed to be sig2x (same variance).

Warning

The C version does not support different variances.

>>> klGauss(3, 3)
0.0
>>> klGauss(3, 6)
18.0
>>> klGauss(1, 2)
2.0
>>> klGauss(2, 1)  # And this KL is symmetric
2.0
>>> klGauss(4, 2)
8.0
>>> klGauss(6, 8)
8.0
  • x, y can be negative:

>>> klGauss(-3, 2)
50.0
>>> klGauss(3, -2)
50.0
>>> klGauss(-3, -2)
2.0
>>> klGauss(3, 2)
2.0
  • With other values for sig2x:

>>> klGauss(3, 3, sig2x=10)
0.0
>>> klGauss(3, 6, sig2x=10)
0.45
>>> klGauss(1, 2, sig2x=10)
0.05
>>> klGauss(2, 1, sig2x=10)  # And this KL is symmetric
0.05
>>> klGauss(4, 2, sig2x=10)
0.2
>>> klGauss(6, 8, sig2x=10)
0.2
  • With different values for sig2x and sig2y:

>>> klGauss(0, 0, sig2x=0.25, sig2y=0.5)  
-0.0284...
>>> klGauss(0, 0, sig2x=0.25, sig2y=1.0)  
0.2243...
>>> klGauss(0, 0, sig2x=0.5, sig2y=0.25)  # not symmetric here!  
1.1534...
>>> klGauss(0, 1, sig2x=0.25, sig2y=0.5)  
0.9715...
>>> klGauss(0, 1, sig2x=0.25, sig2y=1.0)  
0.7243...
>>> klGauss(0, 1, sig2x=0.5, sig2y=0.25)  # not symmetric here!  
3.1534...
>>> klGauss(1, 0, sig2x=0.25, sig2y=0.5)  
0.9715...
>>> klGauss(1, 0, sig2x=0.25, sig2y=1.0)  
0.7243...
>>> klGauss(1, 0, sig2x=0.5, sig2y=0.25)  # not symmetric here!  
3.1534...

Warning

Using Policies.klUCB (and variants) with klGauss() is equivalent to using Policies.UCB, so prefer the simpler version.

Policies.kullback_cython.klNegBin()

Kullback-Leibler divergence for negative binomial distributions. https://en.wikipedia.org/wiki/Negative_binomial_distribution

\[\mathrm{KL}(\mathrm{NegBin}(x, r), \mathrm{NegBin}(y, r)) = r \times \log\left(\frac{r + x}{r + y}\right) - x \times \log\left(\frac{y (r + x)}{x (r + y)}\right).\]

Warning

The two distributions must have the same parameter r.

>>> klNegBin(0.5, 0.5)
0.0
>>> klNegBin(0.1, 0.9)  
-0.711611...
>>> klNegBin(0.9, 0.1)  # And this KL is non-symmetric  
2.0321564...
>>> klNegBin(0.4, 0.5)  
-0.130653...
>>> klNegBin(0.01, 0.99)  
-0.717353...
  • Special values:

>>> klBern(0, 1)  # Should be +inf, but 0 --> eps, 1 --> 1 - eps  
34.539575...
  • With other values for r:

>>> klNegBin(0.5, 0.5, r=2)
0.0
>>> klNegBin(0.1, 0.9, r=2)  
-0.832991...
>>> klNegBin(0.1, 0.9, r=4)  
-0.914890...
>>> klNegBin(0.9, 0.1, r=2)  # And this KL is non-symmetric  
2.3325528...
>>> klNegBin(0.4, 0.5, r=2)  
-0.154572...
>>> klNegBin(0.01, 0.99, r=2)  
-0.836257...
Policies.kullback_cython.klPoisson()

Kullback-Leibler divergence for Poisson distributions. https://en.wikipedia.org/wiki/Poisson_distribution#Kullback.E2.80.93Leibler_divergence

\[\mathrm{KL}(\mathrm{Poisson}(x), \mathrm{Poisson}(y)) = y - x + x \times \log(\frac{x}{y}).\]
>>> klPoisson(3, 3)
0.0
>>> klPoisson(2, 1)  
0.386294...
>>> klPoisson(1, 2)  # And this KL is non-symmetric  
0.306852...
>>> klPoisson(3, 6)  
0.920558...
>>> klPoisson(6, 8)  
0.273907...
  • Special values:

>>> klPoisson(1, 0)  # Should be +inf, but 0 --> eps, 1 --> 1 - eps  
33.538776...
>>> klPoisson(0, 0)
0.0
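
For reference, a pure-Python sketch of this formula (the clipping of zero values by some small eps is an assumption suggested by the special values above):

>>> from math import log
>>> def klPoisson_sketch(x, y, eps=1e-15):
...    # Clip x and y away from 0, then apply y - x + x * log(x / y).
...    x, y = max(x, eps), max(y, eps)
...    return y - x + x * log(x / y)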
Policies.kullback_cython.klucb()

The generic KL-UCB index computation.

  • x: value of the cumulative reward,

  • d: upper bound on the divergence,

  • kl: the KL divergence to be used (klBern(), klGauss(), etc.),

  • upperbound, lowerbound=float('-inf'): the known bounds on the value x,

  • precision=1e-6: the precision threshold at which to stop the bisection search,

  • max_iterations: max number of iterations of the loop (safer to bound it to reduce time complexity).

Note

It uses a bisection search, and one call to kl for each step of the bisection search.
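
A hedged sketch of this bisection search (not the exact compiled code; the iteration cap and the midpoint return value are assumptions consistent with the examples below):

>>> def klucb_sketch(x, d, kl, upperbound, lowerbound=float('-inf'), precision=1e-6, max_iterations=50):
...    # Bisection search for the largest value below upperbound whose divergence from x stays below d.
...    value = max(x, lowerbound)
...    u = upperbound
...    iteration = 0
...    while iteration < max_iterations and u - value > precision:
...        iteration += 1
...        m = (value + u) / 2.0
...        if kl(x, m) > d:
...            u = m      # midpoint exceeds the divergence budget: shrink from above
...        else:
...            value = m  # midpoint is feasible: move the lower end up
...    return (value + u) / 2.0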

For example, for klucbBern(), the two steps are to first compute an upper bound (as tight as possible) and then to compute the kl-UCB index:

>>> x, d = 0.9, 0.2   # mean x, exploration term d
>>> upperbound = min(1., klucbGauss(x, d, sig2x=0.25))  # variance 1/4 for [0,1] bounded distributions
>>> upperbound  
1.0
>>> klucb(x, d, klBern, upperbound, lowerbound=0, precision=1e-3, max_iterations=10)  
0.9941...
>>> klucb(x, d, klBern, upperbound, lowerbound=0, precision=1e-6, max_iterations=10)  
0.994482...  
>>> klucb(x, d, klBern, upperbound, lowerbound=0, precision=1e-3, max_iterations=50)  
0.9941...
>>> klucb(x, d, klBern, upperbound, lowerbound=0, precision=1e-6, max_iterations=100)  # more and more precise!  
0.994489...

Note

See below for more examples for different KL divergence functions.

Policies.kullback_cython.klucbBern()

KL-UCB index computation for Bernoulli distributions, using klucb().

  • Influence of x:

>>> klucbBern(0.1, 0.2)  
0.378391...
>>> klucbBern(0.5, 0.2)  
0.787088...
>>> klucbBern(0.9, 0.2)  
0.994489...
  • Influence of d:

>>> klucbBern(0.1, 0.4)  
0.519475...
>>> klucbBern(0.1, 0.9)  
0.734714...
>>> klucbBern(0.5, 0.4)  
0.871035...
>>> klucbBern(0.5, 0.9)  
0.956809...
>>> klucbBern(0.9, 0.4)  
0.999285...
>>> klucbBern(0.9, 0.9)  
0.999995...
Policies.kullback_cython.klucbExp()

KL-UCB index computation for exponential distributions, using klucb().

  • Influence of x:

>>> klucbExp(0.1, 0.2)  
0.202741...
>>> klucbExp(0.5, 0.2)  
1.013706...
>>> klucbExp(0.9, 0.2)  
1.824671...
  • Influence of d:

>>> klucbExp(0.1, 0.4)  
0.285792...
>>> klucbExp(0.1, 0.9)  
0.559088...
>>> klucbExp(0.5, 0.4)  
1.428962...
>>> klucbExp(0.5, 0.9)  
2.795442...
>>> klucbExp(0.9, 0.4)  
2.572132...
>>> klucbExp(0.9, 0.9)  
5.031795...
Policies.kullback_cython.klucbGamma()

KL-UCB index computation for Gamma distributions, using klucb().

  • Influence of x:

>>> klucbGamma(0.1, 0.2)  
0.202...
>>> klucbGamma(0.5, 0.2)  
1.013...
>>> klucbGamma(0.9, 0.2)  
1.824...
  • Influence of d:

>>> klucbGamma(0.1, 0.4)  
0.285...
>>> klucbGamma(0.1, 0.9)  
0.559...
>>> klucbGamma(0.5, 0.4)  
1.428...
>>> klucbGamma(0.5, 0.9)  
2.795...
>>> klucbGamma(0.9, 0.4)  
2.572...
>>> klucbGamma(0.9, 0.9)  
5.031...
Policies.kullback_cython.klucbGauss()

KL-UCB index computation for Gaussian distributions.

  • Note that it does not require any search.

Warning

It works only if the correct variance constant is given.
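
Since no search is needed, the index has a closed form; a hedged sketch consistent with the examples below (the default variance sig2x=0.25, suited to [0, 1]-bounded rewards, is an assumption):

>>> from math import sqrt
>>> def klucbGauss_sketch(x, d, sig2x=0.25):
...    # Closed-form Gaussian kl-UCB index: x + sqrt(2 * sig2x * d), no bisection needed.
...    return x + sqrt(2.0 * sig2x * d)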

  • Influence of x:

>>> klucbGauss(0.1, 0.2)  
0.416227...
>>> klucbGauss(0.5, 0.2)  
0.816227...
>>> klucbGauss(0.9, 0.2)  
1.216227...
  • Influence of d:

>>> klucbGauss(0.1, 0.4)  
0.547213...
>>> klucbGauss(0.1, 0.9)  
0.770820...
>>> klucbGauss(0.5, 0.4)  
0.947213...
>>> klucbGauss(0.5, 0.9)  
1.170820...
>>> klucbGauss(0.9, 0.4)  
1.347213...
>>> klucbGauss(0.9, 0.9)  
1.570820...

Warning

Using Policies.klUCB (and variants) with klucbGauss() is equivalent to using Policies.UCB, so prefer the simpler version.

Policies.kullback_cython.klucbPoisson()

KL-UCB index computation for Poisson distributions, using klucb().

  • Influence of x:

>>> klucbPoisson(0.1, 0.2)  
0.450523...
>>> klucbPoisson(0.5, 0.2)  
1.089376...
>>> klucbPoisson(0.9, 0.2)  
1.640112...
  • Influence of d:

>>> klucbPoisson(0.1, 0.4)  
0.693684...
>>> klucbPoisson(0.1, 0.9)  
1.252796...
>>> klucbPoisson(0.5, 0.4)  
1.422933...
>>> klucbPoisson(0.5, 0.9)  
2.122985...
>>> klucbPoisson(0.9, 0.4)  
2.033691...
>>> klucbPoisson(0.9, 0.9)  
2.831573...