Policies.Experimentals.UCBoost_faster_cython module

Optimized version of some utility functions for the UCBoost module, that should be compiled using Cython (http://docs.cython.org/).

Warning

This extension should be used with the setup.py script, by running:

$ python setup.py build_ext --inplace

You can also use [pyximport](http://docs.cython.org/en/latest/src/tutorial/cython_tutorial.html#pyximport-cython-compilation-for-developers) to import the kullback_cython module transparently:

>>> import pyximport; pyximport.install()  # instantaneous  
(None, <pyximport.pyximport.PyxImporter at 0x...>)
>>> from UCBoost_faster_cython import *     # takes about two seconds
Policies.Experimentals.UCBoost_faster_cython.biquadratic_distance()

The biquadratic distance, \(d_{bq}(p, q) := 2 (p - q)^2 + (4/9) * (p - q)^4\).

Policies.Experimentals.UCBoost_faster_cython.distance_t()

A shifted tangent line function of kullback_leibler_distance_on_mean().

\[d_t(p, q) = \frac{2 q}{p + 1} + p \log\left(\frac{p}{p + 1}\right) + \log\left(\frac{2}{\mathrm{e}(p + 1)}\right).\]

Warning

I think there might be a typo in the formula in the article, as this \(d_t\) does not seem to “depend enough on q” (just intuition).

Policies.Experimentals.UCBoost_faster_cython.hellinger_distance()

The Hellinger distance, \(d_{h}(p, q) := (\sqrt{p} - \sqrt{q})^2 + (\sqrt{1 - p} - \sqrt{1 - q})^2\).

Policies.Experimentals.UCBoost_faster_cython.kullback_leibler_distance_lowerbound()

Lower-bound on the Kullback-Leibler divergence for Bernoulli distributions. https://en.wikipedia.org/wiki/Bernoulli_distribution#Kullback.E2.80.93Leibler_divergence

\[d_{lb}(p, q) = p \log\left( p \right) + (1-p) \log\left(\frac{1-p}{1-q}\right).\]
Policies.Experimentals.UCBoost_faster_cython.kullback_leibler_distance_on_mean()

Kullback-Leibler divergence for Bernoulli distributions. https://en.wikipedia.org/wiki/Bernoulli_distribution#Kullback.E2.80.93Leibler_divergence

\[\mathrm{kl}(p, q) = \mathrm{KL}(\mathcal{B}(p), \mathcal{B}(q)) = p \log\left(\frac{p}{q}\right) + (1-p) \log\left(\frac{1-p}{1-q}\right).\]
Policies.Experimentals.UCBoost_faster_cython.min_solutions_pb_from_epsilon()

Minimum of the closed-form solutions of the following optimisation problems, for \(d = d_s^k\) approximation of \(d_{kl}\) and any \(\tau_1(p) \leq k \leq \tau_2(p)\):

\[\begin{split}P_1(d_s^k)(p, \delta): & \max_{q \in \Theta} q,\\ \text{such that } & d_s^k(p, q) \leq \delta.\end{split}\]
  • The solution is:

\[\begin{split}q^* &= q_k^{\boldsymbol{1}(\delta < d_{kl}(p, q_k))},\\ d_s^k &: (p, q) \mapsto d_{kl}(p, q_k) \boldsymbol{1}(q > q_k),\\ q_k &:= 1 - \left( 1 - \frac{\varepsilon}{1 + \varepsilon} \right)^k.\end{split}\]
  • \(\delta\) is the upperbound parameter on the semi-distance between input \(p\) and solution \(q^*\).

Policies.Experimentals.UCBoost_faster_cython.solution_pb_bq()

Closed-form solution of the following optimisation problem, for \(d = d_{bq}\) the biquadratic_distance() function:

\[\begin{split}P_1(d_{bq})(p, \delta): & \max_{q \in \Theta} q,\\ \text{such that } & d_{bq}(p, q) \leq \delta.\end{split}\]
  • The solution is:

\[q^* = \min(1, p + \sqrt{-\frac{9}{4} + \sqrt{\frac{81}{16} + \frac{9}{4} \delta}}).\]
  • \(\delta\) is the upperbound parameter on the semi-distance between input \(p\) and solution \(q^*\).

Policies.Experimentals.UCBoost_faster_cython.solution_pb_hellinger()

Closed-form solution of the following optimisation problem, for \(d = d_{h}\) the hellinger_distance() function:

\[\begin{split}P_1(d_h)(p, \delta): & \max_{q \in \Theta} q,\\ \text{such that } & d_h(p, q) \leq \delta.\end{split}\]
  • The solution is:

\[q^* = \left( (1 - \frac{\delta}{2}) \sqrt{p} + \sqrt{(1 - p) (\delta - \frac{\delta^2}{4})} \right)^{2 \times \boldsymbol{1}(\delta < 2 - 2 \sqrt{p})}.\]
  • \(\delta\) is the upperbound parameter on the semi-distance between input \(p\) and solution \(q^*\).

Policies.Experimentals.UCBoost_faster_cython.solution_pb_kllb()

Closed-form solution of the following optimisation problem, for \(d = d_{lb}\) the proposed lower-bound on the Kullback-Leibler binary distance (kullback_leibler_distance_lowerbound()) function:

\[\begin{split}P_1(d_{lb})(p, \delta): & \max_{q \in \Theta} q,\\ \text{such that } & d_{lb}(p, q) \leq \delta.\end{split}\]
  • The solution is:

\[q^* = 1 - (1 - p) \exp\left(\frac{p \log(p) - \delta}{1 - p}\right).\]
  • \(\delta\) is the upperbound parameter on the semi-distance between input \(p\) and solution \(q^*\).

Policies.Experimentals.UCBoost_faster_cython.solution_pb_sq()

Closed-form solution of the following optimisation problem, for \(d = d_{sq}\) the biquadratic_distance() function:

\[\begin{split}P_1(d_{sq})(p, \delta): & \max_{q \in \Theta} q,\\ \text{such that } & d_{sq}(p, q) \leq \delta.\end{split}\]
  • The solution is:

\[q^* = p + \sqrt{\frac{\delta}{2}}.\]
  • \(\delta\) is the upperbound parameter on the semi-distance between input \(p\) and solution \(q^*\).

Policies.Experimentals.UCBoost_faster_cython.solution_pb_t()

Closed-form solution of the following optimisation problem, for \(d = d_t\) a shifted tangent line function of kullback_leibler_distance_on_mean() (distance_t()) function:

\[\begin{split}P_1(d_t)(p, \delta): & \max_{q \in \Theta} q,\\ \text{such that } & d_t(p, q) \leq \delta.\end{split}\]
  • The solution is:

\[q^* = \min\left(1, \frac{p + 1}{2} \left( \delta - p \log\left(\frac{p}{p + 1}\right) - \log\left(\frac{2}{\mathrm{e} (p + 1)}\right) \right)\right).\]
  • \(\delta\) is the upperbound parameter on the semi-distance between input \(p\) and solution \(q^*\).

Policies.Experimentals.UCBoost_faster_cython.squadratic_distance()

The quadratic distance, \(d_{sq}(p, q) := 2 (p - q)^2\).