RNTI

Approche contextuelle par régression pour les tests A/B (A contextual regression-based approach to A/B testing)
In EGC 2018, vol. RNTI-E-34, pp. 269-274
Abstract
In this work, we devise a principled approach that combines the contextual bandit framework with the learning of a stratification procedure. Over a finite time horizon, the proposed algorithm balances contextual exploration and exploitation more efficiently than state-of-the-art bandit algorithms, at the cost of a controlled probability of incurring linear regret. Finally, the learned structure is easily interpretable by a human.