Preferential Reweighting for Biquality Learning
Abstract
This paper proposes an original, unified view of Weakly Supervised Learning, leading
to the design of generic approaches able to handle any kind of labeling noise. A new use case
called “Biquality Data” is introduced. It assumes that a small reliable dataset of correctly labeled
examples is available, in addition to an unreliable dataset comprising noisy examples. In
this framework, we propose a new reweighting scheme capable of identifying uncorrupted examples
in the unreliable dataset. This algorithm allows classifiers to be learned on both datasets.
Multiple experiments reproducing several types of labeling noise empirically demonstrate that
the proposed algorithm outperforms state-of-the-art competitors.