Discovering duplicate labels by exploring the lattice of binary classifiers
In EGC 2016, vol. RNTI-E-30, pp. 255-266
Analyzing behavioral data is a major challenge today. Everyone generates activity and mobility traces. When each trace is labeled with the user who generated it, models can be learned to accurately predict the user behind an unknown trace. In online systems, however, users may have several virtual identities, or duplicate labels. Ignoring these duplicates causes prediction accuracy to drop drastically. In this article, we tackle the problem of identifying duplicate labels and present an original approach that explores the lattice of binary classifiers. Each subset of labels is learned against the others, and constraints make it possible to identify duplicate labels while pruning the search space. We experiment with data from the video game STARCRAFT 2. The results are of good quality and encouraging.