Continual multi-label and multi-task learning for tabular data: a proposed task-creation protocol and an evaluation of classifiers
Abstract
Recent progress has been made in the field of multi-label stream classification, where an instance
can be associated with several labels simultaneously. Most recent research has focused
on adapting models to the changing distribution of non-stationary data streams. However, continual
learning is not limited to adapting to concept drift: phenomena such as catastrophic
forgetting and forward and backward transfer arise when new classification tasks appear in the
data stream. The aim of this article is to develop a standardized evaluation protocol specifically
adapted to the study of these phenomena, in order to identify the most promising strategies for
this new problem of multi-label, multi-task learning on streaming tabular data. The protocol
comprises (i) a procedure for creating multi-label, multi-task streams and (ii) an evaluation procedure
that measures (a) online performance, (b) phenomena related to continual learning and (c) resource
consumption. We use it to compare 12 continual multi-label classification
strategies on 4 open datasets from the literature and 3 simulated datasets. This exploratory analysis
suggests that frugal neural networks coupled with data replay are a promising approach.