Exploiting Label Dependencies for Multi-Label Text Classification with Transformers
Abstract
We introduce a new approach to improving and adapting transformers for multi-label text classification. Dependencies between labels are an important factor in the multi-label context, and our proposed strategies take advantage of label co-occurrences. Our first approach updates the final activation of each label with a weighted sum of all label activations, weighted by their co-occurrence probabilities. The second method includes the activations of all labels in each prediction through a mechanism similar to self-attention. Because the best-known multi-label datasets tend to have a small label cardinality, we also propose a new dataset, called 'arXiv-ACM', composed of scientific abstracts from arXiv tagged with their ACM keywords. We show that our approaches yield a performance gain, establishing a new state of the art on the studied datasets.
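To make the two strategies concrete, the sketch below shows one plausible reading of each, under assumptions not stated in the abstract: the module names (`CooccurrenceReweighting`, `LabelSelfAttention`), the hidden dimension, and the construction of the co-occurrence matrix `cooc` (here assumed to be a row-stochastic matrix estimated from training-set label co-occurrences) are illustrative, not the authors' exact formulation.

```python
import torch
import torch.nn as nn

class CooccurrenceReweighting(nn.Module):
    """Strategy 1 (sketch): each label's final activation is replaced by
    a weighted sum of all label activations, weighted by co-occurrence
    probabilities. `cooc` is assumed to be (L, L) with
    cooc[i, j] ~ P(label j | label i), estimated from the training set."""

    def __init__(self, cooc: torch.Tensor):
        super().__init__()
        self.register_buffer("cooc", cooc)  # fixed (L, L) matrix

    def forward(self, logits: torch.Tensor) -> torch.Tensor:
        # logits: (batch, L) -> (batch, L), mixed across labels
        return logits @ self.cooc.T

class LabelSelfAttention(nn.Module):
    """Strategy 2 (sketch): treat the L label activations as a sequence
    and let every label attend to all others, in the spirit of
    self-attention, before producing the final logits."""

    def __init__(self, num_labels: int, dim: int = 64):
        super().__init__()
        self.embed = nn.Linear(1, dim)   # lift each scalar activation
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)
        self.out = nn.Linear(dim, 1)     # project back to one logit per label

    def forward(self, logits: torch.Tensor) -> torch.Tensor:
        x = self.embed(logits.unsqueeze(-1))  # (batch, L, dim)
        x, _ = self.attn(x, x, x)             # labels attend to labels
        return self.out(x).squeeze(-1)        # (batch, L)
```

In both cases the module would sit between the transformer's classification head and the sigmoid layer, so that the loss is computed on dependency-aware activations; how the real models integrate these components is detailed in the body of the paper.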