Using topic modeling techniques for novelty detection in textual data streams.
Abstract
With the advent of social networks and the proliferation of messages posted about companies,
a better understanding of customer feedback has become a key issue. Clustering and topic
modeling techniques already make it possible to observe the main trends in these data. It
is also useful, from an anticipatory perspective, to spot emerging themes and to identify
them before they grow in size. To address this problem, we studied the use of LDA (Latent Dirichlet Allocation) models to
detect documents related to these emerging themes. We tested three systems on several scenarios
of novelty arrival in the data stream. We show that topic models can detect this
novelty, but that detection quality depends on the scenario considered.