Topological clustering for data streams
Abstract
Clustering data streams is becoming the most efficient way to cluster very large data sets. In this paper, we present a new approach, called G-Stream, for topological clustering of evolving data streams. The proposed method is an extension of the GNG (Growing Neural Gas) algorithm, specifically designed to handle data streams. G-Stream incrementally discovers clusters of arbitrary shape in a single pass over the data. The performance of the proposed algorithm is evaluated on both synthetic and real-world data sets.
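
To make the single-pass idea concrete, the following Python sketch shows a simplified GNG-style learner that is updated one point at a time and reports clusters as the connected components of its node graph. It is an illustration under stated assumptions, not the G-Stream algorithm itself: the class name `OnlineGNG`, its methods, and all parameter names and default values are hypothetical choices that do not come from the paper.

```python
import math
import random


class OnlineGNG:
    """Simplified GNG-style learner fed one point at a time (illustrative only)."""

    def __init__(self, init_a, init_b, eps_b=0.05, eps_n=0.006,
                 max_age=50, insert_every=100, alpha=0.5, decay=0.995):
        # All defaults are assumed values, not taken from the paper.
        self.nodes = {0: list(init_a), 1: list(init_b)}  # id -> prototype
        self.error = {0: 0.0, 1: 0.0}                    # id -> accumulated error
        self.edges = {}                                  # frozenset({i, j}) -> age
        self.next_id, self.seen = 2, 0
        self.eps_b, self.eps_n = eps_b, eps_n
        self.max_age, self.insert_every = max_age, insert_every
        self.alpha, self.decay = alpha, decay

    def _move(self, i, x, eps):
        # Shift prototype i towards x by a fraction eps.
        self.nodes[i] = [w + eps * (v - w) for w, v in zip(self.nodes[i], x)]

    def partial_fit(self, x):
        """Process a single point; each point is seen exactly once."""
        self.seen += 1
        ranked = sorted(self.nodes, key=lambda i: math.dist(self.nodes[i], x))
        b, s = ranked[0], ranked[1]  # nearest and second-nearest nodes
        self.error[b] += math.dist(self.nodes[b], x) ** 2
        self._move(b, x, self.eps_b)
        for e in list(self.edges):   # age the winner's edges, drag its neighbours
            if b in e:
                self.edges[e] += 1
                (n,) = e - {b}
                self._move(n, x, self.eps_n)
        self.edges[frozenset((b, s))] = 0  # create or refresh the edge b-s
        self.edges = {e: a for e, a in self.edges.items() if a <= self.max_age}
        linked = set().union(*self.edges)
        for i in [i for i in self.nodes if i not in linked]:
            del self.nodes[i], self.error[i]  # drop nodes left without edges
        if self.seen % self.insert_every == 0:
            self._insert()
        for i in self.error:
            self.error[i] *= self.decay

    def _insert(self):
        # Add a node halfway between the worst node and its worst neighbour.
        q = max(self.error, key=self.error.get)
        nbrs = [next(iter(e - {q})) for e in self.edges if q in e]
        if not nbrs:
            return
        f = max(nbrs, key=self.error.get)
        r, self.next_id = self.next_id, self.next_id + 1
        self.nodes[r] = [(a + c) / 2 for a, c in zip(self.nodes[q], self.nodes[f])]
        del self.edges[frozenset((q, f))]
        self.edges[frozenset((q, r))] = 0
        self.edges[frozenset((r, f))] = 0
        self.error[q] *= self.alpha
        self.error[f] *= self.alpha
        self.error[r] = self.error[q]

    def clusters(self):
        """Return the connected components of the node graph as sets of ids."""
        adj = {i: set() for i in self.nodes}
        for e in self.edges:
            i, j = tuple(e)
            adj[i].add(j)
            adj[j].add(i)
        visited, comps = set(), []
        for start in adj:
            if start in visited:
                continue
            stack, comp = [start], set()
            while stack:
                i = stack.pop()
                if i not in comp:
                    comp.add(i)
                    stack.extend(adj[i] - comp)
            visited |= comp
            comps.append(comp)
        return comps


if __name__ == "__main__":
    rng = random.Random(0)
    model = OnlineGNG([0.0, 0.0], [1.0, 1.0])
    for _ in range(2000):  # simulated stream: two Gaussian blobs
        c = rng.choice([0.0, 10.0])
        model.partial_fit([rng.gauss(c, 0.5), rng.gauss(c, 0.5)])
    print(len(model.nodes), "nodes in", len(model.clusters()), "clusters")
```

Because clusters are read off the graph rather than from distances to a fixed number of centroids, a chain of connected nodes can follow a non-convex region, which is the intuition behind clusters of arbitrary shape. A stream-oriented method would typically also need a mechanism for forgetting outdated data; this sketch only decays node errors and ages edges.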