Time series clustering by dictionary construction
Abstract
Clustering is a family of data analysis methods that aims to find and separate
groups (clusters) of similar observations in a dataset. Such data are often
observed at successive time steps, forming time series, such as the evolution of a stock market value
or of meteorological phenomena. In some time series, the sequences of observations exhibit
distinct and interpretable phases, which we call "regimes". For instance, the speed of a car can show
acceleration, cruising and braking phases. In this article, we propose a method dedicated
to the clustering of this particular kind of time series. It combines three
steps: an individual segmentation of each time series, the construction of a common dictionary
of regimes, and the final clustering of the categorical sequences obtained by recoding the
time series in this dictionary. We present the advantages of this method and the results
obtained on several public datasets.
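
The three steps can be illustrated with a minimal sketch on toy car-speed data. The abstract does not specify the algorithms used at each step, so every choice here is an assumption made for illustration only: segmentation by the sign of the slope, a dictionary built by 1-D k-means on segment slopes, and grouping of the recoded sequences by edit distance with single linkage. All function names and parameters (tol, k, max_dist) are hypothetical.

```python
from itertools import groupby

def segment(series, tol=0.5):
    """Step 1 (assumed): split a series into runs of similar slope sign
    and describe each run by its mean slope."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    # Coarse slope sign: -1 (decreasing), 0 (flat), +1 (increasing).
    signs = [0 if abs(d) <= tol else (1 if d > 0 else -1) for d in diffs]
    segments, start = [], 0
    for _, run in groupby(signs):
        n = len(list(run))
        segments.append(sum(diffs[start:start + n]) / n)
        start += n
    return segments

def build_dictionary(slopes, k=3, iters=20):
    """Step 2 (assumed): 1-D k-means over the slopes of all segments of all
    series; the k centroids play the role of the common regimes."""
    s = sorted(slopes)
    # Deterministic initialisation at evenly spaced quantiles.
    centroids = [s[(2 * i + 1) * len(s) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in slopes:
            nearest = min(range(k), key=lambda j: abs(x - centroids[j]))
            groups[nearest].append(x)
        centroids = [sum(g) / len(g) if g else c
                     for g, c in zip(groups, centroids)]
    return sorted(centroids)

def recode(segments, dictionary):
    """Replace each segment by the index of its nearest regime."""
    return [min(range(len(dictionary)), key=lambda j: abs(x - dictionary[j]))
            for x in segments]

def edit_distance(a, b):
    """Levenshtein distance between two categorical sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def cluster_sequences(seqs, max_dist=1):
    """Step 3 (assumed): single-linkage grouping; sequences within
    max_dist edits of each other end up in the same cluster."""
    labels = list(range(len(seqs)))
    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if edit_distance(seqs[i], seqs[j]) <= max_dist:
                old, new = labels[j], labels[i]
                labels = [new if lab == old else lab for lab in labels]
    return labels

# Toy speed profiles: two "accelerate, cruise, brake" and two "cruise, brake, stop".
data = [
    [0, 2, 4, 6, 6, 6, 6, 4, 2, 0],
    [0, 3, 6, 9, 9, 9, 6, 3, 0],
    [8, 8, 8, 6, 4, 2, 2, 2, 2],
    [9, 9, 6, 3, 0, 0, 0],
]
segmented = [segment(s) for s in data]
dictionary = build_dictionary([x for seg in segmented for x in seg], k=3)
sequences = [recode(seg, dictionary) for seg in segmented]
labels = cluster_sequences(sequences, max_dist=1)
print(dictionary, sequences, labels)
```

Because the final clustering operates on short symbolic sequences rather than raw values, series of different lengths and amplitudes can be compared directly, which is what the toy data is meant to show.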