RNTI

Clustering par apprentissage de distance guidé par des préférences sur les attributs (Clustering by distance learning guided by preferences on the attributes)
In EGC 2016, vol. RNTI-E-30, pp. 333-344
Abstract
In recent years, many semi-supervised clustering methods have integrated constraints between pairs of objects or on class labels, so that the final partition is consistent with the user's needs. However, in cases where the dimensions of study are clearly defined, it seems appropriate to express constraints directly on the attributes used to explore the data. Such a formulation would also help avoid the classic problems of the curse of dimensionality and of cluster interpretation. This article proposes to take the user's preferences on the attributes into account to guide the learning of the distance used for clustering. Specifically, we show how to parameterize the Euclidean distance with a diagonal matrix whose coefficients must be as close as possible to the weights set by the user. This approach builds a compromise clustering between a data-driven and a user-driven solution. We observe experimentally that adding preferences may be essential to achieve a better clustering.
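To make the parameterization concrete, the following minimal Python sketch shows a Euclidean distance weighted by a diagonal matrix, d_w(x, y) = sqrt(sum_j w_j (x_j - y_j)^2). The abstract does not state the learning objective, so the function names and the `blend_weights` helper are assumptions made for illustration only: the helper uses a simple convex combination to stand in for the compromise between data-driven and user-set weights, and is not the authors' method.

```python
import math


def weighted_euclidean(x, y, weights):
    """Euclidean distance parameterized by the diagonal matrix diag(weights)."""
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def blend_weights(data_weights, user_weights, trade_off):
    """Hypothetical compromise between learned and user-set weights.

    Assumption (not from the source): a convex combination, where
    trade_off = 0 keeps the data-driven weights and trade_off = 1
    keeps the user's weights.
    """
    if not 0.0 <= trade_off <= 1.0:
        raise ValueError("trade_off must lie in [0, 1]")
    if len(data_weights) != len(user_weights):
        raise ValueError("weight vectors must have the same length")
    return [
        (1.0 - trade_off) * d + trade_off * u
        for d, u in zip(data_weights, user_weights)
    ]


if __name__ == "__main__":
    x, y = [0.0, 0.0], [3.0, 4.0]
    # Equal weights recover the ordinary Euclidean distance.
    print(weighted_euclidean(x, y, [1.0, 1.0]))
    # A user who ignores the second attribute sets its weight to zero.
    print(weighted_euclidean(x, y, [1.0, 0.0]))
    # Halfway compromise between data-driven and user-set weights.
    print(blend_weights([1.0, 0.0], [0.0, 1.0], 0.5))
```

In this sketch, setting an attribute's weight to zero removes it from the distance entirely, which illustrates how preferences on attributes can steer which objects end up close together before any clustering algorithm is applied.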