Post-processing for constrained probabilistic clustering
Abstract
Constrained clustering has received a lot of attention over the last decade. It aims to integrate
expert knowledge beyond the classic must-link and cannot-link constraints. Most probabilistic systems handle such knowledge by adding to the optimization criterion a second term penalizing
the non-satisfaction of constraints. They adapt poorly to different kinds
of constraints and do not guarantee that all constraints are satisfied, even when these
constraints are hard, that is, must be satisfied. We propose a post-processing method that, given
a matrix assigning to each point its probability of belonging to each cluster, finds the best assignment satisfying all the constraints. This method can be applied to any probabilistic algorithm,
including deep clustering ones. Experiments show that, when evaluated against a ground truth, our
method is competitive in clustering quality with the most recent approaches while
remaining efficient.
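
To make the post-processing idea concrete, here is a minimal sketch in Python. It is not the formulation used in this work: it assumes that the "best" assignment is the one maximizing the sum of log-probabilities of the chosen clusters, it handles only pairwise must-link and cannot-link constraints, and it enumerates all assignments exhaustively, which is only feasible for a handful of points. The function name `best_constrained_assignment` and its parameters are hypothetical.

```python
from itertools import product
from math import inf, log


def best_constrained_assignment(P, must_link=(), cannot_link=()):
    """Return the labeling that maximizes sum_i log P[i][label_i]
    while satisfying every constraint, or None if no labeling does.

    P           -- list of rows; P[i][c] is the probability that point i is in cluster c
    must_link   -- pairs (i, j) that must share a cluster
    cannot_link -- pairs (i, j) that must be in different clusters

    Assumption: the log-likelihood objective and the brute-force search
    are illustrative choices, not the method described in the abstract.
    """
    n, k = len(P), len(P[0])
    best, best_score = None, -inf
    for labels in product(range(k), repeat=n):
        # Hard constraints: discard any labeling that violates one.
        if any(labels[i] != labels[j] for i, j in must_link):
            continue
        if any(labels[i] == labels[j] for i, j in cannot_link):
            continue
        # Zero-probability choices get a score of -inf and are never selected.
        score = sum(log(P[i][c]) if P[i][c] > 0 else -inf
                    for i, c in enumerate(labels))
        if score > best_score:
            best, best_score = labels, score
    return best


# Three points, two clusters; rows could come from any probabilistic clusterer.
P = [[0.9, 0.1],
     [0.6, 0.4],
     [0.2, 0.8]]
unconstrained = best_constrained_assignment(P)
constrained = best_constrained_assignment(P, must_link=[(1, 2)], cannot_link=[(0, 1)])
```

Without constraints the sketch reduces to a row-wise argmax. With the two constraints above, point 1 is moved to the cluster of point 2 because that is the feasible labeling with the highest score. A practical implementation would replace the enumeration with a combinatorial optimization solver.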