RNTI

MODULAD
Fouille de motifs diversifiés : une approche basée sur la relaxation et l'échantillonnage
In SDC 2024, vol. RNTI-A-9, pp.33-70
Abstract
In this paper, we use constraint programming to efficiently mine a diverse set of closed patterns. Diversity is controlled through a threshold on the Jaccard similarity measure. We show that the Jaccard measure has no monotonicity property, which prevents usual pruning techniques and makes classical pattern mining unworkable. This is why we propose antimonotonic lower and upper bound relaxations, which allow effective pruning, with an efficient branching rule, boosting the whole search process. Finally, we show how to integrate our relaxation to sample diverse set of patterns. We demonstrate experimentally that our approach CLOSEDDIVERSITY significantly reduces the number of patterns and is very efficient in terms of running times, particularly on dense datasets. For the pattern sampling task, we show that SDIVJAX-2 constitutes a better compromise between running time and redundancy reduction.