RNTI

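Quand les sous-groupes rencontrent les graduels : découverte de sous-groupes identifiant des corrélations exceptionnelles (When Subgroups Meet Gradual Patterns: Discovering Subgroups That Identify Exceptional Correlations)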
In EGC 2019, vol. RNTI-E-35, pp. 201-212
Abstract
Subgroup discovery (SD) is a mature field at the frontier of data mining and machine learning. It gathers methods designed to find coherent subgroups of a dataset where one or more targets interact in an unusual way. Correlation model classes have already been defined to discover interesting subgroups when dealing with two numerical targets. However, in this supervised setting, the two numerical targets are fixed before the subgroup search. To make unsupervised exploration possible, we propose to search for arbitrary subsets of numerical targets whose correlation is exceptional for an automatically found subgroup. We introduce the problem of rank-correlated subgroup discovery with an arbitrary subset of numerical targets. A rank-correlated subgroup is identified by both conditions on descriptive attributes, whether numeric or nominal, and a pattern on numeric attributes that captures (positive or negative) rank correlations. We define a new branch-and-bound algorithm that exploits some pruning properties. An empirical study on several datasets demonstrates the efficiency and the effectiveness of the algorithm.
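To make the notion of an exceptional rank correlation concrete, the following minimal sketch compares the rank correlation of two numeric targets inside a subgroup with their correlation over the whole dataset. It is an illustration only: the use of Kendall's tau, the toy data, and the names `kendall_tau` and `correlation_gap` are assumptions made for this example, not the quality measure or the branch-and-bound algorithm of the paper.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    pairs = list(combinations(range(len(xs)), 2))
    if not pairs:
        return 0.0
    score = 0
    for i, j in pairs:
        prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if prod > 0:
            score += 1  # concordant pair
        elif prod < 0:
            score -= 1  # discordant pair (tied pairs contribute 0)
    return score / len(pairs)


def correlation_gap(rows, condition, target_a, target_b):
    """Return (tau inside the subgroup, tau over the whole dataset).

    `condition` is a predicate on a row; it plays the role of the
    subgroup description over descriptive attributes (hypothetical helper).
    """
    subgroup = [r for r in rows if condition(r)]
    tau_sub = kendall_tau([r[target_a] for r in subgroup],
                          [r[target_b] for r in subgroup])
    tau_all = kendall_tau([r[target_a] for r in rows],
                          [r[target_b] for r in rows])
    return tau_sub, tau_all


# Hypothetical toy data: x and y rise together in group "a"
# and move in opposite directions in group "b".
rows = [
    {"group": "a", "x": 1, "y": 10},
    {"group": "a", "x": 2, "y": 20},
    {"group": "a", "x": 3, "y": 30},
    {"group": "a", "x": 4, "y": 40},
    {"group": "b", "x": 1, "y": 40},
    {"group": "b", "x": 2, "y": 30},
    {"group": "b", "x": 3, "y": 20},
    {"group": "b", "x": 4, "y": 10},
]

tau_sub, tau_all = correlation_gap(rows, lambda r: r["group"] == "a", "x", "y")
print(tau_sub, tau_all)
```

In this toy dataset the two targets are uncorrelated overall but perfectly rank-correlated within the subgroup `group == "a"`, which is the kind of contrast a rank-correlated subgroup is meant to surface. An actual search would also have to choose which targets to compare and which conditions to test, which this sketch leaves fixed.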