Subgroup Discovery from Sequential Data by Sampling and Local Optimization
Abstract
Discovering rules that characterize classes remains difficult, especially in
sequential data analysis. This task has been formalized as subgroup discovery,
and numerous algorithms have been proposed over the last 20 years. An exhaustive
enumeration strategy is generally intractable, so heuristic approaches are needed. The
reference framework relies on a beam search strategy and its run-time parameters. We
propose a method that samples patterns from the search space to support subgroup
discovery in labeled sequences of itemsets. Our approach enables the discovery of local optima
with respect to a quality measure, while remaining generic in the choice of that
measure. It requires no parameter setting and is simple to implement. Our empirical
validation, which includes a comparison with state-of-the-art algorithms, shows interesting
results.