Subgroup Discovery with Monte Carlo Tree Search
Abstract
Discovering descriptions that strongly distinguish one class label from another remains a
challenging task. Such patterns enable the building of intelligible classifiers and suggest hypotheses
that may explain the presence of a label. Subgroup Discovery (SD), a framework that formally
defines this pattern mining task, still faces two major issues: (i) defining appropriate quality
measures that characterize the singularity of a pattern; (ii) choosing an effective heuristic for
exploring the search space when a complete enumeration is infeasible. To date, the most efficient SD
algorithms are based on beam search. However, the resulting pattern collection lacks diversity
because of the greedy nature of beam search. We propose to use a recent exploration technique, Monte Carlo
Tree Search (MCTS). To the best of our knowledge, this is the first attempt to apply MCTS to
pattern mining. The exploitation/exploration trade-off and the power of random search lead
to anytime mining (a solution is available at any time and improves over time) that generally
outperforms beam search. Our empirical study on various benchmark and real-world datasets shows the
strength of our approach with several quality measures.
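
As an illustration of the exploitation/exploration trade-off mentioned above, the following minimal sketch shows the UCB1 selection rule commonly used in MCTS. It is an assumption that this particular rule applies here, since the abstract does not name the selection policy; the function names `ucb1` and `select_child`, the tuple layout, and the constant `c` are hypothetical and chosen only for this example.

```python
import math

def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score of a child node (assumed policy, not taken from the abstract).

    Unvisited children get +inf so that each one is tried at least once.
    """
    if visits == 0:
        return math.inf
    exploitation = total_reward / visits                            # mean reward so far
    exploration = c * math.sqrt(math.log(parent_visits) / visits)   # bonus for rarely visited nodes
    return exploitation + exploration

def select_child(children, parent_visits, c=math.sqrt(2)):
    """Return the name of the child with the highest UCB1 score.

    children: list of (name, total_reward, visits) tuples (hypothetical layout).
    """
    return max(children, key=lambda ch: ucb1(ch[1], ch[2], parent_visits, c))[0]

# Hypothetical statistics: "a" has a high mean reward, "b" has been explored less.
children = [("a", 9.0, 10), ("b", 1.0, 2)]
greedy_choice = select_child(children, parent_visits=12, c=0.0)  # pure exploitation
balanced_choice = select_child(children, parent_visits=12)       # exploitation + exploration
```

With `c=0.0` the rule is purely greedy and always follows the best mean reward, which is loosely analogous to the behaviour the abstract attributes to beam search; a positive `c` lets less-visited candidates be selected as well.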