Pattern Sampling with a Frequency Constraint
Abstract
Pattern sampling is a recent pattern discovery technique that supports interaction with the user. Its principle is to draw a pattern at random with probability proportional to its interestingness. Unfortunately, the draws can concentrate on a part of the search space whose patterns are infrequent but extremely numerous. One could sample patterns and discard the infrequent ones, but the rejection rate is often too high. In this paper, we propose the first pattern sampling method with a minimum frequency constraint. It is based on (i) the deletion of infrequent items and (ii) the projection of the database on each item. We propose a generic method that removes all occurrences containing an infrequent pair of items. Our experiments show that our method significantly reduces the rejection rate.
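
The rejection baseline and step (i) can be illustrated with a minimal sketch. It is not the method of the paper: it assumes that interestingness is the support of the pattern, and it relies on the classical two-step draw (pick a transaction with weight 2^|t|, then a uniform subset of it). All function names are hypothetical, and neither the projection of step (ii) nor the pruning of infrequent pairs is shown.

```python
import random


def support(pattern, db):
    """Number of transactions of db that contain every item of pattern."""
    return sum(1 for t in db if pattern <= t)


def draw_pattern(db, rng):
    """Draw a pattern with probability proportional to its support.

    Assumed two-step scheme: a transaction t is picked with weight 2^|t|,
    then each item of t is kept with probability 1/2 (uniform subset).
    The empty pattern can be drawn and is counted like any other pattern.
    """
    weights = [2 ** len(t) for t in db]
    t = rng.choices(db, weights=weights, k=1)[0]
    return frozenset(i for i in sorted(t) if rng.random() < 0.5)


def remove_infrequent_items(db, min_support):
    """Step (i): delete every item whose support is below min_support."""
    counts = {}
    for t in db:
        for i in t:
            counts[i] = counts.get(i, 0) + 1
    keep = {i for i, c in counts.items() if c >= min_support}
    return [t & keep for t in db]


def rejection_rate(db, min_support, draws, rng):
    """Fraction of drawn patterns that violate the frequency constraint."""
    rejected = sum(
        1 for _ in range(draws)
        if support(draw_pattern(db, rng), db) < min_support
    )
    return rejected / draws


if __name__ == "__main__":
    # Toy database: items a and b are frequent, every other item occurs once.
    db = [frozenset("abc"), frozenset("abd"), frozenset("abe"), frozenset("abfghij")]
    rng = random.Random(0)
    print("raw database:     ", rejection_rate(db, 2, 1000, rng))
    filtered = remove_infrequent_items(db, 2)
    print("after item filter:", rejection_rate(filtered, 2, 1000, rng))
```

On this toy database, the long transaction attracts most of the draws even though almost all of its subsets are infrequent. Deleting the infrequent items removes that part of the search space. In general, item deletion alone does not eliminate rejections, since a pattern made only of frequent items can still be infrequent.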