On-Demand Pattern Discovery in a Distributed Database
Abstract
Only a few pattern mining methods are dedicated to distributed databases. Indeed,
centralizing the data is often less expensive than communicating all the mined patterns.
To circumvent this difficulty, this paper follows a parsimonious approach based on pattern
sampling. We propose DDSAMPLING, an algorithm that draws a pattern from a distributed
database with probability proportional to its interest. We demonstrate its accuracy and
analyze its complexity. Experiments on several datasets show its robustness to site and
network failures.