Khiops: an automatic supervised learning tool for mining large multi-table databases
Abstract
Khiops is an automatic supervised classification tool for mining large multi-table databases.
The predictive importance of each input variable is evaluated by means of discretization
models in the numerical case and value grouping models in the categorical case. For a
multi-table database, for example customers with their purchases, an analysis data table
of instances × variables is produced using automatic feature construction. The supervised
classification model is a naive Bayes classifier, with variable selection and model averaging.
The tool is designed for the analysis of large databases, with millions of instances, tens of
thousands of variables and hundreds of millions of records in the secondary tables.
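
To make the multi-table step concrete, the following minimal sketch flattens a customers table and a purchases table into a single instances × variables table. It is illustrative only and does not use the Khiops API: the table contents, the function name `build_analysis_table`, and the three hand-picked aggregates (count, total, mean) are assumptions, whereas the tool described here constructs such variables automatically.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical root table: one row per customer (the instances).
customers = [
    {"id": 1, "age": 34},
    {"id": 2, "age": 51},
    {"id": 3, "age": 27},
]

# Hypothetical secondary table: zero or more purchase records per customer.
purchases = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.0},
    {"customer_id": 2, "amount": 10.0},
]


def build_analysis_table(customers, purchases):
    """Return one row per customer, with aggregates of its purchase records."""
    # Group purchase amounts by the customer they belong to.
    amounts = defaultdict(list)
    for purchase in purchases:
        amounts[purchase["customer_id"]].append(purchase["amount"])

    table = []
    for customer in customers:
        values = amounts.get(customer["id"], [])
        table.append({
            "id": customer["id"],
            "age": customer["age"],
            # Constructed variables summarising the secondary table.
            "purchase_count": len(values),
            "purchase_total": sum(values),
            "purchase_mean": mean(values) if values else None,
        })
    return table


analysis_table = build_analysis_table(customers, purchases)
for row in analysis_table:
    print(row)
```

Each row of the resulting table can then be treated like any single-table instance: the constructed columns are ordinary numerical variables that a discretization step and a naive Bayes classifier could consume.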