Applying Coclustering to the Exploratory Analysis of a Data Table
Abstract
Cross-classification, or coclustering, is an unsupervised analysis technique that extracts
the underlying structure between the individuals and the variables of a data table as
homogeneous blocks. This technique is limited to variables of the same type, either numerical
or categorical; we extend it to mixed data with a two-step methodology. In the first
step, all variables are binarized according to a number of bins chosen by the analyst: by
equal-frequency discretization in the numerical case, or by keeping the most frequent values
in the categorical case. The second step applies a coclustering method between the individuals
and the binary variables, yielding groups of individuals and groups of variable parts.
We apply this methodology to several data sets and compare the results with those of a
multiple correspondence analysis (MCA) applied to the same data.
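
The first (binarization) step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the "name=label" naming of variable parts, the type test used to separate numerical from categorical columns, and the choice to keep the n_bins - 1 most frequent categorical values and pool the rest into a catch-all "*other*" part are all assumptions made for illustration. The abstract does not say how infrequent values are handled.

```python
from bisect import bisect_right
from collections import Counter


def bin_numeric(values, n_bins):
    """Equal-frequency discretization: return a bin index for each value.

    Cut points are taken at the empirical quantiles, so tied values
    always fall into the same bin.
    """
    ordered = sorted(values)
    n = len(ordered)
    cuts = sorted({ordered[k * n // n_bins] for k in range(1, n_bins)})
    return [bisect_right(cuts, v) for v in values]


def bin_categorical(values, n_bins):
    """Keep the most frequent values; pool the others (an assumption)."""
    kept = {v for v, _ in Counter(values).most_common(n_bins - 1)}
    return [v if v in kept else "*other*" for v in values]


def binarize(table, n_bins):
    """Turn a {column name: list of values} table into 0/1 indicator columns.

    Each indicator column is one "variable part" such as "age=0".
    """
    parts = {}
    for name, column in table.items():
        # Assumption: a column is numerical if all its values are numbers.
        if all(isinstance(v, (int, float)) for v in column):
            labels = bin_numeric(column, n_bins)
        else:
            labels = bin_categorical(column, n_bins)
        for label in sorted(set(labels), key=str):
            parts[f"{name}={label}"] = [int(l == label) for l in labels]
    return parts


# Hypothetical toy table: six individuals, one numerical and one
# categorical variable, two bins per variable.
table = {
    "age": [23, 35, 41, 52, 60, 68],
    "color": ["red", "red", "red", "blue", "blue", "green"],
}
parts = binarize(table, n_bins=2)
for part_name, indicator in parts.items():
    print(part_name, indicator)
```

Each individual ends up with exactly one active part per original variable. The resulting individuals-by-parts binary matrix is the input to the second step. That step is not sketched here because the abstract does not name the coclustering algorithm used.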