RNTI

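Multi-attribute partition key for optimal horizontal partitioning of column-oriented NoSQL data warehouses (original title: Clé de partition multi-attributs pour un partitionnement horizontal optimal des entrepôts de données NoSQL en colonnes)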
In EDA 2018, vol. RNTI-B-14, pp. 89-104
Abstract
Column-family NoSQL databases offer several storage techniques that are well suited to data warehouses, and there are several possible ways to build a column-oriented NoSQL data warehouse. In this paper, we propose a new method for building an efficient distributed data warehouse inside a column-family NoSQL DBMS. Our method, named Balanced-CN-DW, relies on association rules to obtain groups of attributes that are frequently used in the workload. The partition keys (RowKeys), which are needed to distribute data across the cluster nodes, are then composed of these attribute groups. To evaluate our method, we use the TPC-DS benchmark with the HBase NoSQL DBMS and run several experiments that execute TPC-DS decision-support queries. The results show that our data placement and distribution strategy improves the performance of our column-oriented NoSQL data warehouse model.
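To make the idea of a workload-derived composite key concrete, here is a minimal Python sketch. It is not the Balanced-CN-DW algorithm itself, whose details are not given in the abstract. It only counts how often groups of attributes appear in a query workload (the support measure that association rule mining builds on) and concatenates the values of the chosen group into a row key. The function names, the `|` separator, the 0.75 support threshold, the sample workload, and the trailing unique identifier are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_attribute_groups(workload, min_support, max_size=3):
    """Return {attribute group: support} for groups meeting min_support.

    workload is a list of sets, one set of attribute names per query.
    Support is the fraction of queries that use every attribute in the group.
    """
    counts = Counter()
    for attrs in workload:
        for size in range(1, max_size + 1):
            # Sort so that each group has a single canonical tuple form.
            for group in combinations(sorted(attrs), size):
                counts[group] += 1
    total = len(workload)
    return {g: c / total for g, c in counts.items() if c / total >= min_support}


def build_row_key(row, key_attributes, id_attribute, sep="|"):
    """Concatenate the key attribute values, then a unique id (assumed)."""
    parts = [str(row[a]) for a in key_attributes]
    parts.append(str(row[id_attribute]))  # keeps keys distinct per row
    return sep.join(parts)


# Hypothetical workload: the attributes referenced by four queries.
workload = [
    {"d_year", "s_state", "i_category"},
    {"d_year", "s_state"},
    {"d_year", "i_category"},
    {"d_year", "s_state", "c_birth_country"},
]

groups = frequent_attribute_groups(workload, min_support=0.75)

# Assumed selection rule: prefer the largest group, then the highest support.
best = max(groups, key=lambda g: (len(g), groups[g]))

row = {"d_year": 2001, "s_state": "TN", "i_category": "Music",
       "ss_ticket_number": 42}
key = build_row_key(row, best, id_attribute="ss_ticket_number")
print(best, key)
```

In this sketch, rows that share the same values for the frequently queried attributes receive keys with a common prefix. Because HBase stores rows sorted by row key, such rows end up adjacent on disk, which is the general property a multi-attribute partition key relies on.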