RNTI

Optimisation des performances dans les entrepôts de données NoSQL en colonnes (Performance Optimization in Column-Oriented NoSQL Data Warehouses)
In EGC 2017, vol. RNTI-E-33, pp.69-80
Abstract
The NoSQL column-oriented model offers a flexible and highly denormalized database schema. In this paper, we propose a method that transforms a relational data warehouse into a NoSQL one whose columns are distributed across a multi-node cluster. Our method is based on a strategy of grouping attributes from fact and dimension tables into column families. For this purpose, we use two algorithms: the first is a metaheuristic, Particle Swarm Optimization (PSO), and the second is the k-means algorithm. To evaluate our method, we use the TPC-DS benchmark. We conducted several tests to evaluate these algorithms in generating column families and data partitions in the column-oriented NoSQL DBMS HBase, using the MapReduce paradigm on the Hadoop distributed system.
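The idea of grouping attributes into column families with k-means can be illustrated by a minimal sketch. It is not the authors' implementation. It assumes each attribute is described by a binary vector marking which workload queries access it; this feature choice, the usage values, the `cf0`/`cf1` family names, and the deterministic farthest-first initialization are all hypothetical choices made for the example. Only the column names are taken from TPC-DS.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iters=100):
    """Plain k-means with farthest-first initialization (an assumption
    made here so the result is deterministic)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Pick the point farthest from all centroids chosen so far.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))

    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def group_into_families(usage, k):
    """Map each cluster label to a hypothetical column-family name."""
    names = list(usage)
    labels = kmeans([usage[n] for n in names], k)
    families = {}
    for name, label in zip(names, labels):
        families.setdefault(f"cf{label}", []).append(name)
    return families


# Illustrative usage matrix: one row per attribute, one column per query;
# 1 means the query reads the attribute. The values are made up.
usage = {
    "ss_quantity":    [1, 1, 1, 0, 0],
    "ss_sales_price": [1, 1, 0, 0, 0],
    "i_brand":        [0, 0, 0, 1, 1],
    "i_category":     [0, 0, 1, 1, 1],
}

families = group_into_families(usage, k=2)
print(families)
```

Attributes that tend to be read by the same queries end up in the same family, so a query touches fewer families. In this sketch the two sales measures form one group and the two item attributes the other.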