SGIA : Stratégie Intelligente de Groupement pour Améliorer le Traitement des Requêtes OLAP en MapReduce
In EDA 2019, vol. RNTI-B-15, pp. 93-108
Abstract
Enhancing OLAP query performance in distributed systems such as Hadoop and Spark is a challenging task. An OLAP query is composed of several operations, such as projection, filtering, join, and grouping. Each operation can be executed in the map or in the reduce phase, in one or several Spark stages. Some operations, such as star join and filtering, can be enhanced with static data partitioning and load balancing, since the load-balancing decision is known in advance. However, optimizing Group By and aggregate functions generally requires a dynamic partitioning and distribution technique to build a good partitioning scheme for the reducer inputs, since the relevant information is only available at query runtime. In this paper, we propose a smart method, called SGIA, to balance the reducer inputs on the fly. We use a multi-agent system that smartly balances the reducer loads for the Group By task. Our experiments reveal that our proposal outperforms existing approaches in terms of query execution time.
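To make the contrast between static and runtime partitioning concrete, the sketch below compares two ways of assigning Group By keys to reducers. It is not the SGIA method, whose multi-agent design is not detailed in this abstract. It swaps in a plain greedy heuristic (largest key first, onto the least-loaded reducer) purely as an illustration. The function names, the CRC32-based static scheme, and the assumption that per-key row counts are observed at runtime are all hypothetical.

```python
import heapq
import zlib


def static_hash_loads(key_counts, num_reducers):
    """Static scheme (illustrative): a key's reducer depends only on its hash,
    so a very frequent key can overload one reducer."""
    loads = [0] * num_reducers
    for key, count in key_counts.items():
        reducer = zlib.crc32(key.encode("utf-8")) % num_reducers
        loads[reducer] += count
    return loads


def greedy_balanced_assignment(key_counts, num_reducers):
    """Runtime scheme (illustrative, not SGIA): once per-key row counts are
    known, place keys in decreasing order of size on the least-loaded reducer.
    All rows of a key stay on one reducer, as Group By requires."""
    heap = [(0, r) for r in range(num_reducers)]  # (current load, reducer id)
    heapq.heapify(heap)
    assignment = {}
    for key, count in sorted(key_counts.items(), key=lambda kv: -kv[1]):
        load, reducer = heapq.heappop(heap)       # least-loaded reducer
        assignment[key] = reducer
        heapq.heappush(heap, (load + count, reducer))
    loads = [0] * num_reducers
    for load, reducer in heap:
        loads[reducer] = load
    return assignment, loads


if __name__ == "__main__":
    # Hypothetical row counts per grouping key, assumed to be observed at runtime.
    counts = {"FR": 900, "US": 300, "DE": 250, "IT": 200, "ES": 150, "BE": 100}
    print("static hash loads:", static_hash_loads(counts, 3))
    print("greedy loads:     ", greedy_balanced_assignment(counts, 3)[1])
```

With these hypothetical counts and three reducers, the greedy assignment yields loads of 900, 550, and 450 rows. The heaviest reducer is bounded below by the largest single key, which is why heavily skewed keys remain hard to balance without further techniques such as partial aggregation.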