A New Strategy for the Distributed Processing of Massive Decision-Support Workloads in a Big Data Warehouse
Abstract
This article addresses the problem of optimizing the execution of massive analytical
workloads on distributed data warehouses (DW), where simultaneous queries number in the
thousands. Taking as a starting point the optimization techniques used in the
non-distributed context, we propose a new strategy for selecting and storing materialized
views (MV) on a distributed file system; we then process the decision-support query
workload using these MVs. Our approach acts as a mediator between the users and the
data warehouse, proposing a better execution plan for their queries. Our first results
suggest that, in a distributed environment, our approach reduces the execution
cost of a query by more than 50% compared with the default system.