MOLAP and ROLAP

I currently work in an organisation that has a debate on whether to use MOLAP or ROLAP. But first of all: what is discussed here?
ROLAP and MOLAP are two different techniques to store data that are meant for OLAP analysis. In ROLAP the data are stored in tables in a relational database and each data search in OLAP is translated into a SQL query that is fired upon the relational database. In MOLAP, the data are stored in a proprietary structure that is fully optimised to return data from OLAP investigations.
Both system have their advantages. The big advantage of ROLAP is that one takes full advantage of an existing DBMS. The data are stored in a DBMS. Any SQL query may then use the computing power of a DBMS without the need of installing yet another programme. It is also true that in a modern data warehouse environment with an existing data warehouse, one may directly use the tables that reside in a data warehouse. Hence a ROLAP doesn’t need to copy the data in a proprietary structure to allow retrieving OLAP outcomes.
The big advantage of MOLAP is that a separate proprietary data structure is created that geared towards fast returning OLAP outcomes from an OLAP question. To do so, the data structure needs to anticipate on every possible OLAP question. This implies that with an increasing number of different outcomes, the proprietary data structure increases in size.
This shows that one may have a case to advocate MOLAP if:
– the number of different outcomes from OLAP is not too big (hence the proprietary data structure is not too big in size)

In a ROLAP structure, each OLAP question is translated into a SQL query. If the same OLAP is raised multiple times, multiple SQL queries will be fired. As these SQL queries are identical, it might be that the DBMS undertakes the same actions multiple times (assuming results are not cached). In a MOLAP structure, the possible OLAP outcomes are pregenerated in a proprietary data structure. Hence, complex calculations are not only doable, they return quickly, once they are stored in the MOLAP structure, as the are pregenerated. An update can be done after a fresh data load in the data warehouse. Once the data structure is ready, each OLAP question can be answered from that data structure. If the same OLAP question is raised, the same data are retrieved from that proprietary datastructure.
This shows that one has a case to advocate MOLAP if:
– the queries in the database are complicated; examples of such complex queries are

queries with a complex CASE statement,
CASE in the group by,
queries that generate running totals,
queries that process large chunks of data
queries with complex joins
etc.

– the same OLAP question is repeated multiple times
– the data are refreshed at large time intervals

One uses an OLAP structure to undertake an analysis that summarises data and, if required, allocates data into lower levels of detail. I realise that an OLAP structure is also used to create a framework to retrieve a series of reports. Within these reports, no summarisations or allocations are done. Hence, one may say that only a part of the OLAP functionality is used. That also means that the proprietary data structure that is created in MOLAP is only partly used if no summarisations are undertaken. In that case a MOLAP might not be the best solution.
This shows that one has a case to advocate MOLAP if:
– one undertakes many OLAP movements, like summarisation, consolidation and detailing.

The case for ROLAP can be made if we have:

ad hoc reports or infrequently used reports,
near real time reports,
high cardinal dimension tables
a request for non-summary reports
reports that are only require a simple SQL that returns results quickly

Breaking

Door tom

Gerelateerd bericht

Je miste

Flask and JSON

A webserver from the command line

Use the node.js server as restful app server

Reading a CSV file and translate into dataframe