Lists Home |
Date Index |
> Off topic, but since data warehousing comes up from time to time:
> what is the advantage of using an OLAP design vs a relational design?
> Is this advantage better or worse than a triple design?
OLAP is not really an alternative to relational; OLAP augments
relational. Think of it like this:
1988: People who want reports from their "operational RDBMS" or
"reporting RDBMS" are using wonderful "ad-hoc report generators" which
dynamically create SQL to query the tables. Smart people make copies of
the "operational data" into a "reporting RDBMS" and create some tables
that are pre-aggregated by common groupings like fiscal period, region,
etc. and report from these instead.
1990: Some really smart guys get the idea that these "aggregate
reporting tables" (still relational) could be managed and qeried
transparently. The ad-hoc SQL generator can pick the right tables based
on level of aggregation. Michael Saylor starts selling "DSS Agent" and
"OLAP" becomes mainstream. At this time, OLAP just means "a ton of
pre-aggregated relational tables and a query front-end that
intelligently selects among them".
1997: Some companies start making distinction between MOLAP and ROLAP
(multidimensional vs. relational OLAP). The MOLAP companies sell
solutions in which the aggregates are not stored relationally, but are
instead stored in a huge binary "cube" optimized for retrieval speed.
Even MOLAP is a bit of a misnomer, since the source data which is used
to build the cube is still relational.
2000: The OLAP in the world is roughly evenly divided between relational
2001: Microsoft proves that cubes in the Terabyte range are possible,
and are smaller and faster than the source relational data.
2004: The idea of a "cube" is not much different than an index, and the
leading database vendors are now integrating "cube" functionality deeply
in the relational engine. Normal SQL queries against the source
relational tables will automatically consult the "cube" when necessary;
the cube is maintained transparently and automatically, and so on.
(On the other hand, a triples table is very much like a
highly-denormalized fact-table. And such fact-tables are often used to
load up an OLAP cube. The raw triples are not very useful, and unless
you want an incredibly slow system, you will need to create lots of
indices to pre-join things.)