Considerations for Using Aggregate Tables and OLAP

Many data warehousing architects and DBAs face the dual challenge of maximizing flexibility within their data warehousing environment while also ensuring maximum performance. This challenge can be addressed by developing a strategy for managing aggregates, that is, pre-calculated data summarizations. The resulting aggregate strategy can play a critical role in a data warehousing initiative.

Aggregation refers to gathering information from two or more sources and combining it into summarized sets. Often, this data is stored in a data warehouse in summarized form; for example, you may wish to summarize the data by various time periods. Aggregates are used for two main reasons. The first is to save storage space: data warehouses can grow very large, and storing summaries in place of detail greatly reduces the space needed. The second is to improve the performance of business intelligence tools: when queries run faster, they consume less processing time and users get their information back more quickly.
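As a rough illustration of summarizing by time period, the sketch below builds a monthly aggregate table from detail rows using Python's built-in sqlite3 module. The table names (sales_detail, sales_monthly), the column names and the sample rows are all hypothetical and chosen only for this example; a real warehouse would use its own schema and database engine.

```python
import sqlite3


def build_monthly_aggregate(conn):
    """Rebuild a hypothetical monthly summary table from the detail rows."""
    conn.execute("DROP TABLE IF EXISTS sales_monthly")
    conn.execute(
        """
        CREATE TABLE sales_monthly AS
        SELECT substr(sale_date, 1, 7) AS month,   -- 'YYYY-MM' prefix of the date
               product,
               SUM(amount) AS total_amount,
               COUNT(*)    AS sale_count
        FROM sales_detail
        GROUP BY substr(sale_date, 1, 7), product
        """
    )


# Hypothetical detail data: one row per individual sale.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_detail (sale_date TEXT, product TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales_detail VALUES (?, ?, ?)",
    [
        ("2024-01-05", "widget", 10.0),
        ("2024-01-20", "widget", 15.0),
        ("2024-02-03", "widget", 7.5),
        ("2024-02-11", "gadget", 20.0),
    ],
)

build_monthly_aggregate(conn)

# Reports can now read the small summary table instead of scanning the detail.
monthly_rows = conn.execute(
    "SELECT month, product, total_amount, sale_count "
    "FROM sales_monthly ORDER BY month, product"
).fetchall()
```

Here four detail rows collapse into three summary rows. On realistic volumes the reduction is far larger, which is where both the space and the query-time savings come from.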

Aggregate strategies typically rely on two technologies: OLAP and relational aggregate tables. The most prevalent strategy to date has been to use aggregate tables, because they provide a practical, cost-effective way to improve query performance. However, many IT professionals are now recognizing the performance and efficiency potential of OLAP technology.

OLAP technology was once thought of as being incompatible with front-end query tools. However, compatibility issues are becoming less frequent, particularly because OLAP data is becoming increasingly compatible with relational data. OLAP has many advantages. In most instances, OLAP queries run faster than relational queries. OLAP also offers querying flexibility: any dimension, level or member can be placed on either the rows or the columns of a result set. The speed and flexibility of OLAP make it an attractive choice.
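The idea of placing any dimension on either rows or columns can be sketched in a few lines of plain Python. This is not an OLAP engine, only a toy pivot over in-memory records; the function name pivot, the dimension names (region, quarter) and the sample facts are assumptions made for illustration.

```python
from collections import defaultdict


def pivot(facts, row_dim, col_dim, measure):
    """Sum `measure` into a nested dict keyed by row member, then column member."""
    table = defaultdict(lambda: defaultdict(float))
    for fact in facts:
        table[fact[row_dim]][fact[col_dim]] += fact[measure]
    # Convert to plain dicts so the result is easy to print and compare.
    return {row: dict(cols) for row, cols in table.items()}


# Hypothetical fact records with two dimensions and one measure.
facts = [
    {"region": "East", "quarter": "Q1", "sales": 100.0},
    {"region": "East", "quarter": "Q2", "sales": 120.0},
    {"region": "West", "quarter": "Q1", "sales": 80.0},
    {"region": "West", "quarter": "Q1", "sales": 20.0},
]

# The same data, with the dimensions swapped between rows and columns.
by_region = pivot(facts, "region", "quarter", "sales")
by_quarter = pivot(facts, "quarter", "region", "sales")
```

An OLAP server would typically answer such requests from pre-computed cube structures, not by scanning records as this sketch does, but the choice of which dimension lands on which axis is made per query in the same way.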

Whichever technology you choose for your aggregate strategy, it's important to keep an open mind about both aggregate tables and OLAP. You may find that your data warehousing environment performs more efficiently with both technologies in place. For example, you may be able to use aggregate tables for reporting and OLAP for analysis.

Relational and OLAP technologies are also quite diverse, so there is no definite rule for when to use OLAP and when to use aggregate tables. It's important to develop a clear understanding of the aggregates you query most often and of the latency requirements for those aggregates. With this understanding, you will be better able to determine which combination of technologies best suits your needs.

One last word on aggregation strategies: keep in mind how granular the data needs to be to meet your business analysis objectives. Aggregation can result in the loss of time-sensitive, transaction-level detail. If, for instance, you're trying to compile a complete customer profile in order to better understand your customers, aggregation alone may not be the answer. In this case, you would likely want to store both detailed and aggregated information. This approach may take up more storage space in the data warehouse, but it gives your users maximum flexibility to look at all details related to customer relationships while still achieving the best performance when looking at summarized data.