| Star and Snowflake Schemas - Unveiling the Difference
A database schema is a description of the objects in a database, including tables, views, indexes, and synonyms. The most common schemas for a data warehouse are the star schema and the snowflake schema.
A star schema consists of a central data table, or fact table, that is linked to one or more dimension tables. It is called a star schema because the model resembles a star, with points radiating from the center. The center of the star consists of one or more fact tables, and the points of the star are the dimension tables, each of which holds descriptive information about a particular attribute in the fact table. Built for simplicity and speed, the star schema provides both detailed and summarized data to users. Because it reduces the number of physical joins, users can define hierarchies and navigate between tables easily. The star schema also helps performance, since each dimension table can be heavily indexed without a significant storage cost at the database level.
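To make the layout concrete, here is a minimal sketch of a star schema using Python's built-in sqlite3 module. The table names (fact_sales, dim_product, dim_date), the columns, and the sample rows are all hypothetical and chosen only for illustration. The point to notice is that the category lives directly in the product dimension, so a summary by category needs only one join from the fact table.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT      -- stored inline (denormalized) in a star schema
);
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY,
    year    INTEGER,
    month   INTEGER
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    amount     REAL
);
INSERT INTO dim_product VALUES
    (1, 'Laptop', 'Electronics'),
    (2, 'Phone',  'Electronics'),
    (3, 'Desk',   'Furniture');
INSERT INTO dim_date VALUES (1, 2023, 1), (2, 2023, 2);
INSERT INTO fact_sales VALUES
    (1, 1, 1000.0), (2, 1, 500.0), (3, 2, 200.0), (1, 2, 1000.0);
""")


def star_sales_by_category(connection):
    """Summarize the fact table by category with a single dimension join."""
    return connection.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()


star_totals = star_sales_by_category(conn)
print(star_totals)
```

The same pattern applies to the date dimension: joining fact_sales to dim_date on date_id would give totals by year or month, again with one join per dimension.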
For more complex scenarios, in which a single dimension table would become too wide or too repetitive to manage, a snowflake schema can be an effective model. A snowflake schema is a more complex data warehouse model because it splits dimension data across multiple related tables instead of keeping it in one large table. While this saves space in the database, it increases the number of dimension tables and foreign key joins, which leads to more complex queries and reduced query performance. Snowflake schemas are often used for large data warehouses, primarily for loading and replenishing data marts. They are not typically used for heavy end-user query workloads.
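The trade-off can be sketched in the same way. In this hypothetical example (again using Python's sqlite3 module, with invented table names and sample rows), the category is moved out of the product dimension into its own dim_category table. Each category name is now stored once, but the summary query needs two joins instead of one.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
snow_conn = sqlite3.connect(":memory:")
snow_conn.executescript("""
CREATE TABLE dim_category (
    category_id   INTEGER PRIMARY KEY,
    category_name TEXT
);
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category_id  INTEGER REFERENCES dim_category(category_id)  -- normalized out
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL
);
INSERT INTO dim_category VALUES (10, 'Electronics'), (20, 'Furniture');
INSERT INTO dim_product VALUES
    (1, 'Laptop', 10), (2, 'Phone', 10), (3, 'Desk', 20);
INSERT INTO fact_sales VALUES
    (1, 1000.0), (2, 500.0), (3, 200.0), (1, 1000.0);
""")


def snowflake_sales_by_category(connection):
    """Summarize by category; the extra dimension level costs an extra join."""
    return connection.execute("""
        SELECT c.category_name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product  AS p ON p.product_id  = f.product_id
        JOIN dim_category AS c ON c.category_id = p.category_id
        GROUP BY c.category_name
        ORDER BY c.category_name
    """).fetchall()


snowflake_totals = snowflake_sales_by_category(snow_conn)
print(snowflake_totals)
```

With only three products the space saving is negligible; the benefit grows only when a dimension has many rows that repeat the same descriptive values.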
Both schemas have their advantages and disadvantages. Before deciding which schema to implement, companies must assess their needs, the current and projected size of the data warehouse, and the data architecture. Only then can they determine which schema, star or snowflake, best fits their business needs.
|