Tuesday, June 11, 2024

Normalization in DBMS

Problems: Redundancy


Different kinds of Normal Forms:

1NF, 2NF,3NF etc


Data Modelling

Star Schema

Star schemas denormalize the data, which means adding redundant columns to some dimension tables to make querying and working with the data faster and easier. The purpose is to trade some redundancy (duplication of data) in the data model for increased query speed, by avoiding computationally expensive join operations.

In this model, the fact table is normalized but the dimensions tables are not. That is, data from the fact table exists only on the fact table, but dimensional tables may hold redundant data.


Resources: https://www.databricks.com/glossary/star-schema


snowflake schema:

snowflake schema is a multi-dimensional data model that is an extension of a star schema, where dimension tables are broken down into subdimensions. Snowflake schemas are commonly used for business intelligence and reporting in OLAP data warehouses, data marts, and relational databases.

In a snowflake schema, engineers break down individual dimension tables into logical subdimensions. This makes the data model more complex, but it can be easier for analysts to work with, especially for certain data types.

It's called a snowflake schema because its entity-relationship diagram (ERD) looks like a snowflake, as seen below.

A snowflake schema diagram with a central fact table that connects to multiple dimensional tables and subdimensional tables via foreign keys.

No comments:

Post a Comment