Data modelling
Normalization
normalized
reduce data(field) duplication
data storage is more efficient for write performance
simple queries run faster
data is scattered across multiple tables
denormalized
duplicate data across tables to improve read performance over write performance
data exists in 1 table, as opposed to multiple tables
Transactional Schema (OLTP)
normalized schema
used for OLTP - transactional workloads
Star Schema (OLAP)
easiest schema design for data warehouse/mart
data is organized into a central fact table that contains the measures of interest, surrounded by dimension tables that describe the attributes of the measures.
used for OLAP - analytical workloads
fact table: usually numeric values that can be aggregated
dimentional table: groups of hierarchies and descriptors that define the facts (attributes)
data is joined through FKs
data can be denormalized, to improve reporting reads
fact table examples: sales transactions, weather measurements
dimension table examples: client info, seller info, product info
Snowflake Schema (OLAP)
snowflake schema is a generalization of a star schema
similar to star schema, but dimensions are normalized into multiple related tables (linked through FKs)
some database developers compromise by creating an underlying snowflake schema with views built on top of it that perform many of the necessary joins to simulate a star schema.
Normalizing attributes results in storage savings, the tradeoff being additional complexity in source query joins.
Last updated
Was this helpful?