Software
  • Introduction
  • Code craft
    • Clean code
    • Reusable code
    • Refactoring
    • Code smells
  • Service craft
  • Software Principles
  • Big Data
    • Introduction
    • Data modelling
  • Patterns
    • Enterprise patterns
    • Software patterns
  • Agile & proj management
    • Scrum vs Kanban
    • Kanban
    • Scrum
  • Conflict management
  • Reading
  • Software Architecture
    • Architecture components
Powered by GitBook
On this page
  • Normalization
  • Transactional Schema (OLTP)
  • Star Schema (OLAP)
  • Snowflake Schema (OLAP)

Was this helpful?

  1. Big Data

Data modelling

PreviousIntroductionNextPatterns

Last updated 4 months ago

Was this helpful?

Normalization

  • normalized

    • reduce data(field) duplication

    • data storage is more efficient for write performance

    • simple queries run faster

    • data is scattered across multiple tables

  • denormalized

    • duplicate data across tables to improve read performance over write performance

    • data exists in 1 table, as opposed to multiple tables

  • check

Transactional Schema (OLTP)

  • normalized schema

  • used for OLTP - transactional workloads

Star Schema (OLAP)

  • easiest schema design for data warehouse/mart

  • data is organized into a central fact table that contains the measures of interest, surrounded by dimension tables that describe the attributes of the measures.

  • used for OLAP - analytical workloads

  • fact table: usually numeric values that can be aggregated

  • dimentional table: groups of hierarchies and descriptors that define the facts (attributes)

  • data is joined through FKs

  • data can be denormalized, to improve reporting reads

  • fact table examples: sales transactions, weather measurements

  • dimension table examples: client info, seller info, product info

Snowflake Schema (OLAP)

  • snowflake schema is a generalization of a star schema

similar to star schema, but dimensions are into multiple related tables (linked through FKs)

some database developers compromise by creating an underlying snowflake schema with built on top of it that perform many of the necessary joins to simulate a star schema.

Some multidimensional database modeling tools are optimized for snowflake schemas.

attributes results in storage savings, the tradeoff being additional complexity in source query joins.

https://en.wikipedia.org/wiki/Database_normalization#Normal_forms
normalized
views
OLAP
[3]
Normalizing