Skip to content

SCD (Slowly Changing Dimensions)

Slowly Changing Dimensions (SCD) are methods used in data warehouses to manage changes in data that happen slowly over time. For example, dimension tables store descriptive details like customer names, product categories, or employee roles, which can gradually change over time. SCD ensures these changes are managed effectively, preserving historical information and enabling accurate trend analysis.

There are different types of SCDs, each suited for specific scenarios:

SCD types

SCD Type 0: Retain Original Data

In Type 0, no changes are made to the original data, even when updates occur.

Example

Customer IDNameAddress
1AliceNew York
  • Even if Alice moves to Boston, the table still shows her address as New York.
  • Use Case: Historical consistency is prioritized, and updates are ignored.

SCD Type 1: Overwrite Data

In Type 1, old data is replaced with new data, and no history is preserved.

Before Update:

Customer IDNameAddress
1AliceNew York

After Update (Alice moves to Boston):

Customer IDNameAddress
1AliceNew York
  • Use Case: Only current data matters, such as updating product prices or customer details.

SCD Type 2: Add New Row

In Type 2, a new row is added for every change, and the original data is retained with a history of changes.

Before Update:

Customer IDNameAddressStart DateEnd Date
1AliceNew York2023-01-01NULL

After Update (Alice moves to Boston):

Customer IDNameAddressStart DateEnd Date
1AliceNew York2023-01-012023-06-30
1AliceBoston2023-07-01NULL
  • Use Case: Track a complete history of changes, such as customer address history or employee roles.

SCD Type 3: Add New Column

In Type 3, a new column is added to store the previous value while keeping the current value.

Example Table

Customer IDNameCurrent AddressPrevious Address
1AliceBostonNew York

After Update (Alice moves to London from Boston):

Customer IDNameCurrent AddressPrevious Address
1AliceLondonBoston
  • If Alice moves again, the current address updates, and the previous address becomes Boston.
  • Use Case: Track the most recent change without maintaining a full history.

SCD Type 4: Separate History Table

In Type 4, the main table stores only the current data, while a separate table records all historical changes.

Main Table (Current Data):

Customer IDNameCurrent Address
1AliceBoston

History Table:

Customer IDAddressStart DateEnd Date
1New York2023-01-012023-06-30
1Boston2023-07-01NULL
  • Use Case: Useful when the main table should remain lightweight, and detailed history is kept in a separate table.

SCD Type 6: Hybrid Approach

Type 6 combines elements of Types 1, 2, and 3. It stores the current value, a previous value column, and maintains full historical rows.

Example Table:

CustomerIDNameCurrent AddressPrevious AddressStartDateEnd Date
1AliceBostonNew York2023-07-01NULL

After Update (Alice moves to London from Boston ):

Customer IDNameCurrent AddressPrevious AddressStart DateEnd Date
1AliceLondonBoston2024-01-01NULL
1AliceBostonNew York2023-07-012023-12-31
1AliceNew YorkNULL2023-01-012023-06-30
  • Use Case: Ideal for scenarios requiring a mix of current state, recent changes, and full historical tracking.

In the next section, we will look at stage tables, which are used to temporarily store, clean, and transform raw data before loading it into the DWH (fact or dimension tables). Stage tables act as a temporary area to handle raw or intermediate data before it moves into the main DWH.