Type 2 SCDs
Type 2 Slowly Changing Dimension (SCD) is a data warehousing method used to track changes in data over time by preserving the change history. You can view the historical data for each record and their corresponding time range when it was valid.
If you're already familiar with SCDs, you may skip this section and move ahead to the next section.
1. Understanding Type 2 SCDs
Whenever an entity's data in any dimension changes, a new row is created instead of overwriting the old ones.
Each new row includes the updated information along with the start date and end date that indicate the validity period of the updated information.
There is an additional flag column that indicates the record that is currently active. This column can be of the checkbox type or a simple true/false or 0/1 column.
There can be one or more timestamp columns in addition to the start date to signify when a new record was created or made active.
This approach allows us to analyze the historical data and understand how the information has evolved over time.
2. Example for Type 2 SCDs
Let us consider an employee database where the employees' details, such as salary, incentive, job role, etc., can slowly change over time.
With SCDs, whenever there is a change, a new row with the updated details is created instead of overwriting the old one. This new row includes an effective start date that indicates when it was updated, or when the new details became live.
Meanwhile, the old row will have an end date marking when it was last valid. Likewise, every change made to an employee is tracked through a series of rows, each with its respective start and end dates.
Find below the database where SCDs are in place:
ID (Surrogate Key) | Emp.ID (Business Key) | Employee Name | Salary | RowValidFrom | RowValidTo |
---|---|---|---|---|---|
1 | C501 | John S | 70000 | 20-10-2023 | 23-10-2024 |
56 | C002 | Alan B | 75000 | 01-01-2024 | 31-12-2025 |
234 | C501 | John S | 80000 | 24-10-2024 | 31-12-2025 |
235 | C988 | Lily S | 65000 | 21-10-2024 | 31-12-2025 |
One table row represents an entity's attributes for a defined timespan. The whole lifespan of one business entity can be inferred from a list of such rows.
In the next section, we will discuss how to configure a database that can support SCDs and how you can assign the appropriate keys in EDITable.
Last updated