Yesterday I found a very nice script that allows processing records in a target table with a SCD-2 mechanism. The script is very elegant. It contains several logical steps that are knitted together in one SQL statement.
Th mechanism of SCD2 is as follows. First, a comparison is made between a source table and a target table. One may encounter two situations: one whereby a situation/ record is provided that is completely new and a situation/ record that is in reality an update from a previous situation.
When a new situation is provided, a new record must be inserted into the target table. When an update is provided, a record in the target table must be updated with an end date that indicates that the record can be seen as being a description of history. Moreover in case of an update a record must be inserted.
Let us look at an example. We start with:
Id stands for the identification of a situation, that can be updated. Naam is a field that can be updated. sk is a surrogate key that acts as a primary key. Startdate is the moment of an insert; enddate is the point in time until when the description is valid.
Let us assume, we have new data that look like
, we may derive that the final version will look like:
We see tom and ine being unchanged. The Stella record is finished with an end date (21-02-2018). Paula and Tessa are inserted with a startdate and an end date that is set at 31-12-99.
The script reads as:
MERGE INTO mergedoel p USING ( SELECT DECODE(s.scd_row_id,1,0,m.sk) as sk, ID, NAAM, m.scd_row_id FROM (SELECT dp.sk, sp.ID, sp.NAAM, CASE WHEN dp.id IS NULL THEN 1 WHEN (dp.naam != sp.naam) THEN 2 ELSE 0 END AS scd_row_id FROM mergebron sp LEFT JOIN mergedoel dp ON (sp.id = dp.id and dp.enddate > sysdate) ) m JOIN scd_row_type s ON (s.scd_row_id <= m.scd_row_id) ) mp ON (p.sk = mp.sk) when matched then update set P.enddate = sysdate when NOT matched then insert (P.ID,P.NAAM,P.SK, p.startdate , p.enddate) values (mp.id,mp.naam,S_MERGE.nextval,sysdate, to_date('2099-12-31 00:00:00','YYYY-MM-DD HH24:MI:SS')); commit;
and the tables that we need are:
CREATE TABLE SCD_ROW_TYPE ( SCD_ROW_ID NUMBER NOT NULL, SCD_ROW_DESCRIPTION VARCHAR2(20 CHAR) ); Insert into scd_row_type(scd_row_id,scd_row_description) values(1,'For Insert'); Insert into scd_row_type(scd_row_id,scd_row_description) values(2,'For Update'); CREATE TABLE "MERGEBRON" ( "ID" NUMBER NOT NULL ENABLE, "NAAM" VARCHAR2(20 BYTE) ); CREATE TABLE "MERGEDOEL" ( "ID" NUMBER NOT NULL ENABLE, "NAAM" VARCHAR2(20 BYTE), "SK" NUMBER NOT NULL ENABLE, "STARTDATE" DATE, "ENDDATE" DATE );
The logic of this procedure is that is a first step a comparison is made between the target table and the source table. If no match is found between the target table and source table (based on id), a record is inserted.
If a match is found but the descriptions differ, two actions will be taken: records with end date are set at the system date and a new record are inserted.