SCD-2 and the Oracle Merge statement

Yesterday I found a very nice script that allows processing records in a target table with a SCD-2 mechanism. The script is very elegant. It contains several logical steps that are knitted together in one SQL statement.

Th mechanism of SCD2 is as follows. First, a comparison is made between a source table and a target table. One may encounter two situations: one whereby a situation/ record is provided that is completely new and a situation/ record that is in reality an update from a previous situation.

When a new situation is provided, a new record must be inserted into the target table. When an update is provided, a record in the target table must be updated with an end date that indicates that the record can be seen as being a description of history. Moreover in case of an update a record must be inserted.

Let us look at an example. We start with:

Id stands for the identification of a situation, that can be updated. Naam is a field that can be updated. sk is a surrogate key that acts as a primary key. Startdate is the moment of an insert; enddate is the point in time until when the description is valid.

Let us assume, we have new data that look like

, we may derive that the final version will look like:

We see tom and ine being unchanged. The Stella record is finished with an end date (21-02-2018). Paula and Tessa are inserted with a startdate and an end date that is set at 31-12-99.

The script reads as:

MERGE INTO mergedoel p
USING 
 ( SELECT DECODE(s.scd_row_id,1,0,m.sk) as sk,
                   ID,
                   NAAM,
                   m.scd_row_id
   FROM    (SELECT dp.sk,
                   sp.ID,
                   sp.NAAM,
                   CASE
                     WHEN dp.id IS NULL
                       THEN 1
                     WHEN (dp.naam != sp.naam) 
                       THEN 2
                       ELSE 0 
                   END AS scd_row_id
                   FROM    mergebron sp
            LEFT JOIN mergedoel dp  
            ON (sp.id = dp.id and  dp.enddate > sysdate)
           ) m
   JOIN  scd_row_type s
   ON (s.scd_row_id <= m.scd_row_id)
 ) mp
ON  (p.sk = mp.sk)
when matched then
   update set P.enddate = sysdate
when NOT matched then
   insert (P.ID,P.NAAM,P.SK, p.startdate , p.enddate)
   values (mp.id,mp.naam,S_MERGE.nextval,sysdate, to_date('2099-12-31 00:00:00','YYYY-MM-DD HH24:MI:SS'));
  commit;

and the tables that we need are:

CREATE TABLE SCD_ROW_TYPE
(
SCD_ROW_ID NUMBER NOT NULL,
SCD_ROW_DESCRIPTION VARCHAR2(20 CHAR)
);
Insert into scd_row_type(scd_row_id,scd_row_description) values(1,'For Insert');
Insert into scd_row_type(scd_row_id,scd_row_description) values(2,'For Update');



CREATE TABLE "MERGEBRON" 
   (	"ID" NUMBER NOT NULL ENABLE, 
	"NAAM" VARCHAR2(20 BYTE)
   );


CREATE TABLE "MERGEDOEL" 
   (	"ID" NUMBER NOT NULL ENABLE, 
	"NAAM" VARCHAR2(20 BYTE), 
	"SK" NUMBER NOT NULL ENABLE, 
	"STARTDATE" DATE, 
	"ENDDATE" DATE
   );

The logic of this procedure is that is a first step a comparison is made between the target table and the source table. If no match is found between the target table and source table (based on id), a record is inserted.
If a match is found but the descriptions differ, two actions will be taken: records with end date are set at the system date and a new record are inserted.

Breaking

Door tom

Gerelateerd bericht

Je miste

Flask and JSON

A webserver from the command line

Use the node.js server as restful app server

Reading a CSV file and translate into dataframe