The idea behind real time loading a data warehouse is to propagate data directly after an event has happened in an underlying source database. Let us think of an invoice database. In most circumstances, the invoice database is loaded with records that represents invoices that are sent. During the night, a set of records in collected and they are sent as a file to a data warehouse.
An alternative is to propagate a message from the source database directly after a new record is inserted. This message is received by the data warehouse. It can then be subsequently inserted into the data warehouse.
In Oracle, we have a trigger mechanism to implement this idea.
The trigger is fired, once the source table is changed. This trigger looks like:
create or replace trigger INVOER.T$EMPLOYEESUITVOER after insert or update or delete on INVOER.EMPLOYEES for each row declare V_FLAG VARCHAR(1); V_EMPLOYEE_ID NUMBER(6); begin if inserting then V_EMPLOYEE_ID := :new.EMPLOYEE_ID; V_FLAG := 'I'; end if; if updating then V_EMPLOYEE_ID := :new.EMPLOYEE_ID; V_FLAG := 'U'; end if; if deleting then V_EMPLOYEE_ID := :old.EMPLOYEE_ID; V_FLAG := 'D'; end if; insert into UITVOER.J$EMPLOYEES ( JRN_SUBSCRIBER, JRN_CONSUMED, JRN_FLAG, JRN_DATE, EMPLOYEE_ID ) select USER, '0', V_FLAG, sysdate, V_EMPLOYEE_ID from dual ; end;
The idea behind this trigger is that each insert, delete or update generates a new record in a remote table in the staging area of a data warehouse. Here, a table (UITVOER.J$EMPLOYEES) is continuously updated with each update representing a change in the source database.
In this case, the table in the staging area only contains the primary key of the record that is changed. With only a minor modification, one could include all fields from the source table. A similar result can be achieved with a view where the UITVOER.J$EMPLOYEES is joined with the original table.
See below:
DROP VIEW UITVOER.JV$DEMPLOYEES; CREATE VIEW UITVOER.JV$DEMPLOYEES AS select decode(TARG.ROWID, null, 'D', 'I') AS JRN_FLAG, JRN.JRN_SUBSCRIBER AS JRN_SUBSCRIBER, JRN.JRN_DATE AS JRN_DATE, JRN.EMPLOYEE_ID AS EMPLOYEE_ID ,TARG.FIRST_NAME AS FIRST_NAME, TARG.LAST_NAME AS LAST_NAME, TARG.EMAIL AS EMAIL, TARG.PHONE_NUMBER AS PHONE_NUMBER, TARG.HIRE_DATE AS HIRE_DATE, TARG.JOB_ID AS JOB_ID, TARG.SALARY AS SALARY, TARG.COMMISSION_PCT AS COMMISSION_PCT, TARG.MANAGER_ID AS MANAGER_ID, TARG.DEPARTMENT_ID AS DEPARTMENT_ID from ( select L.JRN_SUBSCRIBER AS JRN_SUBSCRIBER, L.EMPLOYEE_ID AS EMPLOYEE_ID, L.JRN_CONSUMED as JRN_CONSUMED, L.JRN_DATE AS JRN_DATE from UITVOER.J$EMPLOYEES L ) JRN left join INVOER.EMPLOYEES TARG ON JRN.EMPLOYEE_ID = TARG.EMPLOYEE_ID and JRN.JRN_CONSUMED = 0;
Finally, this can be used to update the data warehouse.