temporal support patch

Поиск
Список
Период
Сортировка
От Miroslav Šimulčík
Тема temporal support patch
Дата
Msg-id CAHRNM69mdOtkPtHAqp=J=rn-LAFLpxMeCnPFLo4FFsb2BBfRVQ@mail.gmail.com
обсуждение исходный текст
Ответы Re: temporal support patch  (Pavel Stehule <pavel.stehule@gmail.com>)
Re: temporal support patch  ("Albe Laurenz" <laurenz.albe@wien.gv.at>)
Re: temporal support patch  (Jaime Casanova <jaime@2ndquadrant.com>)
Re: temporal support patch  (Jeff Davis <pgsql@j-davis.com>)
Список pgsql-hackers
Hi all,

as a part of my master's thesis I have created temporal support patch for PostgreSQL. It enables the creation of special temporal tables with entries versioning. Modifying operations (UPDATE, DELETE, TRUNCATE) on these tables don't cause permanent changes to entries, but create new versions of them. Thus user can easily get to the past states of the table.

Basic information on temporal databases can be found on http://en.wikipedia.org/wiki/Temporal_database

In field of temporal databases, there are only proprietary solution available. During the analysis I found these:
    - IBM DB2 10 for z/OS
    - Oracle 11g Workspace Manager
    - Teradata Database 13.10

Primary goal of my work was the creation of opensource solution, that is easy to use and is backward compatible with existing applications, so that the change of the original tables to temporal ones, does not require changes to applications that work with them. This patch is built on standard SQL/Temporal with some minor modifications inspired by commercial temporal database systems. Currently it only deals with transaction time support.

Here is simple description on how it works:

1. user can create transaction time table using modified CREATE TABLE command:

    CREATE TABLE person(name varchar(50)) AS TRANSACTIONTIME;

    This command automatically creates all objects required for transaction time support:

                  List of relations
      Schema |         Name         |   Type   |  Owner
     --------+----------------------+----------+----------
      public | person               | table    | tester
      public | person__entry_id_seq | sequence | tester
      public | person_hist          | table    | postgres


                                                  Table "public.person"
        Column   |            Type             |                                  Modifiers
     ------------+-----------------------------+------------------------------------------------------------------------------
      name       | character varying(50)       |
      _entry_id  | bigint                      | not null default nextval('person__entry_id_seq'::regclass)
      _sys_start | timestamp without time zone | not null default clock_timestamp()
      _sys_end   | timestamp without time zone | not null default '294276-12-31 23:59:59.999999'::timestamp without time zone
     Indexes:
         "person__entry_id_idx" btree (_entry_id)
         "person__sys_start__sys_end_idx" btree (_sys_start, _sys_end)


              Table "public.person_hist"
        Column   |            Type             | Modifiers
     ------------+-----------------------------+-----------
      name       | character varying(50)       |
      _entry_id  | bigint                      | not null
      _sys_start | timestamp without time zone | not null
      _sys_end   | timestamp without time zone | not null
     Indexes:
         "person_hist__entry_id_idx" btree (_entry_id)
         "person_hist__sys_start__sys_end_idx" btree (_sys_start, _sys_end)

    


    Table person stores current versions of entries. 3 additional columns are added:
        _entry_id - id of entry. It groups together different versions of entry.
        _sys_start - beginning of the version validity period (version creation timestamp).
        _sys_end - end of the version validity period.

    Table person_hist stores historical versions of entries. It has the same structure and indexes as the person table, but without any constraints and default values.

2. another way of creating transaction time table is adding transaction time support to existing standard table using ALTER command.

    CREATE TABLE person(name varchar(50));
    ALTER TABLE person ADD TRANSACTIONTIME;

3. INSERT entry

    INSERT INTO person VALUES('Jack');
    
    SELECT *, _entry_id, _sys_start, _sys_end FROM person;
     
     name | _entry_id |         _sys_start         |           _sys_end
    ------+-----------+----------------------------+------------------------------
     Jack |         1 | 2012-05-16 22:11:39.856916 | 294276-12-31 23:59:59.999999

4. UPDATE entry

    UPDATE person SET name = 'Tom';

    SELECT *, _entry_id, _sys_start, _sys_end FROM person;
   
     name | _entry_id |         _sys_start         |           _sys_end
    ------+-----------+----------------------------+------------------------------
     Tom  |         1 | 2012-05-16 22:11:44.736195 | 294276-12-31 23:59:59.999999
    
    SELECT * FROM person_hist;

     name | _entry_id |         _sys_start         |          _sys_end
    ------+-----------+----------------------------+----------------------------
     Jack |         1 | 2012-05-16 22:11:39.856916 | 2012-05-16 22:11:44.736194

5. DELETE entry

    DELETE FROM person;

    SELECT *, _entry_id, _sys_start, _sys_end FROM person;

     name | _entry_id | _sys_start | _sys_end
    ------+-----------+------------+----------

    SELECT * FROM person_hist;

     name | _entry_id |         _sys_start         |          _sys_end
    ------+-----------+----------------------------+----------------------------
     Jack |         1 | 2012-05-16 22:11:39.856916 | 2012-05-16 22:11:44.736194
     Tom  |         1 | 2012-05-16 22:11:44.736195 | 2012-05-16 22:14:33.875869

6. selecting entries

    INSERT INTO person VALUES('Mike');
    INSERT INTO person VALUES('Mike');

    --standard SELECT - operates only with current versions of entries
    SELECT * FROM person;

     name
    ------
     Mike
     Mike

    --special temporal SELECT which operates with all versions
    NONSEQUENCED TRANSACTIONTIME SELECT *, _entry_id, _sys_start, _sys_end FROM person;

     name | _entry_id |         _sys_start         |           _sys_end
    ------+-----------+----------------------------+------------------------------
     Mike |         3 | 2012-05-16 22:20:55.055671 | 294276-12-31 23:59:59.999999
     Mike |         2 | 2012-05-16 22:20:51.619475 | 294276-12-31 23:59:59.999999
     Jack |         1 | 2012-05-16 22:11:39.856916 | 2012-05-16 22:11:44.736194
     Tom  |         1 | 2012-05-16 22:11:44.736195 | 2012-05-16 22:14:33.875869

    --special temporal SELECT which operates with versions valid in specified time
    TRANSACTIONTIME AS OF '2012-05-16 22:11:39.856916' SELECT *, _entry_id, _sys_start, _sys_end FROM person;
    
     name | _entry_id |         _sys_start         |          _sys_end
    ------+-----------+----------------------------+----------------------------
     Jack |         1 | 2012-05-16 22:11:39.856916 | 2012-05-16 22:11:44.736194

    --it is also possible to set timestamp globally for session. All subsequent SELECTs without any temporal modifier will operate with versions valid in this time,
    SET history_timestamp TO '2012-05-16 22:11:39.856916';

    SELECT * FROM person;

     name
    ------
     Jack

    --to select only current versions when history_tiumestamp is set, CURRENT TRANSACTIONTIME have to be used with SELECT
    CURRENT TRANSACTIONTIME SELECT * FROM person;

     name
    ------
     Mike
     Mike
    


This is only a illustration of main functionality. Later I can create a document about the design and implementation details, but first I need to know if such temporal features as described here, could be added to future versions of PostgreSQL, after meeting all the requirements of a new patch.

Regards

Miroslav Simulcik

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: Pre-alloc ListCell's optimization
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: Draft release notes complete