Обсуждение: WIP: System Versioned Temporal Table

Поиск
Список
Период
Сортировка

WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:

Hi all ,

Temporal table is one of the main new features added in sql standard 2011. From that I will like to implement system versioned temporal table which allows to keep past and present data so old data can be queried. Am propose to implement it like below

CREATE

In create table only one table is create and both historical and current data will be store in it. In order to make history and current data co-exist row end time column will be added implicitly to primary key. Regarding performance one can partition the table by row end time column order to make history data didn't slowed performance.

INSERT

In insert row start time column and row end time column behave like a kind of generated stored column except they store current transaction time and highest value supported by the data type which is +infinity respectively.

DELETE and UPDATE

The old data is inserted with row end time column seated to current transaction time

SELECT

If the query didn’t contain a filter condition that include system time column, a filter condition will be added in early optimization that filter history data.

Attached is WIP patch that implemented just the above and done on top of commit b8e19b932a99a7eb5a. Temporal clause didn’t implemented yet so one can use regular filter condition for the time being

NOTE: I implement sql standard syntax except it is PERIOD FOR SYSTEM TIME rather than PERIOD FOR SYSTEM_TIME in CREATE TABLE statement and system time is not selected unless explicitly asked

Any enlightenment?

regards

Surafel

Вложения

Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 23/10/2019 17:56, Surafel Temesgen wrote:
>
> Hi all ,
>
> Temporal table is one of the main new features added in sql standard
> 2011. From that I will like to implement system versioned temporal
> table which allows to keep past and present data so old data can be
> queried.
>

Excellent!  I've been wanting this feature for a long time now.  We're
the last major database to not have it.


I tried my hand at doing it in core, but ended up having better success
at an extension: https://github.com/xocolatl/periods/


> Am propose to implement it like below
>
> CREATE
>
> In create table only one table is create and both historical and
> current data will be store in it. In order to make history and current
> data co-exist row end time column will be added implicitly to primary
> key. Regarding performance one can partition the table by row end time
> column order to make history data didn't slowed performance.
>

If we're going to be implicitly adding stuff to the PK, we also need to
add that stuff to the other unique constraints, no?  And I think it
would be better to add both the start and the end column to these keys. 
Most of the temporal queries will be accessing both.


> INSERT
>
> In insert row start time column and row end time column behave like a
> kind of generated stored column except they store current transaction
> time and highest value supported by the data type which is +infinity
> respectively.
>

You're forcing these columns to be timestamp without time zone.  If
you're going to force a datatype here, it should absolutely be timestamp
with time zone.  However, I would like to see it handle both kinds of
timestamps as well as a simple date.


> DELETE and UPDATE
>
> The old data is inserted with row end time column seated to current
> transaction time
>

I don't see any error handling for transaction anomalies.  In READ
COMMITTED, you can easily end up with a case where the end time comes
before the start time.  I don't even see anything constraining start
time to be strictly inferior to the end time.  Such a constraint will be
necessary for application-time periods (which your patch doesn't address
at all but that's okay).


> SELECT
>
> If the query didn’t contain a filter condition that include system
> time column, a filter condition will be added in early optimization
> that filter history data.
>
> Attached is WIP patch that implemented just the above and done on top
> of commit b8e19b932a99a7eb5a. Temporal clause didn’t implemented yet
> so one can use regular filter condition for the time being
>
> NOTE: I implement sql standard syntax except it is PERIOD FOR SYSTEM
> TIME rather than PERIOD FOR SYSTEM_TIME in CREATE TABLE statement and
> system time is not selected unless explicitly asked
>

Why aren't you following the standard syntax here?


> Any enlightenment?
>

There are quite a lot of typos and other things that aren't written "the
Postgres way". But before I comment on any of that, I'd like to see the
features be implemented correctly according to the SQL standard.




Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:

hi Vik,
On Wed, Oct 23, 2019 at 9:02 PM Vik Fearing <vik.fearing@2ndquadrant.com> wrote:
 

If we're going to be implicitly adding stuff to the PK, we also need to
add that stuff to the other unique constraints, no?  And I think it
would be better to add both the start and the end column to these keys. 
Most of the temporal queries will be accessing both.

 
yes it have to be added to other constraint too but adding both system time 
to PK will violate constraint because it allow multiple data in current data 
 

Why aren't you following the standard syntax here?



because we do have TIME and SYSTEM_P as a key word and am not sure of whether
its a right thing to add other keyword that contain those two word concatenated 
 
> Any enlightenment?
>

There are quite a lot of typos and other things that aren't written "the
Postgres way". But before I comment on any of that, I'd like to see the
features be implemented correctly according to the SQL standard.

it is almost in sql standard syntax except the above small difference. i can correct it 
and post more complete patch soon. 

regards 
Surafel  

Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 24/10/2019 16:54, Surafel Temesgen wrote:
>
> hi Vik,
> On Wed, Oct 23, 2019 at 9:02 PM Vik Fearing
> <vik.fearing@2ndquadrant.com <mailto:vik.fearing@2ndquadrant.com>> wrote:
>  
>
>
>     If we're going to be implicitly adding stuff to the PK, we also
>     need to
>     add that stuff to the other unique constraints, no?  And I think it
>     would be better to add both the start and the end column to these
>     keys. 
>     Most of the temporal queries will be accessing both.
>
>  
> yes it have to be added to other constraint too but adding both system
> time 
> to PK will violate constraint because it allow multiple data in
> current data


I don't understand what you mean by this.


>  
>
>
>     Why aren't you following the standard syntax here?
>
>
>
> because we do have TIME and SYSTEM_P as a key word and am not sure of
> whether
> its a right thing to add other keyword that contain those two word
> concatenated


Yes, we have to do that.


>  
>  
>
>     > Any enlightenment?
>     >
>
>     There are quite a lot of typos and other things that aren't
>     written "the
>     Postgres way". But before I comment on any of that, I'd like to
>     see the
>     features be implemented correctly according to the SQL standard.
>
>
> it is almost in sql standard syntax except the above small difference.
> i can correct it 
> and post more complete patch soon.


I don't mean just the SQL syntax, I also mean the behavior.




Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:


On Thu, Oct 24, 2019 at 6:49 PM Vik Fearing <vik.fearing@2ndquadrant.com> wrote:
On 24/10/2019 16:54, Surafel Temesgen wrote:
>
> hi Vik,
> On Wed, Oct 23, 2019 at 9:02 PM Vik Fearing
> <vik.fearing@2ndquadrant.com <mailto:vik.fearing@2ndquadrant.com>> wrote:
>  
>
>
>     If we're going to be implicitly adding stuff to the PK, we also
>     need to
>     add that stuff to the other unique constraints, no?  And I think it
>     would be better to add both the start and the end column to these
>     keys. 
>     Most of the temporal queries will be accessing both.
>
>  
> yes it have to be added to other constraint too but adding both system
> time 
> to PK will violate constraint because it allow multiple data in
> current data


I don't understand what you mean by this.



The primary purpose of adding row end time to primary key is to allow duplicate value to be inserted into a table with keeping constraint in current data but it can be duplicated in history data. Adding row start time column to primary key will eliminate this uniqueness for current data which is not correct  

regards
Surafel

Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 25/10/2019 11:56, Surafel Temesgen wrote:
>
>
> On Thu, Oct 24, 2019 at 6:49 PM Vik Fearing
> <vik.fearing@2ndquadrant.com <mailto:vik.fearing@2ndquadrant.com>> wrote:
>
>     On 24/10/2019 16:54, Surafel Temesgen wrote:
>     >
>     > hi Vik,
>     > On Wed, Oct 23, 2019 at 9:02 PM Vik Fearing
>     > <vik.fearing@2ndquadrant.com
>     <mailto:vik.fearing@2ndquadrant.com>
>     <mailto:vik.fearing@2ndquadrant.com
>     <mailto:vik.fearing@2ndquadrant.com>>> wrote:
>     >  
>     >
>     >
>     >     If we're going to be implicitly adding stuff to the PK, we also
>     >     need to
>     >     add that stuff to the other unique constraints, no?  And I
>     think it
>     >     would be better to add both the start and the end column to
>     these
>     >     keys. 
>     >     Most of the temporal queries will be accessing both.
>     >
>     >  
>     > yes it have to be added to other constraint too but adding both
>     system
>     > time 
>     > to PK will violate constraint because it allow multiple data in
>     > current data
>
>
>     I don't understand what you mean by this.
>
>
>
> The primary purpose of adding row end time to primary key is to allow
> duplicate value to be inserted into a table with keeping constraint in
> current data but it can be duplicated in history data. Adding row
> start time column to primary key will eliminate this uniqueness for
> current data which is not correct  


How?  The primary/unique keys must always be unique at every point in time.




Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:


On Fri, Oct 25, 2019 at 10:45 PM Vik Fearing <vik.fearing@2ndquadrant.com> wrote:
>
>     I don't understand what you mean by this.
>
>
>
> The primary purpose of adding row end time to primary key is to allow
> duplicate value to be inserted into a table with keeping constraint in
> current data but it can be duplicated in history data. Adding row
> start time column to primary key will eliminate this uniqueness for
> current data which is not correct  


How?  The primary/unique keys must always be unique at every point in time.

From user prospect it is acceptable to delete and reinsert a record with the same key value multiple time which means there will be multiple record with the same key value in a history data but there is only one values in current data as a table without system versioning do .I add row end time column to primary key to allow user supplied primary key values to be duplicated in history data which is acceptable 

regards
Surafel

Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 28/10/2019 13:48, Surafel Temesgen wrote:
>
>
> On Fri, Oct 25, 2019 at 10:45 PM Vik Fearing
> <vik.fearing@2ndquadrant.com <mailto:vik.fearing@2ndquadrant.com>> wrote:
>
>     >
>     >     I don't understand what you mean by this.
>     >
>     >
>     >
>     > The primary purpose of adding row end time to primary key is to
>     allow
>     > duplicate value to be inserted into a table with keeping
>     constraint in
>     > current data but it can be duplicated in history data. Adding row
>     > start time column to primary key will eliminate this uniqueness for
>     > current data which is not correct  
>
>
>     How?  The primary/unique keys must always be unique at every point
>     in time.
>
>
> From user prospect it is acceptable to delete and reinsert a record
> with the same key value multiple time which means there will be
> multiple record with the same key value in a history data but there is
> only one values in current data as a table without system versioning
> do .I add row end time column to primary key to allow user supplied
> primary key values to be duplicated in history data which is acceptable
>

Yes, I understand that.  I'm saying you should also add the row start
time column.




Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:


Hi,
Attached is a complete patch and also contain a fix for your comments

regards
Surafel 
Вложения

Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 01/01/2020 11:50, Surafel Temesgen wrote:
>
>
> Hi,
> Attached is a complete patch and also contain a fix for your comments
>

This does not compile against current head (0ce38730ac).


gram.y: error: shift/reduce conflicts: 6 found, 0 expected

-- 

Vik Fearing




Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:


On Thu, Jan 2, 2020 at 12:12 AM Vik Fearing <vik.fearing@2ndquadrant.com> wrote:
This does not compile against current head (0ce38730ac).


gram.y: error: shift/reduce conflicts: 6 found, 0 expected


Rebased and conflict resolved i hope it build clean this time

regards
Surafel


Вложения

Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 03/01/2020 11:57, Surafel Temesgen wrote:
>
>
> On Thu, Jan 2, 2020 at 12:12 AM Vik Fearing
> <vik.fearing@2ndquadrant.com <mailto:vik.fearing@2ndquadrant.com>> wrote:
>
>     This does not compile against current head (0ce38730ac).
>
>
>     gram.y: error: shift/reduce conflicts: 6 found, 0 expected
>
>
> Rebased and conflict resolved i hope it build clean this time
>

It does but you haven't included your tests file so `make check` fails.


It seems clear to me that you haven't tested it at all anyway.  The
temporal conditions do not return the correct results, and the syntax is
wrong, too.  Also, none of my previous comments have been addressed
except for "system versioning" instead of "system_versioning".  Why?

-- 

Vik Fearing




Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:


On Fri, Jan 3, 2020 at 4:22 PM Vik Fearing <vik.fearing@2ndquadrant.com> wrote:
>
> Rebased and conflict resolved i hope it build clean this time
>

It does but you haven't included your tests file so `make check` fails.



what tests file? i add system_versioned_table.sql and system_versioned_table.out
test files and it tested and pass on appveyor[1] only failed on travis 
because of warning. i will add more test 
 
It seems clear to me that you haven't tested it at all anyway.  The
temporal conditions do not return the correct results, and the syntax is
wrong, too.  Also, none of my previous comments have been addressed
except for "system versioning" instead of "system_versioning".  Why?


I also correct typo and add row end column time to unique
key that make it unique for current dataAs you mentioned
other comment is concerning about application-time periods
which the patch not addressing . i refer sql 2011 standard for
syntax can you tell me which syntax you find it wrong?

regards 
Surafel 

Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:


On Mon, Oct 28, 2019 at 6:36 PM Vik Fearing <vik.fearing@2ndquadrant.com> wrote:
On 28/10/2019 13:48, Surafel Temesgen wrote:
>
>
> On Fri, Oct 25, 2019 at 10:45 PM Vik Fearing
> <vik.fearing@2ndquadrant.com <mailto:vik.fearing@2ndquadrant.com>> wrote:
>
>     >
>     >     I don't understand what you mean by this.
>     >
>     >
>     >
>     > The primary purpose of adding row end time to primary key is to
>     allow
>     > duplicate value to be inserted into a table with keeping
>     constraint in
>     > current data but it can be duplicated in history data. Adding row
>     > start time column to primary key will eliminate this uniqueness for
>     > current data which is not correct  
>
>
>     How?  The primary/unique keys must always be unique at every point
>     in time.
>
>
> From user prospect it is acceptable to delete and reinsert a record
> with the same key value multiple time which means there will be
> multiple record with the same key value in a history data but there is
> only one values in current data as a table without system versioning
> do .I add row end time column to primary key to allow user supplied
> primary key values to be duplicated in history data which is acceptable
>

Yes, I understand that.  I'm saying you should also add the row start
time column.


that allow the same primary key value row to be insert as long
as insertion time is different
 
regards 
Surafel 

Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 05/01/2020 11:16, Surafel Temesgen wrote:
>
>
> On Fri, Jan 3, 2020 at 4:22 PM Vik Fearing
> <vik.fearing@2ndquadrant.com <mailto:vik.fearing@2ndquadrant.com>> wrote:
>
>     >
>     > Rebased and conflict resolved i hope it build clean this time
>     >
>
>     It does but you haven't included your tests file so `make check`
>     fails.
>
>
>
> what tests file?


Exactly.


> i add system_versioned_table.sql and system_versioned_table.out
> test files


Those are not included in the patch.


<checks again>


Okay, that was user error on my side.  I apologize.


>  
>
>     It seems clear to me that you haven't tested it at all anyway.  The
>     temporal conditions do not return the correct results, and the
>     syntax is
>     wrong, too.  Also, none of my previous comments have been addressed
>     except for "system versioning" instead of "system_versioning".  Why?
>
>
> I also correct typo and add row end column time to unique
> key that make it unique for current data. As you mentioned
> other comment is concerning about application-time periods
> which the patch not addressing .


- For performance, you must put the start column in the indexes also.

- You only handle timestamp when you should also handle timestamptz and
date.

- You don't throw 2201H for anomalies


> i refer sql 2011 standard for
> syntax can you tell me which syntax you find it wrong?


Okay, now that I see your tests, I understand why everything is broken. 
You only test FROM-TO and with a really wide interval.  There are no
tests for AS OF and no tests for BETWEEN-AND.


As for the syntax, you have:


select a from for stest0 system_time from '2000-01-01 00:00:00.00000' to
'infinity' ORDER BY a;


when you should have:


select a from stest0 for system_time from '2000-01-01 00:00:00.00000' to
'infinity' ORDER BY a;


That is, the FOR should be on the other side of the table name.


In addition, there are many rules in the standard that are not respected
here.  For example, this query works and should not:


CREATE TABLE t (system_time integer) WITH SYSTEM VERSIONING;


This syntax is not supported:


ALTER TABLE t
    ADD PERIOD FOR SYSTEM_TIME (s, e)
        ADD COLUMN s timestamp
        ADD COLUMN e timestamp;


psql's \d does not show that the table is system versioned, and doesn't
show the columns of the system_time period.


I can drop columns used in the period.


Please don't hesitate to take inspiration from my extension that does
this.  The extension is under the PostgreSQL license for that reason. 
Take from it whatever you need.

https://github.com/xocolatl/periods/

-- 

Vik Fearing




Re: WIP: System Versioned Temporal Table

От
legrand legrand
Дата:
Vik Fearing-4 wrote
> On 05/01/2020 11:16, Surafel Temesgen wrote:
>>
>>
>> On Fri, Jan 3, 2020 at 4:22 PM Vik Fearing
>> <

> vik.fearing@

>  <mailto:

> vik.fearing@

> >> wrote:
>>
>
> [...]
>
> You only test FROM-TO and with a really wide interval.  There are no
> tests for AS OF and no tests for BETWEEN-AND.
>
>
> As for the syntax, you have:
>
>
> select a from for stest0 system_time from '2000-01-01 00:00:00.00000' to
> 'infinity' ORDER BY a;
>
>
> when you should have:
>
>
> select a from stest0 for system_time from '2000-01-01 00:00:00.00000' to
> 'infinity' ORDER BY a;
>
>
> That is, the FOR should be on the other side of the table name.
>
> [...]
>
> Vik Fearing

Hello,

I though that standard syntax was "AS OF SYSTEM TIME"
as discussed here

https://www.postgresql.org/message-id/flat/A254CDC3-D308-4822-8928-8CC584E0CC71%40elusive.cx#06c5dbffd5cfb9a20cdeec7a54dc657f
, also explaining how to parse such a syntax .

Regards
PAscal



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 05/01/2020 16:01, legrand legrand wrote:
>
>> As for the syntax, you have:
>>
>>
>> select a from for stest0 system_time from '2000-01-01 00:00:00.00000' to
>> 'infinity' ORDER BY a;
>>
>>
>> when you should have:
>>
>>
>> select a from stest0 for system_time from '2000-01-01 00:00:00.00000' to
>> 'infinity' ORDER BY a;
>>
>>
>> That is, the FOR should be on the other side of the table name.
>>
>> [...] 
>>
>> Vik Fearing
> Hello,
>
> I though that standard syntax was "AS OF SYSTEM TIME"
> as discussed here
>
https://www.postgresql.org/message-id/flat/A254CDC3-D308-4822-8928-8CC584E0CC71%40elusive.cx#06c5dbffd5cfb9a20cdeec7a54dc657f
> , also explaining how to parse such a syntax .


No, that is incorrect.  The standard syntax is:


    FROM tablename FOR SYSTEM_TIME AS OF '...'

    FROM tablename FOR SYSTEM_TIME BETWEEN '...' AND '...'

    FROM tablename FOR SYSTEM_TIME FROM '...' TO '...'

-- 

Vik Fearing




Re: WIP: System Versioned Temporal Table

От
legrand legrand
Дата:
Vik Fearing-4 wrote
> On 05/01/2020 16:01, legrand legrand wrote:
>
>
> No, that is incorrect.  The standard syntax is:
>
>
>     FROM tablename FOR SYSTEM_TIME AS OF '...'
>
>     FROM tablename FOR SYSTEM_TIME BETWEEN '...' AND '...'
>
>     FROM tablename FOR SYSTEM_TIME FROM '...' TO '...'
>
> --
>
> Vik Fearing

oups, I re-read links and docs and I'm wrong.
Sorry for the noise

Regards
PAscal



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: WIP: System Versioned Temporal Table

От
Kyotaro Horiguchi
Дата:
Hello.

Isn't this patch somehow broken?

At Mon, 28 Oct 2019 16:36:09 +0100, Vik Fearing <vik.fearing@2ndquadrant.com> wrote in
> On 28/10/2019 13:48, Surafel Temesgen wrote:
> >
> >
> > On Fri, Oct 25, 2019 at 10:45 PM Vik Fearing
> > <vik.fearing@2ndquadrant.com <mailto:vik.fearing@2ndquadrant.com>> wrote:
> >
> >     >
> >     >     I don't understand what you mean by this.
> >     >
> >     >
> >     >
> >     > The primary purpose of adding row end time to primary key is to
> >     allow
> >     > duplicate value to be inserted into a table with keeping
> >     constraint in
> >     > current data but it can be duplicated in history data. Adding row
> >     > start time column to primary key will eliminate this uniqueness for
> >     > current data which is not correct  
> >
> >
> >     How?  The primary/unique keys must always be unique at every point
> >     in time.
> >
> >
> > From user prospect it is acceptable to delete and reinsert a record
> > with the same key value multiple time which means there will be
> > multiple record with the same key value in a history data but there is
> > only one values in current data as a table without system versioning
> > do .I add row end time column to primary key to allow user supplied
> > primary key values to be duplicated in history data which is acceptable
> >
>
> Yes, I understand that.  I'm saying you should also add the row start
> time column.

I think that the start and end timestamps represent the period where
that version of the row was active. So UPDATE should modify the start
timestamp of the new version to the same value with the end timestamp
of the old version to the updated time. Thus, I don't think adding
start timestamp to PK doesn't work as expected. That hinders us from
rejecting rows with the same user-defined unique key because start
timestams is different each time of inserts. I think what Surafel is
doing in the current patch is correct. Only end_timestamp = +infinity
rejects another non-historical (= live) version with the same
user-defined unique key.

I'm not sure why the patch starts from "0002, but anyway it applied
on e369f37086. Then I ran make distclean, ./configure then make all,
make install, initdb and started server after all of them.

First, I tried to create a temporal table.

When I used timestamptz as the type of versioning columns, ALTER,
CREATE commands ended with server crash.

 "CREATE TABLE t (a int, s timestamptz GENERATED ALWAYS AS ROW START, e timestamptz GENERATED ALWAYS AS ROW END);"
 (CREATE TABLE t (a int);)
 "ALTER TABLE t ADD COLUMN s timestamptz GENERATED ALWAYS AS ROW START"
 "ALTER TABLE t ADD COLUMN s timestamptz GENERATED ALWAYS AS ROW START, ADD COLUMN e timestamptz GENERATED ALWAYS AS
ROWEND" 

If I added the start/end timestamp columns to an existing table, it
returns uncertain error.

 ALTER TABLE t ADD COLUMN s timestamp(6) GENERATED ALWAYS AS ROW START;
 ERROR:  column "s" contains null values
 ALTER TABLE t ADD COLUMN s timestamp(6) GENERATED ALWAYS AS ROW START, ADD COLUMN e timestamp(6) GENERATED ALWAYS AS
ROWEND; 
 ERROR:  column "s" contains null values


When I defined only start column, SELECT on the table crashed.

 "CREATE TABLE t (s timestamp(6) GENERATED ALWAYS AS ROW START);"
 "SELECT * from t;"
 (crashed)

The following command ended with ERROR which I cannot understand the
cause, but I expected the command to be accepted.

  ALTER TABLE t ADD COLUMN start timestamp(6) GENERATED ALWAYS AS ROW START, ADD COLUMN end timestamp(6) GENERATED
ALWAYSAS ROW END; 
  ERROR:  syntax error at or near "end"

I didin't examined further but the syntax part doesn't seem designed
well, and the body part seems vulnerable to unexpected input.


I ran a few queries:

SELECT * shows the timestamp columns, don't we need to hide the period
timestamp columns from this query?

I think UPDATE needs to update the start timestamp, but it doesn't. As
the result the timestamps doesn't represent the correct lifetime of
the row version and we wouldn't be able to pick up correct versions of
a row that exprerienced updates. (I didn't confirmed that because I
couldn't do "FOR SYSTEM_TIME AS OF" query due to syntax error..)

(Sorry in advance for possible pointless comments due to my lack of
access to the SQL11 standard.)  If we have the period-timestamp
columns before the last two columns, INSERT in a common way on the
table fails, which doesn't seem to me to be expected behavior:

 CREATE TABLE t (s timestamp(6) GENERATED ALWAYS AS ROW START, e timestamp(6) GENERATED ALWAYS AS ROW END, a int) WITH
SYSTEMVERSIONING; 
 INSERT INTO t (SELECT a FROM generate_series(0, 99) a);
 ERROR:  column "s" is of type timestamp without time zone but expression is of type integer

Some queries using SYSTEM_TIME which I think should be accepted ends
with error. Is the grammar part missing something?

 SELECT * FROM t FOR SYSTEM_TIME AS OF '2020-01-07 09:57:55';
 ERROR:  syntax error at or near "system_time"
 LINE 1: SELECT * FROM t FOR SYSTEM_TIME AS OF '2020-01-07 09:57:55';

 SELECT * FROM t FOR SYSTEM_TIME BETWEEN '2020-01-07 0:00:00' AND '2020-01-08 0:00:00';
 ERROR:  syntax error at or near "system_time"
 LINE 1: SELECT * FROM t FOR SYSTEM_TIME BETWEEN '2020-01-07 0:00:00'...


Other random comments (sorry for it not being comprehensive):

The patch at worst loops over all columns at every parse time. It is
quite ineffecient if there are many columns. We can store the column
indexes in relcache.

If I'm not missing anything, alter table doesn't properly modify
existing data in the target table. AddSystemVersioning should fill in
start/end_timestamp with proper values and DropSystemVersioning shuold
remove rows no longer useful.


+makeAndExpr(Node *lexpr, Node *rexpr, int location)

 I believe that planner flattenes nested AND/ORs in
 eval_const_expressions(). Shouldn't we leave the work to the planner?


+makeConstraint(ConstrType type)

We didn't use such a function to make that kind of nodes. Maybe we
should use makeNode directly, or we need to change similar coding into
that using the function. Addition to that, setting .location to 1 is
wrong. "Unknown" location is -1.

Separately from that, temporal clauses is not restriction of a
table. So it seems wrong to me to use constraint mechamism for this
purpose.

+makeSystemColumnDef(char *name)

"system column (or attribute)" is a column specially treated outside
of tuple descriptor. The temporal-period columns are not system
columns in that sense.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center



Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 05/01/2020 13:50, Vik Fearing wrote:
> Okay, now that I see your tests, I understand why everything is broken. 
> You only test FROM-TO and with a really wide interval.  There are no
> tests for AS OF and no tests for BETWEEN-AND.


I have started working on some better test cases for you.  The attached
.sql and .out tests should pass, and they are some of the tests that
I'll be putting your next version through.  There are many more tests
that need to be added.


Once all the desired functionality is there, I'll start reviewing the
code itself.


Keep up the good work, and let me know if I can do anything to help you.

-- 

Vik Fearing


Вложения

Re: WIP: System Versioned Temporal Table

От
David Steele
Дата:
Hi Surafel,

On 1/3/20 5:57 AM, Surafel Temesgen wrote:
> Rebased and conflict resolved i hope it build clean this time

This patch no longer applies according to cfbot and there are a number 
of review comments that don't seem to have been addressed yet.

The patch is not exactly new for this CF but since the first version was 
posted 2020-01-01 and there have been no updates (except a rebase) since 
then it comes pretty close.

Were you planning to work on this for PG13?  If so we'll need to see a 
rebased and updated patch pretty soon.  My recommendation is that we 
move this patch to PG14.

Regards,
-- 
-David
david@pgmasters.net



Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 03/03/2020 19:33, David Steele wrote:
> Hi Surafel,
> 
> On 1/3/20 5:57 AM, Surafel Temesgen wrote:
>> Rebased and conflict resolved i hope it build clean this time
> 
> This patch no longer applies according to cfbot and there are a number
> of review comments that don't seem to have been addressed yet.
> 
> The patch is not exactly new for this CF but since the first version was
> posted 2020-01-01 and there have been no updates (except a rebase) since
> then it comes pretty close.
> 
> Were you planning to work on this for PG13?  If so we'll need to see a
> rebased and updated patch pretty soon.  My recommendation is that we
> move this patch to PG14.

I strongly second that motion.
-- 
Vik Fearing



Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:
Hi,
Thank you very much looking at it
On Tue, Jan 7, 2020 at 1:33 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
Hello.

Isn't this patch somehow broken?


First, I tried to create a temporal table.

When I used timestamptz as the type of versioning columns, ALTER,
CREATE commands ended with server crash.

 "CREATE TABLE t (a int, s timestamptz GENERATED ALWAYS AS ROW START, e timestamptz GENERATED ALWAYS AS ROW END);"
 (CREATE TABLE t (a int);)
 "ALTER TABLE t ADD COLUMN s timestamptz GENERATED ALWAYS AS ROW START"
 "ALTER TABLE t ADD COLUMN s timestamptz GENERATED ALWAYS AS ROW START, ADD COLUMN e timestamptz GENERATED ALWAYS AS ROW END"

If I added the start/end timestamp columns to an existing table, it
returns uncertain error.

 ALTER TABLE t ADD COLUMN s timestamp(6) GENERATED ALWAYS AS ROW START;
 ERROR:  column "s" contains null values
 ALTER TABLE t ADD COLUMN s timestamp(6) GENERATED ALWAYS AS ROW START, ADD COLUMN e timestamp(6) GENERATED ALWAYS AS ROW END;
 ERROR:  column "s" contains null values


When I defined only start column, SELECT on the table crashed.

 "CREATE TABLE t (s timestamp(6) GENERATED ALWAYS AS ROW START);"
 "SELECT * from t;"
 (crashed)

The following command ended with ERROR which I cannot understand the
cause, but I expected the command to be accepted.


Fixed

  ALTER TABLE t ADD COLUMN start timestamp(6) GENERATED ALWAYS AS ROW START, ADD COLUMN end timestamp(6) GENERATED ALWAYS AS ROW END;
  ERROR:  syntax error at or near "end"


end is a keyword
 
I didin't examined further but the syntax part doesn't seem designed
well, and the body part seems vulnerable to unexpected input.


I ran a few queries:

SELECT * shows the timestamp columns, don't we need to hide the period
timestamp columns from this query?


The sql standard didn't dictate hiding the column but i agree hiding it by
default is good thing because this columns are used by the system
to classified the data and not  needed in user side frequently. I can
change to that if we have consensus
 
I think UPDATE needs to update the start timestamp, but it doesn't. As
the result the timestamps doesn't represent the correct lifetime of
the row version and we wouldn't be able to pick up correct versions of
a row that exprerienced updates. (I didn't confirmed that because I
couldn't do "FOR SYSTEM_TIME AS OF" query due to syntax error..)


Right. It have to set both system time for inserted row and set row end time for
deleted row. I fix it
 
(Sorry in advance for possible pointless comments due to my lack of
access to the SQL11 standard.)  If we have the period-timestamp
columns before the last two columns, INSERT in a common way on the
table fails, which doesn't seem to me to be expected behavior:

 CREATE TABLE t (s timestamp(6) GENERATED ALWAYS AS ROW START, e timestamp(6) GENERATED ALWAYS AS ROW END, a int) WITH SYSTEM VERSIONING;
 INSERT INTO t (SELECT a FROM generate_series(0, 99) a);
 ERROR:  column "s" is of type timestamp without time zone but expression is of type integer


Its the same without the patch too
CREATE TABLE t (s timestamptz , e timestamptz, a int);
INSERT INTO t (SELECT a FROM generate_series(0, 99) a);
ERROR:  column "s" is of type timestamp with time zone but expression is of type integer
LINE 1: INSERT INTO t (SELECT a FROM generate_series(0, 99) a);
 
Some queries using SYSTEM_TIME which I think should be accepted ends
with error. Is the grammar part missing something?

 SELECT * FROM t FOR SYSTEM_TIME AS OF '2020-01-07 09:57:55';
 ERROR:  syntax error at or near "system_time"
 LINE 1: SELECT * FROM t FOR SYSTEM_TIME AS OF '2020-01-07 09:57:55';

 SELECT * FROM t FOR SYSTEM_TIME BETWEEN '2020-01-07 0:00:00' AND '2020-01-08 0:00:00';
 ERROR:  syntax error at or near "system_time"
 LINE 1: SELECT * FROM t FOR SYSTEM_TIME BETWEEN '2020-01-07 0:00:00'...


fixed


Other random comments (sorry for it not being comprehensive):

The patch at worst loops over all columns at every parse time. It is
quite ineffecient if there are many columns. We can store the column
indexes in relcache.

 
but its only for system versioned table.
 
If I'm not missing anything, alter table doesn't properly modify
existing data in the target table. AddSystemVersioning should fill in
start/end_timestamp with proper values and DropSystemVersioning shuold
remove rows no longer useful.


fixed


+makeAndExpr(Node *lexpr, Node *rexpr, int location)

 I believe that planner flattenes nested AND/ORs in
 eval_const_expressions(). Shouldn't we leave the work to the planner?


 
filter clause is added using makeAndExpr and planner can flat that if it sees fit
 
+makeConstraint(ConstrType type)

We didn't use such a function to make that kind of nodes. Maybe we
should use makeNode directly, or we need to change similar coding into
that using the function. Addition to that, setting .location to 1 is
wrong. "Unknown" location is -1.

done


Separately from that, temporal clauses is not restriction of a
table. So it seems wrong to me to use constraint mechamism for this
purpose.


we use constraint mechanism for similar thing like default value and
generated column
 
+makeSystemColumnDef(char *name)

"system column (or attribute)" is a column specially treated outside
of tuple descriptor. The temporal-period columns are not system
columns in that sense.

changed to makeTemporalColumnDef and use timestamptz for all
versioning purpose.

Attach is the patch that fix the above and uses Vik's regression test

regards
Surafel
Вложения

Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:
Hi,

On Tue, Mar 3, 2020 at 9:33 PM David Steele <david@pgmasters.net> wrote:
Hi Surafel,

On 1/3/20 5:57 AM, Surafel Temesgen wrote:
> Rebased and conflict resolved i hope it build clean this time

This patch no longer applies according to cfbot and there are a number
of review comments that don't seem to have been addressed yet.

The patch is not exactly new for this CF but since the first version was
posted 2020-01-01 and there have been no updates (except a rebase) since
then it comes pretty close.

Were you planning to work on this for PG13?  If so we'll need to see a
rebased and updated patch pretty soon.  My recommendation is that we
move this patch to PG14.


I agree with moving to  PG14 . Its hard to make it to PG13.

regards
Surafel

Re: WIP: System Versioned Temporal Table

От
David Steele
Дата:
On 3/10/20 9:00 AM, Surafel Temesgen wrote:
> On Tue, Mar 3, 2020 at 9:33 PM David Steele <david@pgmasters.net 
> <mailto:david@pgmasters.net>> wrote:
> 
>     The patch is not exactly new for this CF but since the first version
>     was
>     posted 2020-01-01 and there have been no updates (except a rebase)
>     since
>     then it comes pretty close.
> 
>     Were you planning to work on this for PG13?  If so we'll need to see a
>     rebased and updated patch pretty soon.  My recommendation is that we
>     move this patch to PG14.
> 
> I agree with moving to  PG14 . Its hard to make it to PG13.

The target version is now PG14.

Regards,
-- 
-David
david@pgmasters.net



Re: WIP: System Versioned Temporal Table

От
Eli Marmor
Дата:
Hi Surafel and the rest,

I'm the owner of the Israeli meetup group of PostgreSQL, and I'm interested in Temporality and have been trying for several years a few ways to add it to PostgreSQL
(all of them through extensions and external ways).
I'm happy that this is done by you internally (and a little bit disappointed that it's delayed again and again, but that's life...).
I'll be happy to join this effort.
I can't promise that I'll succeed to contribute anything, but first I want to play with it a little.
To save me several hours, can you advise me what is the best way to install it?
Which exact version of PG should I apply this patch to?

Thanks in advance, and thanks for your great work!
Eli

On Tue, Mar 31, 2020 at 10:04 PM Surafel Temesgen <surafel3000@gmail.com> wrote:
Hi,

On Tue, Mar 3, 2020 at 9:33 PM David Steele <david@pgmasters.net> wrote:
Hi Surafel,

On 1/3/20 5:57 AM, Surafel Temesgen wrote:
> Rebased and conflict resolved i hope it build clean this time

This patch no longer applies according to cfbot and there are a number
of review comments that don't seem to have been addressed yet.

The patch is not exactly new for this CF but since the first version was
posted 2020-01-01 and there have been no updates (except a rebase) since
then it comes pretty close.

Were you planning to work on this for PG13?  If so we'll need to see a
rebased and updated patch pretty soon.  My recommendation is that we
move this patch to PG14.


I agree with moving to  PG14 . Its hard to make it to PG13.

regards
Surafel

Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:


On Tue, Mar 31, 2020 at 10:12 PM Eli Marmor <eli@netmask.it> wrote:
Hi Surafel and the rest,

I'm the owner of the Israeli meetup group of PostgreSQL, and I'm interested in Temporality and have been trying for several years a few ways to add it to PostgreSQL
(all of them through extensions and external ways).
I'm happy that this is done by you internally (and a little bit disappointed that it's delayed again and again, but that's life...).
I'll be happy to join this effort.
I can't promise that I'll succeed to contribute anything, but first I want to play with it a little.
To save me several hours, can you advise me what is the best way to install it?
Which exact version of PG should I apply this patch to?

Thanks in advance, and thanks for your great work!
Eli



Hey Eli,
Sorry for my late reply. reviewing it is greatly appreciated. I attach rebased patch.
Please use git repo and it will work on current HEAD

regards
Surafel
Вложения

Re: WIP: System Versioned Temporal Table

От
Rémi Lapeyre
Дата:
Hi, thanks for working on this. I had planned to work on it and I’m looking forward to this natively in Postgres.

The patch builds with the following warnings:

plancat.c:2368:18: warning: variable 'name' is used uninitialized whenever 'for' loop exits because its condition is
false[-Wsometimes-uninitialized] 
        for (int i = 0; i < natts; i++)
                        ^~~~~~~~~
plancat.c:2379:9: note: uninitialized use occurs here
        return name;
               ^~~~
plancat.c:2368:18: note: remove the condition if it is always true
        for (int i = 0; i < natts; i++)
                        ^~~~~~~~~
plancat.c:2363:15: note: initialize the variable 'name' to silence this warning
        char       *name;
                        ^
                         = NULL
plancat.c:2396:18: warning: variable 'name' is used uninitialized whenever 'for' loop exits because its condition is
false[-Wsometimes-uninitialized] 
        for (int i = 0; i < natts; i++)
                        ^~~~~~~~~
plancat.c:2407:9: note: uninitialized use occurs here
        return name;
               ^~~~
plancat.c:2396:18: note: remove the condition if it is always true
        for (int i = 0; i < natts; i++)
                        ^~~~~~~~~
plancat.c:2391:15: note: initialize the variable 'name' to silence this warning
        char       *name;
                        ^
                         = NULL
2 warnings generated.


make check pass without issues, but make check-world fails for postgres_fdw, the diff is attached.


Before going further in the review, I’m a bit surprised by the quantity of code needed here. In
https://github.com/xocolatl/periodsthere is far less code and I would have expected the same here. For example, are the
changesto copy necessary or would it be possible to have a first patch the only implement the minimal changes required
forthis feature? 



Thanks a lot!

Rémi








Вложения

Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:
Hey Rémi,
Thank you for looking at it

On Sat, Jul 18, 2020 at 7:05 PM Rémi Lapeyre <remi.lapeyre@lenstra.fr> wrote:
Hi, thanks for working on this. I had planned to work on it and I’m looking forward to this natively in Postgres.

The patch builds with the following warnings:

plancat.c:2368:18: warning: variable 'name' is used uninitialized whenever 'for' loop exits because its condition is false [-Wsometimes-uninitialized]
        for (int i = 0; i < natts; i++)
                        ^~~~~~~~~
plancat.c:2379:9: note: uninitialized use occurs here
        return name;
               ^~~~
plancat.c:2368:18: note: remove the condition if it is always true
        for (int i = 0; i < natts; i++)
                        ^~~~~~~~~
plancat.c:2363:15: note: initialize the variable 'name' to silence this warning
        char       *name;
                        ^
                         = NULL
plancat.c:2396:18: warning: variable 'name' is used uninitialized whenever 'for' loop exits because its condition is false [-Wsometimes-uninitialized]
        for (int i = 0; i < natts; i++)
                        ^~~~~~~~~
plancat.c:2407:9: note: uninitialized use occurs here
        return name;
               ^~~~
plancat.c:2396:18: note: remove the condition if it is always true
        for (int i = 0; i < natts; i++)
                        ^~~~~~~~~
plancat.c:2391:15: note: initialize the variable 'name' to silence this warning
        char       *name;
                        ^
                         = NULL
2 warnings generated.




I wonder why my compiler didn't show me this

make check pass without issues, but make check-world fails for postgres_fdw, the diff is attached.


 Okay thanks the attached patch contains a fix for both issue


Before going further in the review, I’m a bit surprised by the quantity of code needed here. In https://github.com/xocolatl/periods there is far less code and I would have expected the same here. For example, are the changes to copy necessary or would it be possible to have a first patch the only implement the minimal changes required for this feature?


Yes  there not many c code in there because most of the logice is written in SQL

regards
Surafel
Вложения

Re: WIP: System Versioned Temporal Table

От
Michael Paquier
Дата:
On Tue, Jul 21, 2020 at 05:32:44PM +0300, Surafel Temesgen wrote:
> Thank you for looking at it

The patch is failing to apply.  Could you send a rebase please?
--
Michael

Вложения

Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:
Hi Michael,

On Tue, Sep 29, 2020 at 8:44 AM Michael Paquier <michael@paquier.xyz> wrote:

The patch is failing to apply.  Could you send a rebase please?

Attached is a rebased one.

regards
Surafel
Вложения

Re: WIP: System Versioned Temporal Table

От
Georgios Kokolatos
Дата:
Hi,

just a quick comment that this patch fails on the cfbot.

Cheers,
//Georgios

Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:

Attached is a rebased one.
regards
Surafel
Вложения

Re: WIP: System Versioned Temporal Table

От
Ryan Lambert
Дата:
On Thu, Nov 19, 2020 at 11:04 AM Surafel Temesgen <surafel3000@gmail.com> wrote:

Attached is a rebased one.
regards
Surafel

Thank you for your work on this!  The v7 patch fails on the current master branch.  Error from make:

gram.y:16695:1: error: static declaration of ‘makeAndExpr’ follows non-static declaration
 makeAndExpr(Node *lexpr, Node *rexpr, int location)
 ^~~~~~~~~~~
In file included from gram.y:58:0:
../../../src/include/nodes/makefuncs.h:108:14: note: previous declaration of ‘makeAndExpr’ was here
 extern Node *makeAndExpr(Node *lexpr, Node *rexpr, int location);
              ^~~~~~~~~~~
gram.y:16695:1: warning: ‘makeAndExpr’ defined but not used [-Wunused-function]
 makeAndExpr(Node *lexpr, Node *rexpr, int location)
 ^~~~~~~~~~~


The docs have two instances of "EndtTime" that should be "EndTime". 

Ryan Lambert

Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:
Hi Ryan,

On Fri, Dec 18, 2020 at 10:28 PM Ryan Lambert <ryan@rustprooflabs.com> wrote:
On Thu, Nov 19, 2020 at 11:04 AM Surafel Temesgen <surafel3000@gmail.com> wrote:

The docs have two instances of "EndtTime" that should be "EndTime". 

Since my first language is not english i'm glad you find only this error
on doc. I will send rebased pach soon 

regards
Surafel

Re: WIP: System Versioned Temporal Table

От
Masahiko Sawada
Дата:
Hi Surafel,

On Tue, Dec 22, 2020 at 3:01 AM Surafel Temesgen <surafel3000@gmail.com> wrote:
>
> Hi Ryan,
>
> On Fri, Dec 18, 2020 at 10:28 PM Ryan Lambert <ryan@rustprooflabs.com> wrote:
>>
>> On Thu, Nov 19, 2020 at 11:04 AM Surafel Temesgen <surafel3000@gmail.com> wrote:
>>
>> The docs have two instances of "EndtTime" that should be "EndTime".
>
>
> Since my first language is not english i'm glad you find only this error
> on doc. I will send rebased pach soon
>

The patch is not submitted yet. Are you planning to submit the updated
patch? Please also note the v7 patch cannot be applied to the current
HEAD. I'm switching the patch as Waiting on Author.

Regards,

-- 
Masahiko Sawada
EnterpriseDB:  https://www.enterprisedb.com/



Re: WIP: System Versioned Temporal Table

От
Simon Riggs
Дата:
On Mon, Jan 4, 2021 at 2:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Please also note the v7 patch cannot be applied to the current HEAD. I'm switching the patch as Waiting on Author.

Surafel, please say whether you are working on this or not. If you
need help, let us know.

On Tue, 7 Jan 2020 at 10:33, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> I think that the start and end timestamps represent the period where
> that version of the row was active. So UPDATE should modify the start
> timestamp of the new version to the same value with the end timestamp
> of the old version to the updated time. Thus, I don't think adding
> start timestamp to PK doesn't work as expected. That hinders us from
> rejecting rows with the same user-defined unique key because start
> timestamps is different each time of inserts. I think what Surafel is
> doing in the current patch is correct. Only end_timestamp = +infinity
> rejects another non-historical (= live) version with the same
> user-defined unique key.

The end_time needs to be updated when a row is updated, so it cannot
form part of the PK. If you try to force that to happen, then logical
replication will not work with system versioned tables, which would be
a bad thing. So *only* start_time should be added to the PK to make
this work. (A later comment also says the start_time needs to be
updated, which makes no sense!)

On Wed, 23 Oct 2019 at 21:03, Vik Fearing <vik.fearing@2ndquadrant.com> wrote:
> I don't see any error handling for transaction anomalies.  In READ
> COMMITTED, you can easily end up with a case where the end time comes
> before the start time.  I don't even see anything constraining start
> time to be strictly inferior to the end time.  Such a constraint will be
> necessary for application-time periods (which your patch doesn't address
> at all but that's okay).

I don't see how it can have meaning to have an end_time earlier than a
start_time, so yes that should be checked. Having said that, if we use
a statement timestamp on row insertion then, yes, the end_time could
be earlier than start time, so that is just wrong. Ideally we would
use commit timestamp and fill the values in later. So the only thing
that makes sense for me is to use the dynamic time of insertion while
we hold the buffer lock, otherwise we will get anomalies.

The work looks interesting and I will be doing a longer review.

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: WIP: System Versioned Temporal Table

От
Simon Riggs
Дата:
On Thu, Jan 7, 2021 at 5:59 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> On Mon, Jan 4, 2021 at 2:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > Please also note the v7 patch cannot be applied to the current HEAD. I'm switching the patch as Waiting on Author.
>
> Surafel, please say whether you are working on this or not. If you
> need help, let us know.

I've minimally rebased the patch to current head so that it compiles
and passes current make check.

From here, I will add further docs and tests to enhance review and
discover issues.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Вложения

Re: WIP: System Versioned Temporal Table

От
Simon Riggs
Дата:
On Fri, Jan 8, 2021 at 7:13 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:

> I've minimally rebased the patch to current head so that it compiles
> and passes current make check.

Full version attached

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Вложения

Re: WIP: System Versioned Temporal Table

От
Simon Riggs
Дата:
On Fri, Jan 8, 2021 at 7:34 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> On Fri, Jan 8, 2021 at 7:13 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> > I've minimally rebased the patch to current head so that it compiles
> > and passes current make check.
>
> Full version attached

New version attached with improved error messages, some additional
docs and a review of tests.

* UPDATE doesn't set EndTime correctly, so err... the patch doesn't
work on this aspect.
Everything else does actually work, AFAICS, so we "just" need a way to
update the END_TIME column in place...
So input from other Hackers/Committers needed on this point to see
what is acceptable.

* Anomalies around use of CURRENT_TIMESTAMP are not discussed or resolved

* No discussion, comments or tests around freezing and whether that
causes issues here

* What happens if you ask for a future time?
It will give an inconsistent result as it scans, so we should refuse a
query for time > current_timestamp.

* ALTER TABLE needs some work, it's a bit klugey at the moment and
needs extra tests.
Should refuse DROP COLUMN on a system time column

* Do StartTime and EndTime show in SELECT *? Currently, yes. Would
guess we wouldn't want them to, not sure what standard says.

* The syntax changes in gram.y probably need some coralling

Overall, it's a pretty good patch and worth working on more. I will
consider a recommendation on what to do with this.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Вложения

Re: WIP: System Versioned Temporal Table

От
Andrew Dunstan
Дата:
On 1/8/21 7:33 AM, Simon Riggs wrote:
>
> * What happens if you ask for a future time?
> It will give an inconsistent result as it scans, so we should refuse a
> query for time > current_timestamp.


That seems like a significant limitation. Can we fix it instead of
refusing the query?


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: WIP: System Versioned Temporal Table

От
Ryan Lambert
Дата:
On Fri, Jan 8, 2021 at 5:34 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
On Fri, Jan 8, 2021 at 7:34 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> On Fri, Jan 8, 2021 at 7:13 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> > I've minimally rebased the patch to current head so that it compiles
> > and passes current make check.
>
> Full version attached

New version attached with improved error messages, some additional
docs and a review of tests.


The v10 patch fails to make on the current master branch (15b824da).  Error:

make[2]: Entering directory '/var/lib/postgresql/git/postgresql/src/backend/parser'
'/usr/bin/perl' ./check_keywords.pl gram.y ../../../src/include/parser/kwlist.h
/usr/bin/bison -Wno-deprecated  -d -o gram.c gram.y
gram.y:3685.55-56: error: $4 of ‘ColConstraintElem’ has no declared type
  n->contype = ($4)->contype;
                                                       ^^
gram.y:3687.56-57: error: $4 of ‘ColConstraintElem’ has no declared type
  n->raw_expr = ($4)->raw_expr;
                                                        ^^
gram.y:3734.41-42: error: $$ of ‘generated_type’ has no declared type
  $$ = n;
                                         ^^
gram.y:3741.41-42: error: $$ of ‘generated_type’ has no declared type
  $$ = n;
                                         ^^
gram.y:3748.41-42: error: $$ of ‘generated_type’ has no declared type
  $$ = n;
                                         ^^
../../../src/Makefile.global:750: recipe for target 'gram.c' failed
make[2]: *** [gram.c] Error 1
make[2]: Leaving directory '/var/lib/postgresql/git/postgresql/src/backend/parser'
Makefile:137: recipe for target 'parser/gram.h' failed
make[1]: *** [parser/gram.h] Error 2
make[1]: Leaving directory '/var/lib/postgresql/git/postgresql/src/backend'
src/Makefile.global:389: recipe for target 'submake-generated-headers' failed
make: *** [submake-generated-headers] Error 2


* UPDATE doesn't set EndTime correctly, so err... the patch doesn't
work on this aspect.
Everything else does actually work, AFAICS, so we "just" need a way to
update the END_TIME column in place...
So input from other Hackers/Committers needed on this point to see
what is acceptable.

* Anomalies around use of CURRENT_TIMESTAMP are not discussed or resolved

* No discussion, comments or tests around freezing and whether that
causes issues here

* What happens if you ask for a future time?
It will give an inconsistent result as it scans, so we should refuse a
query for time > current_timestamp. 
* ALTER TABLE needs some work, it's a bit klugey at the moment and
needs extra tests.
Should refuse DROP COLUMN on a system time column

* Do StartTime and EndTime show in SELECT *? Currently, yes. Would
guess we wouldn't want them to, not sure what standard says.


I prefer to have them hidden by default.  This was mentioned up-thread with no decision, it seems the standard is ambiguous.  MS SQL appears to have flip-flopped on this decision [1].

> SELECT * shows the timestamp columns, don't we need to hide the period
> timestamp columns from this query?
>
>
The sql standard didn't dictate hiding the column but i agree hiding it by
default is good thing because this columns are used by the system
to classified the data and not needed in user side frequently. I can
change to that if we have consensus


 
* The syntax changes in gram.y probably need some coralling

Overall, it's a pretty good patch and worth working on more. I will
consider a recommendation on what to do with this.

--
Simon Riggs                http://www.EnterpriseDB.com/

I am increasingly interested in this feature and have heard others asking for this type of functionality.  I'll do my best to continue reviewing and testing.

Thanks,

Ryan Lambert


Re: WIP: System Versioned Temporal Table

От
Simon Riggs
Дата:
On Fri, Jan 8, 2021 at 4:50 PM Ryan Lambert <ryan@rustprooflabs.com> wrote:
>
> On Fri, Jan 8, 2021 at 5:34 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>>
>> On Fri, Jan 8, 2021 at 7:34 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>> >
>> > On Fri, Jan 8, 2021 at 7:13 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>> >
>> > > I've minimally rebased the patch to current head so that it compiles
>> > > and passes current make check.
>> >
>> > Full version attached
>>
>> New version attached with improved error messages, some additional
>> docs and a review of tests.
>>
>
> The v10 patch fails to make on the current master branch (15b824da).  Error:

Updated v11 with additional docs and some rewording of messages/tests
to use "system versioning" correctly.

No changes on the points previously raised.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Вложения

Re: WIP: System Versioned Temporal Table

От
Ryan Lambert
Дата:
On Fri, Jan 8, 2021 at 11:38 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
On Fri, Jan 8, 2021 at 4:50 PM Ryan Lambert <ryan@rustprooflabs.com> wrote:
>
> On Fri, Jan 8, 2021 at 5:34 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>>
>> On Fri, Jan 8, 2021 at 7:34 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>> >
>> > On Fri, Jan 8, 2021 at 7:13 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>> >
>> > > I've minimally rebased the patch to current head so that it compiles
>> > > and passes current make check.
>> >
>> > Full version attached
>>
>> New version attached with improved error messages, some additional
>> docs and a review of tests.
>>
>
> The v10 patch fails to make on the current master branch (15b824da).  Error:

Updated v11 with additional docs and some rewording of messages/tests
to use "system versioning" correctly.

No changes on the points previously raised.

--
Simon Riggs                http://www.EnterpriseDB.com/


Thank you!  The v11 applies and installs.  I tried a simple test, unfortunately it appears the versioning is not working. The initial value is not preserved through an update and a new row does not appear to be created.

CREATE TABLE t
(
    id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    v BIGINT NOT NULL
)
WITH SYSTEM VERSIONING
;

Verify start/end time columns created.

t=# \d t
                                        Table "public.t"
  Column   |           Type           | Collation | Nullable |             Default              
-----------+--------------------------+-----------+----------+----------------------------------
 id        | bigint                   |           | not null | generated by default as identity
 v         | bigint                   |           | not null |
 StartTime | timestamp with time zone |           | not null | generated always as row start
 EndTime   | timestamp with time zone |           | not null | generated always as row end
Indexes:
    "t_pkey" PRIMARY KEY, btree (id, "EndTime")


Add a row and check the timestamps set as expected.


INSERT INTO t (v) VALUES (1);

 SELECT * FROM t;
 id | v |           StartTime           | EndTime  
----+---+-------------------------------+----------
  1 | 1 | 2021-01-08 20:56:20.848097+00 | infinity

Update the row.

UPDATE t SET v = -1;

The value for v updated but StartTime is the same.


SELECT * FROM t;
 id | v  |           StartTime           | EndTime  
----+----+-------------------------------+----------
  1 | -1 | 2021-01-08 20:56:20.848097+00 | infinity


Querying the table for all versions only returns the single updated row (v = -1) with the original row StartTime.  The original value has disappeared entirely it seems.

SELECT * FROM t
FOR SYSTEM_TIME FROM '-infinity' TO 'infinity';


I also created a non-versioned table and later added the columns using ALTER TABLE and encountered the same behavior.


Ryan Lambert



 

Re: WIP: System Versioned Temporal Table

От
Simon Riggs
Дата:
On Fri, Jan 8, 2021 at 9:19 PM Ryan Lambert <ryan@rustprooflabs.com> wrote:

>> Updated v11 with additional docs and some rewording of messages/tests
>> to use "system versioning" correctly.
>>
>> No changes on the points previously raised.
>>
> Thank you!  The v11 applies and installs.  I tried a simple test, unfortunately it appears the versioning is not
working.The initial value is not preserved through an update and a new row does not appear to be created.
 

Agreed. I already noted this in my earlier review comments.

I will send in a new version with additional tests so we can see more
clearly that the tests fail on the present patch.

I will post more on this next week.

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: WIP: System Versioned Temporal Table

От
Simon Riggs
Дата:
On Sat, Jan 9, 2021 at 10:39 AM Simon Riggs
<simon.riggs@enterprisedb.com> wrote:
>
> On Fri, Jan 8, 2021 at 9:19 PM Ryan Lambert <ryan@rustprooflabs.com> wrote:
>
> >> Updated v11 with additional docs and some rewording of messages/tests
> >> to use "system versioning" correctly.
> >>
> >> No changes on the points previously raised.
> >>
> > Thank you!  The v11 applies and installs.  I tried a simple test, unfortunately it appears the versioning is not
working.The initial value is not preserved through an update and a new row does not appear to be created.
 
>
> Agreed. I already noted this in my earlier review comments.

I'm pleased to note that UPDATE-not-working was a glitch, possibly in
an earlier patch merge. That now works as advertised.

I've added fairly clear SGML docs to explain how the current patch
works, which should assist wider review.

Also moved test SQL around a bit, renamed some things in code for
readability, but not done any structural changes.

This is looking much better now... with the TODO/issues list now
looking like this...

* Anomalies around use of CURRENT_TIMESTAMP are not discussed or resolved.
Probably need to add a test that end_timestamp > start_timestamp or ERROR,
which effectively enforces serializability.

* No discussion, comments or tests around freezing and whether that
causes issues here

* What happens if you ask for a future time?
It will give an inconsistent result as it scans, so we should refuse a
query for time > current_timestamp.

* ALTER TABLE needs some work, it's a bit klugey at the moment and
needs extra tests.
Should refuse DROP COLUMN on a system time column, but currently doesn't

* UPDATE foo SET start_timestamp = DEFAULT should fail but currently doesn't

* Do StartTime and EndTime show in SELECT *? Currently, yes. Would
guess we wouldn't want them to, not sure what standard says.

From here, the plan would be to set this to "Ready For Committer" in
about a week. That is not the same thing as me saying it is
ready-for-commit, but we need some more eyes on this patch to decide
if it is something we want and, if so, are the code changes cool.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Вложения

Re: WIP: System Versioned Temporal Table

От
Ryan Lambert
Дата:
On Mon, Jan 11, 2021 at 7:02 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
On Sat, Jan 9, 2021 at 10:39 AM Simon Riggs
<simon.riggs@enterprisedb.com> wrote:
>
> On Fri, Jan 8, 2021 at 9:19 PM Ryan Lambert <ryan@rustprooflabs.com> wrote:
>
> >> Updated v11 with additional docs and some rewording of messages/tests
> >> to use "system versioning" correctly.
> >>
> >> No changes on the points previously raised.
> >>
> > Thank you!  The v11 applies and installs.  I tried a simple test, unfortunately it appears the versioning is not working. The initial value is not preserved through an update and a new row does not appear to be created.
>
> Agreed. I already noted this in my earlier review comments.

I'm pleased to note that UPDATE-not-working was a glitch, possibly in
an earlier patch merge. That now works as advertised.

It is working as expected now, Thank you! 
 
I've added fairly clear SGML docs to explain how the current patch
works, which should assist wider review.

The default column names changed to start_timestamp and end_timestamp.  A number of places in the docs still refer to StartTime and EndTime.  I prefer the new names without MixedCase.
 

Also moved test SQL around a bit, renamed some things in code for
readability, but not done any structural changes.

This is looking much better now... with the TODO/issues list now
looking like this...

* Anomalies around use of CURRENT_TIMESTAMP are not discussed or resolved.
Probably need to add a test that end_timestamp > start_timestamp or ERROR,
which effectively enforces serializability.

* No discussion, comments or tests around freezing and whether that
causes issues here

* What happens if you ask for a future time?
It will give an inconsistent result as it scans, so we should refuse a
query for time > current_timestamp.
* ALTER TABLE needs some work, it's a bit klugey at the moment and
needs extra tests.
Should refuse DROP COLUMN on a system time column, but currently doesn't

* UPDATE foo SET start_timestamp = DEFAULT should fail but currently doesn't

* Do StartTime and EndTime show in SELECT *? Currently, yes. Would
guess we wouldn't want them to, not sure what standard says.

From here, the plan would be to set this to "Ready For Committer" in
about a week. That is not the same thing as me saying it is
ready-for-commit, but we need some more eyes on this patch to decide
if it is something we want and, if so, are the code changes cool.

 
Should I invest time now into further testing with more production-like scenarios on this patch?  Or would it be better to wait on putting effort into that until it has had more review?  I don't have much to offer for help on your current todo list.

Thanks, 

Ryan Lambert


Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:

Hi Andrew,
On Fri, Jan 8, 2021 at 4:38 PM Andrew Dunstan <andrew@dunslane.net> wrote:

On 1/8/21 7:33 AM, Simon Riggs wrote:
>
> * What happens if you ask for a future time?
> It will give an inconsistent result as it scans, so we should refuse a
> query for time > current_timestamp.


That seems like a significant limitation. Can we fix it instead of
refusing the query?



Querying  a table without system versioning with a value of non existent
data returns no record rather than error out or have other behavior. i don't
understand the needs for special treatment here

regards
Surafel 


Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:
Hi Simon,
Thank you for all the work you does

On Mon, Jan 11, 2021 at 5:02 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote:


* Anomalies around use of CURRENT_TIMESTAMP are not discussed or resolved.
Probably need to add a test that end_timestamp > start_timestamp or ERROR,
which effectively enforces serializability.



This scenario doesn't happen. There are no possibility of a record being deleted or updated before inserting
 
* No discussion, comments or tests around freezing and whether that
causes issues here


This feature introduced no new issue regarding freezing. Adding
the doc about the table size growth because of a retention of old record seems
enough for me
 

* ALTER TABLE needs some work, it's a bit klugey at the moment and
needs extra tests.
Should refuse DROP COLUMN on a system time column, but currently doesn't

* UPDATE foo SET start_timestamp = DEFAULT should fail but currently doesn't


okay i will fix it
 
regards
Surafel

Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:
Hi Ryan

On Fri, Jan 8, 2021 at 7:50 PM Ryan Lambert <ryan@rustprooflabs.com> wrote:
I prefer to have them hidden by default.  This was mentioned up-thread with no decision, it seems the standard is ambiguous.  MS SQL appears to have flip-flopped on this decision [1].


I will change it to hidden by default if there are no objection

regards
Surafel

Re: WIP: System Versioned Temporal Table

От
Simon Riggs
Дата:
On Thu, Jan 14, 2021 at 5:46 PM Surafel Temesgen <surafel3000@gmail.com> wrote:

> On Fri, Jan 8, 2021 at 7:50 PM Ryan Lambert <ryan@rustprooflabs.com> wrote:
>>
>> I prefer to have them hidden by default.  This was mentioned up-thread with no decision, it seems the standard is
ambiguous. MS SQL appears to have flip-flopped on this decision [1].
 

I think the default should be like this:

SELECT * FROM foo FOR SYSTEM_TIME AS OF ...
should NOT include the Start and End timestamp columns
because this acts like a normal query just with a different snapshot timestamp

SELECT * FROM foo FOR SYSTEM_TIME BETWEEN x AND y
SHOULD include the Start and End timestamp columns
since this form of query can include multiple row versions for the
same row, so it makes sense to see the validity times

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: WIP: System Versioned Temporal Table

От
Simon Riggs
Дата:
On Thu, Jan 14, 2021 at 5:42 PM Surafel Temesgen <surafel3000@gmail.com> wrote:
>
> Hi Simon,
> Thank you for all the work you does

No problem.

> On Mon, Jan 11, 2021 at 5:02 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>>
>>
>>
>> * Anomalies around use of CURRENT_TIMESTAMP are not discussed or resolved.
>> Probably need to add a test that end_timestamp > start_timestamp or ERROR,
>> which effectively enforces serializability.
>>
>
>
> This scenario doesn't happen.

Yes, I think it can. The current situation is that the Start or End is
set to the Transaction Start Timestamp.
So if t2 starts before t1, then if t1 creates a row and t2 deletes it
then we will have start=t1 end=t2, but t2<t1
Your tests don't show that because it must happen concurrently.
We need to add an isolation test to show this, or to prove it doesn't happen.

> There are no possibility of a record being deleted or updated before inserting

Agreed, but that was not the point.

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: WIP: System Versioned Temporal Table

От
Ryan Lambert
Дата:

On Thu, Jan 14, 2021 at 2:22 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
On Thu, Jan 14, 2021 at 5:46 PM Surafel Temesgen <surafel3000@gmail.com> wrote:

> On Fri, Jan 8, 2021 at 7:50 PM Ryan Lambert <ryan@rustprooflabs.com> wrote:
>>
>> I prefer to have them hidden by default.  This was mentioned up-thread with no decision, it seems the standard is ambiguous.  MS SQL appears to have flip-flopped on this decision [1].

I think the default should be like this:

SELECT * FROM foo FOR SYSTEM_TIME AS OF ...
should NOT include the Start and End timestamp columns
because this acts like a normal query just with a different snapshot timestamp

SELECT * FROM foo FOR SYSTEM_TIME BETWEEN x AND y
SHOULD include the Start and End timestamp columns
since this form of query can include multiple row versions for the
same row, so it makes sense to see the validity times

+1

Ryan Lambert 

Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:


On Fri, Jan 15, 2021 at 12:27 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:

Yes, I think it can. The current situation is that the Start or End is
set to the Transaction Start Timestamp.
So if t2 starts before t1, then if t1 creates a row and t2 deletes it
then we will have start=t1 end=t2, but t2<t1
Your tests don't show that because it must happen concurrently.
We need to add an isolation test to show this, or to prove it doesn't happen.



Does MVCC allow that? i am not expert on MVCC but i don't
think t2 can see the row create by translation started before
itself

regards
Surafel

Re: WIP: System Versioned Temporal Table

От
Simon Riggs
Дата:
On Fri, Jan 15, 2021 at 4:46 PM Surafel Temesgen <surafel3000@gmail.com> wrote:
>
>
>
> On Fri, Jan 15, 2021 at 12:27 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>>
>>
>> Yes, I think it can. The current situation is that the Start or End is
>> set to the Transaction Start Timestamp.
>> So if t2 starts before t1, then if t1 creates a row and t2 deletes it
>> then we will have start=t1 end=t2, but t2<t1
>> Your tests don't show that because it must happen concurrently.
>> We need to add an isolation test to show this, or to prove it doesn't happen.
>>
>
>
> Does MVCC allow that? i am not expert on MVCC but i don't
> think t2 can see the row create by translation started before
> itself

Yeh. Read Committed mode can see anything committed prior to the start
of the current statement, but UPDATEs always update the latest version
even if they can't see it.

Anyway, isolationtester spec file needed to test this. See
src/test/isolation and examples in specs/ directory

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:


On Fri, Jan 15, 2021 at 12:22 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
SELECT * FROM foo FOR SYSTEM_TIME AS OF ...
should NOT include the Start and End timestamp columns
because this acts like a normal query just with a different snapshot timestamp

SELECT * FROM foo FOR SYSTEM_TIME BETWEEN x AND y
SHOULD include the Start and End timestamp columns
since this form of query can include multiple row versions for the
same row, so it makes sense to see the validity times


One disadvantage of returning system time columns is it
breaks upward compatibility. if an existing application wants to
switch to system versioning it will break.

regards
Surafel

Re: WIP: System Versioned Temporal Table

От
Simon Riggs
Дата:
On Fri, Jan 15, 2021 at 4:56 PM Surafel Temesgen <surafel3000@gmail.com> wrote:
>
>
>
> On Fri, Jan 15, 2021 at 12:22 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>>
>> SELECT * FROM foo FOR SYSTEM_TIME AS OF ...
>> should NOT include the Start and End timestamp columns
>> because this acts like a normal query just with a different snapshot timestamp
>>
>> SELECT * FROM foo FOR SYSTEM_TIME BETWEEN x AND y
>> SHOULD include the Start and End timestamp columns
>> since this form of query can include multiple row versions for the
>> same row, so it makes sense to see the validity times
>>
>
> One disadvantage of returning system time columns is it
> breaks upward compatibility. if an existing application wants to
> switch to system versioning it will break.

There are no existing applications, so for PostgreSQL, it wouldn't be an issue.

If you mean compatibility with other databases, that might be an
argument to do what others have done. What have other databases done
for SELECT * ?

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: WIP: System Versioned Temporal Table

От
legrand legrand
Дата:
Hello,

it seems that Oracle (11R2) doesn't add the Start and End timestamp columns 
and permit statement like

select * from tt
union
select * from tt
AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '6' SECOND)
minus 
select * from tt
VERSIONS BETWEEN TIMESTAMP (SYSTIMESTAMP - INTERVAL '6' second) and
SYSTIMESTAMP;

Regards
PAscal



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 1/14/21 10:22 PM, Simon Riggs wrote:
> On Thu, Jan 14, 2021 at 5:46 PM Surafel Temesgen <surafel3000@gmail.com> wrote:
> 
>> On Fri, Jan 8, 2021 at 7:50 PM Ryan Lambert <ryan@rustprooflabs.com> wrote:
>>>
>>> I prefer to have them hidden by default.  This was mentioned up-thread with no decision, it seems the standard is
ambiguous. MS SQL appears to have flip-flopped on this decision [1].
 
> 
> I think the default should be like this:
> 
> SELECT * FROM foo FOR SYSTEM_TIME AS OF ...
> should NOT include the Start and End timestamp columns
> because this acts like a normal query just with a different snapshot timestamp
> 
> SELECT * FROM foo FOR SYSTEM_TIME BETWEEN x AND y
> SHOULD include the Start and End timestamp columns
> since this form of query can include multiple row versions for the
> same row, so it makes sense to see the validity times


I don't read the standard as being ambiguous about this at all.  The
columns should be shown just like any other column of the table.

I am not opposed to being able to set an attribute on columns allowing
them to be excluded from "*" but that is irrelevant to this patch.
-- 
Vik Fearing



Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 1/14/21 6:42 PM, Surafel Temesgen wrote:
> Hi Simon,
> Thank you for all the work you does
> 
> On Mon, Jan 11, 2021 at 5:02 PM Simon Riggs <simon.riggs@enterprisedb.com>
> wrote:
> 
>>
>>
>> * Anomalies around use of CURRENT_TIMESTAMP are not discussed or resolved.
>> Probably need to add a test that end_timestamp > start_timestamp or ERROR,
>> which effectively enforces serializability.
>>
>>
> 
> This scenario doesn't happen.

It *does* happen and the standard even provides a specific error code
for it (2201H).

Please look at my extension for this feature which implements all the
requirements of the standard (except syntax grammar, of course).
https://github.com/xocolatl/periods/
-- 
Vik Fearing



Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:


On Fri, Jan 15, 2021 at 8:02 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote:

There are no existing applications, so for PostgreSQL, it wouldn't be an issue.


Yes we don't have but the main function of ALTER TABLE foo ADD SYSTEM VERSIONING
is to add system versioning functionality to existing application

regards
Surafel  

Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 1/16/21 7:39 PM, Surafel Temesgen wrote:
> On Fri, Jan 15, 2021 at 8:02 PM Simon Riggs <simon.riggs@enterprisedb.com>
> wrote:
> 
>>
>> There are no existing applications, so for PostgreSQL, it wouldn't be an
>> issue.
>>
>>
> Yes we don't have but the main function of ALTER TABLE foo ADD SYSTEM
> VERSIONING
> is to add system versioning functionality to existing application

I haven't looked at this patch in a while, but I hope that ALTER TABLE t
ADD SYSTEM VERSIONING is not adding any columns.  That is a bug if it does.
-- 
Vik Fearing



Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:


On Sat, Jan 16, 2021 at 10:12 PM Vik Fearing <vik@postgresfriends.org> wrote:

I haven't looked at this patch in a while, but I hope that ALTER TABLE t
ADD SYSTEM VERSIONING is not adding any columns.  That is a bug if it does.


Yes, that is how I implement it. I don't understand how it became a bug?

regards
Surafel

Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 1/17/21 5:46 PM, Surafel Temesgen wrote:
> On Sat, Jan 16, 2021 at 10:12 PM Vik Fearing <vik@postgresfriends.org>
> wrote:
> 
>>
>> I haven't looked at this patch in a while, but I hope that ALTER TABLE t
>> ADD SYSTEM VERSIONING is not adding any columns.  That is a bug if it does.
>>
>>
> Yes, that is how I implement it. I don't understand how it became a bug?

This is not good, and I see that DROP SYSTEM VERSIONING also removes
these columns which is even worse.  Please read the standard that you
are trying to implement!

I will do a more thorough review of the functionalities in this patch
(not necessarily the code) this week.
-- 
Vik Fearing



Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:


On Mon, Jan 18, 2021 at 1:43 AM Vik Fearing <vik@postgresfriends.org> wrote:

This is not good, and I see that DROP SYSTEM VERSIONING also removes
these columns which is even worse.  Please read the standard that you
are trying to implement!


The standard states the function of ALTER TABLE ADD SYSTEM VERSIONING
as  "Alter a regular persistent base table to a system-versioned table" and
system versioned table is described in the standard by two generated
stored constraint columns and implemented as such.
 
I will do a more thorough review of the functionalities in this patch
(not necessarily the code) this week.


Please do

regards
Surafel

Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 1/11/21 3:02 PM, Simon Riggs wrote:
> * UPDATE foo SET start_timestamp = DEFAULT should fail but currently doesn't

I'm still in the weeds of reviewing this patch, but why should this
fail?  It should not fail.
-- 
Vik Fearing



Re: WIP: System Versioned Temporal Table

От
Simon Riggs
Дата:
On Tue, Jan 26, 2021 at 11:33 AM Vik Fearing <vik@postgresfriends.org> wrote:
>
> On 1/11/21 3:02 PM, Simon Riggs wrote:
> > * UPDATE foo SET start_timestamp = DEFAULT should fail but currently doesn't
>
> I'm still in the weeds of reviewing this patch, but why should this
> fail?  It should not fail.

It should not be possible for the user to change the start or end
timestamp of a system_time time range, by definition.

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 1/26/21 1:16 PM, Simon Riggs wrote:
> On Tue, Jan 26, 2021 at 11:33 AM Vik Fearing <vik@postgresfriends.org> wrote:
>>
>> On 1/11/21 3:02 PM, Simon Riggs wrote:
>>> * UPDATE foo SET start_timestamp = DEFAULT should fail but currently doesn't
>>
>> I'm still in the weeds of reviewing this patch, but why should this
>> fail?  It should not fail.
> 
> It should not be possible for the user to change the start or end
> timestamp of a system_time time range, by definition.

Correct, but setting it to DEFAULT is not changing it.

See also SQL:2016 11.5 <default clause> General Rule 3.a.
-- 
Vik Fearing



Re: WIP: System Versioned Temporal Table

От
Simon Riggs
Дата:
On Tue, Jan 26, 2021 at 12:51 PM Vik Fearing <vik@postgresfriends.org> wrote:
>
> On 1/26/21 1:16 PM, Simon Riggs wrote:
> > On Tue, Jan 26, 2021 at 11:33 AM Vik Fearing <vik@postgresfriends.org> wrote:
> >>
> >> On 1/11/21 3:02 PM, Simon Riggs wrote:
> >>> * UPDATE foo SET start_timestamp = DEFAULT should fail but currently doesn't
> >>
> >> I'm still in the weeds of reviewing this patch, but why should this
> >> fail?  It should not fail.
> >
> > It should not be possible for the user to change the start or end
> > timestamp of a system_time time range, by definition.
>
> Correct, but setting it to DEFAULT is not changing it.
>
> See also SQL:2016 11.5 <default clause> General Rule 3.a.

Thanks for pointing this out. Identity columns don't currently work that way.

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: WIP: System Versioned Temporal Table

От
Surafel Temesgen
Дата:


On Tue, Jan 26, 2021 at 2:33 PM Vik Fearing <vik@postgresfriends.org> wrote:
I'm still in the weeds of reviewing this patch, but why should this
fail?  It should not fail.

Attached is rebased patch that include isolation test

regards
Surafel
Вложения

RE: WIP: System Versioned Temporal Table

От
easteregg@verfriemelt.org
Дата:
hi,

i tested the temporal patch ( https://commitfest.postgresql.org/26/2316/ ) with the current 14devel applied ontop of
ef3d461without any conflicts. 
i build with no special options passed to ./configure and noticed, that the postgresql-client-13 from the debian
repositoriescrashes with the \d command 

to reproduce the issue:

  CREATE TABLE test (
    id int PRIMARY KEY generated ALWAYS AS IDENTITY,
    name text NOT NULL,
    start_timestamp timestamp with time zone GENERATED ALWAYS AS ROW START,
    end_timestamp timestamp with time zone GENERATED ALWAYS AS ROW END,
    PERIOD FOR SYSTEM_TIME (start_timestamp, end_timestamp)
  );

  \d test

it failes after outputting the table informations with this backtrace:

  free(): invalid pointer
  [1]    587783 abort (core dumped)  psql -X -U easteregg -h localhost postgres

  (gdb) bt 50
  #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
  #1  0x00007f21a62e0537 in __GI_abort () at abort.c:79
  #2  0x00007f21a6339768 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f21a6447e31 "%s\n") at
../sysdeps/posix/libc_fatal.c:155
  #3  0x00007f21a6340a5a in malloc_printerr (str=str@entry=0x7f21a644605e "free(): invalid pointer") at malloc.c:5347
  #4  0x00007f21a6341c14 in _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:4173
  #5  0x000055c9fa47b602 in printTableCleanup (content=content@entry=0x7ffece7e41c0) at
./build/../src/fe_utils/print.c:3250
  #6  0x000055c9fa444aa3 in describeOneTableDetails (schemaname=<optimized out>, schemaname@entry=0x55c9fbebfee6
"public",relationname=<optimized out>, oid=oid@entry=0x55c9fbebfee0 "16436", verbose=verbose@entry=false) at
./build/../src/bin/psql/describe.c:3337
  #7  0x000055c9fa4490c9 in describeTableDetails (pattern=pattern@entry=0x55c9fbebf540 "abk",
verbose=verbose@entry=false,showSystem=<optimized out>) at ./build/../src/bin/psql/describe.c:1421 
  #8  0x000055c9fa4372ff in exec_command_d (scan_state=scan_state@entry=0x55c9fbebd130,
active_branch=active_branch@entry=true,cmd=cmd@entry=0x55c9fbebf430 "d") at ./build/../src/bin/psql/command.c:722 
  #9  0x000055c9fa43ae2b in exec_command (previous_buf=0x55c9fbebd3a0, query_buf=0x55c9fbebd270, cstack=0x55c9fbebd250,
scan_state=0x55c9fbebd130,cmd=0x55c9fbebf430 "d") at ./build/../src/bin/psql/command.c:317 
  #10 HandleSlashCmds (scan_state=scan_state@entry=0x55c9fbebd130, cstack=cstack@entry=0x55c9fbebd250,
query_buf=0x55c9fbebd270,previous_buf=0x55c9fbebd3a0) at ./build/../src/bin/psql/command.c:220 
  #11 0x000055c9fa4539e0 in MainLoop (source=0x7f21a6479980 <_IO_2_1_stdin_>) at ./build/../src/bin/psql/mainloop.c:502
  #12 0x000055c9fa433d64 in main (argc=<optimized out>, argv=0x7ffece7e47f8) at ./build/../src/bin/psql/startup.c:441

the client is this version:

  apt-cache policy postgresql-client-13
  postgresql-client-13:
    Installed: 13.1-1.pgdg+2+b3
    Candidate: 13.1-1.pgdg+2+b3
    Version table:
   *** 13.1-1.pgdg+2+b3 100
          100 http://apt.postgresql.org/pub/repos/apt sid-pgdg-testing/main amd64 Packages
          100 /var/lib/dpkg/status

the the 14devel version from my build or a selfcompiled REL_13_STABLE client will not crash.
i was wondering if this might pose a security concern.


i am a bit out of my depths here, but would be glad to help, if any informations are missing
with kind regards,
richard



Re: WIP: System Versioned Temporal Table

От
vignesh C
Дата:
On Mon, Mar 8, 2021 at 11:04 PM Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote:
>
>
>
> On Thu, Feb 25, 2021 at 3:28 PM Li Japin <japinli@hotmail.com> wrote:
>>
>>
>> On Jan 27, 2021, at 12:39 AM, Surafel Temesgen <surafel3000@gmail.com> wrote:
>>
>>
>>
>> On Tue, Jan 26, 2021 at 2:33 PM Vik Fearing <vik@postgresfriends.org> wrote:
>>>
>>> I'm still in the weeds of reviewing this patch, but why should this
>>> fail?  It should not fail.
>>
>>
>> Attached is rebased patch that include isolation test
>>
>>
>> Thanks for updating the patch.  However it cannot apply to master (e5d8a9990).
>>
>> Here are some comments on system-versioning-temporal-table_2021_v13.patch.
>>
>> +</programlisting>
>> +    When system versioning is specified two columns are added which
>> +    record the start timestamp and end timestamp of each row verson.
>>
>> verson -> version
>>
>> +    By default, the column names will be StartTime and EndTime, though
>> +    you can specify different names if you choose.
>>
>> In fact, it is start_time and end_time, not StartTime and EndTime.
>> I think it's better to use <literal> label around start_time and end_time.
>>
>> +    column will be automatically added to the Primary Key of the
>> +    table.
>>
>> Should we mention the unique constraints?
>>
>> +    The system versioning period end column will be added to the
>> +    Primary Key of the table as a way of ensuring that concurrent
>> +    INSERTs conflict correctly.
>>
>> Same as above.
>>
>> Since the get_row_start_time_col_name() and get_row_end_time_col_name()
>> are similar, IMO we can pass a flag to get StartTime/EndTime column name,
>> thought?
>>
>> --
>> Regrads,
>> Japin Li.
>> ChengDu WenWu Information Technology Co.,Ltd.
>>
> The patch (system-versioning-temporal-table_2021_v13.patch) does not apply successfully.
>
> http://cfbot.cputube.org/patch_32_2316.log
>
> Hunk #1 FAILED at 80.
> 1 out of 1 hunk FAILED -- saving rejects to file src/test/regress/parallel_schedule.rej
> patching file src/test/regress/serial_schedule
> Hunk #1 succeeded at 126 (offset -1 lines).
>
>
> Therefore it is a minor change so I rebased the patch, please take a look at that.

The patch does not apply on Head anymore, could you rebase and post a
patch. I'm changing the status to "Waiting for Author".

Regards,
Vignesh



Re: WIP: System Versioned Temporal Table

От
Simon Riggs
Дата:
On Wed, 14 Jul 2021 at 12:48, vignesh C <vignesh21@gmail.com> wrote:

> The patch does not apply on Head anymore, could you rebase and post a
> patch. I'm changing the status to "Waiting for Author".

OK, so I've rebased the patch against current master to take it to v15.

I've then worked on the patch some myself to make v16 (attached),
adding these things:

* Add code, docs and test to remove the potential anomaly where
endtime < starttime, using the sqlstate 2201H as pointed out by Vik
* Add code and tests to handle multiple changes in a transaction
correctly, according to SQL Std
* Add code and tests to make Foreign Keys behave correctly, according to SQL Std
* Fixed nascent bug in relcache setup code
* Various small fixes from Japin's review - thanks! I've used
starttime and endtime as default column names
* Additional tests and docs to show that the functionality works with
or without PKs on the table

I am now satisfied that the patch does not have any implementation
anomalies in behavioral design, but it is still a long way short in
code architecture.

There are various aspects still needing work. This is not yet ready
for Commit, but it is appropriate now to ask for initial design
guidance on architecture and code placement by a Committer, so I am
setting this to Ready For Committer, in the hope that we get the
review in SeptCF and a later version can be submitted for later commit
in JanCF. With the right input, this patch is about a person-month
away from being ready, assuming we don't hit any blocking issues.

Major Known Issues
* SQLStd says that it should not be possible to update historical
rows, but those tests show we fail to prevent that and there is code
marked NOT_USED in those areas
* The code is structured poorly around
parse-analyze/rewriter/optimizer/executor and that needs positive
design recommendations, rather than critical review
* Joins currently fail because of the botched way WHERE clauses are
added, resulting in duplicate names
* Views probably don't work, but there are no tests
* CREATE TABLE (LIKE foo) doesn't correctly copy across all features -
test for that added, with test failure accepted for now
* ALTER TABLE is still incomplete and also broken; I suggest we remove
that for the first version of the patch to reduce patch size for an
initial commit.

Minor Known Issues
* Logical replication needs some minor work, no tests yet
* pg_dump support looks like it exists and might work easily, but
there are no tests yet
* Correlation names don't work in FROM clause - shift/reduce errors
from double use of AS
* Add test and code to prevent triggers referencing period cols in the
WHEN clause
* No tests yet to prove you can't set various parameters/settings on
the period time start/end cols
* Code needs some cleanup in a few places
* Not really sure what value is added by
lock-update-delete-system-versioned.spec

* IMHO we should make the PK definition use "endtime DESC", so that
the current version is always the first row found in the PK for any
key, since historical indexes will grow bigger over time

There are no expected issues with integration with MERGE, since SQLStd
explains how to handle that.

Other reviews are welcome.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Вложения

Re: WIP: System Versioned Temporal Table

От
Daniel Westermann
Дата:
Hi,

quick note: The documentation for this patch mentions:

 The <literal>starttime</literal> column
+    will be automatically added to the Primary Key of the table.

A quick tests shows that the endtime column is added instead:

postgres=# create table t1 ( a int primary key generated always as identity, b text ) with system versioning;
CREATE TABLE
postgres=# \d t1
                                      Table "public.t1"
  Column   |           Type           | Collation | Nullable |            Default            
-----------+--------------------------+-----------+----------+-------------------------------
 a         | integer                  |           | not null | generated always as identity
 b         | text                     |           |          | 
 starttime | timestamp with time zone |           | not null | generated always as row start
 endtime   | timestamp with time zone |           | not null | generated always as row end
Indexes:
    "t1_pkey" PRIMARY KEY, btree (a, endtime)

Regards
Daniel

Re: WIP: System Versioned Temporal Table

От
Jaime Casanova
Дата:
Hi,

This doesn't pass tests because of lack of some file. Can we fix that please and send the patch again?

On Tue, Aug 10, 2021 at 7:20 AM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
On Wed, 14 Jul 2021 at 12:48, vignesh C <vignesh21@gmail.com> wrote:

> The patch does not apply on Head anymore, could you rebase and post a
> patch. I'm changing the status to "Waiting for Author".

OK, so I've rebased the patch against current master to take it to v15.

I've then worked on the patch some myself to make v16 (attached),
adding these things:

* Add code, docs and test to remove the potential anomaly where
endtime < starttime, using the sqlstate 2201H as pointed out by Vik
* Add code and tests to handle multiple changes in a transaction
correctly, according to SQL Std
* Add code and tests to make Foreign Keys behave correctly, according to SQL Std
* Fixed nascent bug in relcache setup code
* Various small fixes from Japin's review - thanks! I've used
starttime and endtime as default column names
* Additional tests and docs to show that the functionality works with
or without PKs on the table

I am now satisfied that the patch does not have any implementation
anomalies in behavioral design, but it is still a long way short in
code architecture.

There are various aspects still needing work. This is not yet ready
for Commit, but it is appropriate now to ask for initial design
guidance on architecture and code placement by a Committer, so I am
setting this to Ready For Committer, in the hope that we get the
review in SeptCF and a later version can be submitted for later commit
in JanCF. With the right input, this patch is about a person-month
away from being ready, assuming we don't hit any blocking issues.

Major Known Issues
* SQLStd says that it should not be possible to update historical
rows, but those tests show we fail to prevent that and there is code
marked NOT_USED in those areas
* The code is structured poorly around
parse-analyze/rewriter/optimizer/executor and that needs positive
design recommendations, rather than critical review
* Joins currently fail because of the botched way WHERE clauses are
added, resulting in duplicate names
* Views probably don't work, but there are no tests
* CREATE TABLE (LIKE foo) doesn't correctly copy across all features -
test for that added, with test failure accepted for now
* ALTER TABLE is still incomplete and also broken; I suggest we remove
that for the first version of the patch to reduce patch size for an
initial commit.

Minor Known Issues
* Logical replication needs some minor work, no tests yet
* pg_dump support looks like it exists and might work easily, but
there are no tests yet
* Correlation names don't work in FROM clause - shift/reduce errors
from double use of AS
* Add test and code to prevent triggers referencing period cols in the
WHEN clause
* No tests yet to prove you can't set various parameters/settings on
the period time start/end cols
* Code needs some cleanup in a few places
* Not really sure what value is added by
lock-update-delete-system-versioned.spec

* IMHO we should make the PK definition use "endtime DESC", so that
the current version is always the first row found in the PK for any
key, since historical indexes will grow bigger over time

There are no expected issues with integration with MERGE, since SQLStd
explains how to handle that.

Other reviews are welcome.

--
Simon Riggs                http://www.EnterpriseDB.com/


--

Re: WIP: System Versioned Temporal Table

От
Jaime Casanova
Дата:
On Tue, Aug 10, 2021 at 01:20:14PM +0100, Simon Riggs wrote:
> On Wed, 14 Jul 2021 at 12:48, vignesh C <vignesh21@gmail.com> wrote:
> 
> > The patch does not apply on Head anymore, could you rebase and post a
> > patch. I'm changing the status to "Waiting for Author".
> 
> OK, so I've rebased the patch against current master to take it to v15.
> 
> I've then worked on the patch some myself to make v16 (attached),
> adding these things:
> 

Hi Simon,

This one doesn't apply nor compile anymore.
Can we expect a rebase soon?


-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL



Re: WIP: System Versioned Temporal Table

От
Simon Riggs
Дата:
On Fri, 10 Sept 2021 at 19:30, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
>
> On Tue, Aug 10, 2021 at 01:20:14PM +0100, Simon Riggs wrote:
> > On Wed, 14 Jul 2021 at 12:48, vignesh C <vignesh21@gmail.com> wrote:
> >
> > > The patch does not apply on Head anymore, could you rebase and post a
> > > patch. I'm changing the status to "Waiting for Author".
> >
> > OK, so I've rebased the patch against current master to take it to v15.
> >
> > I've then worked on the patch some myself to make v16 (attached),
> > adding these things:
> >
>
> Hi Simon,
>
> This one doesn't apply nor compile anymore.
> Can we expect a rebase soon?

Hi Jaime,

Sorry for not replying.

Yes, I will rebase again to assist the design input I have requested.
Please expect that on Sep 15.

Cheers

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: WIP: System Versioned Temporal Table

От
Corey Huinker
Дата:
On Sun, Sep 12, 2021 at 12:02 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
On Fri, 10 Sept 2021 at 19:30, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
>
> On Tue, Aug 10, 2021 at 01:20:14PM +0100, Simon Riggs wrote:
> > On Wed, 14 Jul 2021 at 12:48, vignesh C <vignesh21@gmail.com> wrote:
> >
> > > The patch does not apply on Head anymore, could you rebase and post a
> > > patch. I'm changing the status to "Waiting for Author".
> >
> > OK, so I've rebased the patch against current master to take it to v15.
> >
> > I've then worked on the patch some myself to make v16 (attached),
> > adding these things:
> >
>
> Hi Simon,
>
> This one doesn't apply nor compile anymore.
> Can we expect a rebase soon?

Hi Jaime,

Sorry for not replying.

Yes, I will rebase again to assist the design input I have requested.
Please expect that on Sep 15.

Cheers

--
Simon Riggs                http://www.EnterpriseDB.com/



I've been interested in this patch, especially with how it will interoperate with the work on application periods in https://www.postgresql.org/message-id/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com . I've written up a few observations and questions in that thread, and wanted to do the same here, as the questions are a bit narrower but no less interesting.

1. Much of what I have read about temporal tables seemed to imply or almost assume that system temporal tables would be implemented as two actual separate tables. Indeed, SQLServer appears to do it that way [1] with syntax like

WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.WebsiteUserInfoHistory));

Q 1.1. Was that implementation considered and if so, what made this implementation more appealing?

2. The endtime column constraint which enforces GENERATED ALWAYS AS ROW END seems like it would have appeal outside of system versioning, as a lot of tables have a last_updated column, and it would be nice if it could handle itself and not rely on fallible application programmers or require trigger overhead.

Q 2.1. Is that something we could break out into its own patch?

3. It is possible to have bi-temporal tables (having both a system_time period and a named application period) as described in [2], the specific example was

CREATE TABLE Emp(
  ENo INTEGER,
  EStart DATE,
  EEnd DATE,
  EDept INTEGER,
  PERIOD FOR EPeriod (EStart, EEnd),
  Sys_start TIMESTAMP(12) GENERATED ALWAYS AS ROW START,
  Sys_end TIMESTAMP(12) GENERATED ALWAYS AS ROW END,
  EName VARCHAR(30),
  PERIOD FOR SYSTEM_TIME(Sys_start, Sys_end),
  PRIMARY KEY (ENo, EPeriod WITHOUT OVERLAPS),
  FOREIGN KEY (Edept, PERIOD EPeriod) REFERENCES Dept (DNo, PERIOD DPeriod)
) WITH SYSTEM VERSIONING

What's interesting here is that in the case of a bitemporal table, it was the application period that got the defined primary key. The paper went on that only the _current_ rows of the table needed to be unique for, as it wasn't possible to create rows with past system temporal values. This sounds like a partial index to me, and luckily postgres can do referential integrity on any unique index, not just primary keys. In light of the assumption of a history side-table, I guess I shouldn't be surprised.

Q 3.1. Do you think that it would be possible to implement system versioning with just a unique index?
Q 3.2. Are there any barriers to using a partial index as the hitch for a foreign key? Would it be any different than the implied "and endtime = 'infinity'" that's already being done?

4. The choice of 'infinity' seemed like a good one initially - it's not null so it can be used in a primary key, it's not some hackish magic date like SQLServer's '9999-12-31 23:59:59.9999999'. However, it may not jibe as well with application versioning, which is built very heavily upon range types (and multirange types), and those ranges are capable of saying that a record is valid for an unbounded amount of time in the future, that's represented with NULL, not infinity. It could be awkward to have the system endtime be infinity and the application period endtime be NULL.

Q 4.1. Do you have any thoughts about how to resolve this?

5. System versioning columns were indicated with additional columns in pg_attribute.

Q 5.1. If you were to implement application versioning yourself, would you just add additional columns to pg_attribute for those?

6. The current effort to implement application versioning creates an INFORMATION_SCHEMA view called PERIODS. I wasn't aware of this one before but there seems to be precedent for it existing.

Q 6.1. Would system versioning belong in such a view?

7. This is a trifle, but the documentation is inconsistent about starttime vs StartTime and endtime vs EndTime.

8. Overall, I'm really excited about both of these efforts, and I'm looking for ways to combine the efforts, perhaps starting with a patch that implements the SQL syntax, but raises not-implemented errors, and each effort could then build off of that.

Re: WIP: System Versioned Temporal Table

От
Corey Huinker
Дата:


1. Much of what I have read about temporal tables seemed to imply or almost assume that system temporal tables would be implemented as two actual separate tables. Indeed, SQLServer appears to do it that way [1] with syntax like

WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.WebsiteUserInfoHistory));

Q 1.1. Was that implementation considered and if so, what made this implementation more appealing?


I've been digging some more on this point, and I've reached the conclusion that a separate history table is the better implementation. It would make the act of removing system versioning into little more than a DROP TABLE, plus adjusting the base table to reflect that it is no longer system versioned.

What do you think of this method:

1. The regular table remains unchanged, but a pg_class attribute named "relissystemversioned" would be set to true
2. I'm unsure if the standard allows dropping a column from a table while it is system versioned, and the purpose behind system versioning makes me believe the answer is a strong "no" and requiring DROP COLUMN to fail on relissystemversioned = 't' seems pretty straightforward.
3. The history table would be given a default name of $FOO_history (space permitting), but could be overridden with the history_table option.
4. The history table would have relkind = 'h'
5. The history table will only have rows that are not current, so it is created empty.
6. As such, the table is effectively append-only, in a way that vacuum can actually leverage, and likewise the fill factor of such a table should never be less than 100.
7. The history table could only be updated only via system defined triggers (insert,update,delete, alter to add columns), or row migration similar to that found in partitioning. It seems like this would work as the two tables working as partitions of the same table, but presently we can't have multi-parent partitions.
8. The history table would be indexed the same as the base table, except that all unique indexes would be made non-unique, and an index of pk + start_time + end_time would be added
9. The primary key of the base table would remain the existing pk vals, and would basically function normally, with triggers to carry forth changes to the history table. The net effect of this is that the end_time value of all rows in the main table would always be the chosen "current" value (infinity, null, 9999-12-31, etc) and as such might not actually _need_ to be stored.
10. Queries that omit the FOR SYSTEM_TIME clause, as well as ones that use FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP, would simply use the base table directly with no quals to add.
11. Queries that use FOR SYSTEM_TIME and not FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP, then the query would do a union of the base table and the history table with quals applied to both.
12. It's a fair question whether the history table would be something that could be queried directly. I'm inclined to say no, because that allows for things like SELECT FOR UPDATE, which of course we'd have to reject.
13. If a history table is directly referenceable, then SELECT permission can be granted or revoked as normal, but all insert/update/delete/truncate options would raise an error.
14. DROP SYSTEM VERSIONING from a table would be quite straightforward - the history table would be dropped along with the triggers that reference it, setting relissystemversioned = 'f' on the base table.

I think this would have some key advantages:

1. MVCC bloat is no worse than it was before.
2. No changes whatsoever to referential integrity.
3. DROP SYSTEM VERSIONING becomes an O(1) operation.

Thoughts?

I'm going to be making a similar proposal to the people doing the application time effort, but I'm very much hoping that we can reach some consensus and combine efforts.

 


 

Re: WIP: System Versioned Temporal Table

От
Simon Riggs
Дата:
On Sun, 19 Sept 2021 at 01:16, Corey Huinker <corey.huinker@gmail.com> wrote:
>>
>> 1. Much of what I have read about temporal tables seemed to imply or almost assume that system temporal tables would
beimplemented as two actual separate tables. Indeed, SQLServer appears to do it that way [1] with syntax like 
>>
>> WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.WebsiteUserInfoHistory));
>>
>>
>> Q 1.1. Was that implementation considered and if so, what made this implementation more appealing?
>>
>
> I've been digging some more on this point, and I've reached the conclusion that a separate history table is the
betterimplementation. It would make the act of removing system versioning into little more than a DROP TABLE, plus
adjustingthe base table to reflect that it is no longer system versioned. 

Thanks for giving this a lot of thought. When you asked the question
the first time you hadn't discussed how that might work, but now we
have something to discuss.

> 10. Queries that omit the FOR SYSTEM_TIME clause, as well as ones that use FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP,
wouldsimply use the base table directly with no quals to add. 
> 11. Queries that use FOR SYSTEM_TIME and not FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP, then the query would do a union
ofthe base table and the history table with quals applied to both. 


> 14. DROP SYSTEM VERSIONING from a table would be quite straightforward - the history table would be dropped along
withthe triggers that reference it, setting relissystemversioned = 'f' on the base table. 
>
> I think this would have some key advantages:
>
> 1. MVCC bloat is no worse than it was before.

The number of row versions stored in the database is the same for
both, just it would be split across two tables in this form.

> 2. No changes whatsoever to referential integrity.

The changes were fairly minor, but I see your thinking about indexes
as a simplification.

> 3. DROP SYSTEM VERSIONING becomes an O(1) operation.

It isn't top of mind to make this work well. The whole purpose of the
history is to keep it, not to be able to drop it quickly.


> Thoughts?

There are 3 implementation routes that I see, so let me explain so
that others can join the discussion.

1. Putting all data in one table. This makes DROP SYSTEM VERSIONING
effectively impossible. It requires access to the table to be
rewritten to add in historical quals for non-historical access and it
requires some push-ups around indexes. (The current patch adds the
historic quals by kludging the parser, which is wrong place, since it
doesn't work for joins etc.. However, given that issue, the rest seems
to follow on naturally).

2. Putting data in a side table. This makes DROP SYSTEM VERSIONING
fairly trivial, but it complicates many DDL commands (please make a
list?) and requires the optimizer to know about this and cater to it,
possibly complicating plans. Neither issue is insurmountable, but it
becomes more intrusive.

The current patch could go in either of the first 2 directions with
further work.

3. Let the Table Access Method handle it. I call this out separately
since it avoids making changes to the rest of Postgres, which might be
a good thing, with the right TAM implementation.

My preferred approach would be to do this "for free" in the table
access method, but we're a long way from this in terms of actual
implementation. When Corey  suggested earlier that we just put the
syntax in there, this was the direction I was thinking.

After waiting a day since I wrote the above, I think we should go with
(2) as Corey suggests, at least for now, and we can always add (3)
later.

--
Simon Riggs                http://www.EnterpriseDB.com/



Re: WIP: System Versioned Temporal Table

От
Hannu Krosing
Дата:
A side table has the nice additional benefit that we can very easily
version the *table structure* so when we ALTER TABLE and the table
structure changes we just make a new side table with now-currents
structure.

Also we may want different set of indexes on historic table(s) for
whatever reason

And we may even want to partition history tables for speed, storage
cost  or just to drop very ancient history

-----
Hannu Krosing
Google Cloud - We have a long list of planned contributions and we are hiring.
Contact me if interested.

On Sun, Sep 19, 2021 at 8:32 PM Simon Riggs
<simon.riggs@enterprisedb.com> wrote:
>
> On Sun, 19 Sept 2021 at 01:16, Corey Huinker <corey.huinker@gmail.com> wrote:
> >>
> >> 1. Much of what I have read about temporal tables seemed to imply or almost assume that system temporal tables
wouldbe implemented as two actual separate tables. Indeed, SQLServer appears to do it that way [1] with syntax like 
> >>
> >> WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.WebsiteUserInfoHistory));
> >>
> >>
> >> Q 1.1. Was that implementation considered and if so, what made this implementation more appealing?
> >>
> >
> > I've been digging some more on this point, and I've reached the conclusion that a separate history table is the
betterimplementation. It would make the act of removing system versioning into little more than a DROP TABLE, plus
adjustingthe base table to reflect that it is no longer system versioned. 
>
> Thanks for giving this a lot of thought. When you asked the question
> the first time you hadn't discussed how that might work, but now we
> have something to discuss.
>
> > 10. Queries that omit the FOR SYSTEM_TIME clause, as well as ones that use FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP,
wouldsimply use the base table directly with no quals to add. 
> > 11. Queries that use FOR SYSTEM_TIME and not FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP, then the query would do a
unionof the base table and the history table with quals applied to both. 
>
>
> > 14. DROP SYSTEM VERSIONING from a table would be quite straightforward - the history table would be dropped along
withthe triggers that reference it, setting relissystemversioned = 'f' on the base table. 
> >
> > I think this would have some key advantages:
> >
> > 1. MVCC bloat is no worse than it was before.
>
> The number of row versions stored in the database is the same for
> both, just it would be split across two tables in this form.
>
> > 2. No changes whatsoever to referential integrity.
>
> The changes were fairly minor, but I see your thinking about indexes
> as a simplification.
>
> > 3. DROP SYSTEM VERSIONING becomes an O(1) operation.
>
> It isn't top of mind to make this work well. The whole purpose of the
> history is to keep it, not to be able to drop it quickly.
>
>
> > Thoughts?
>
> There are 3 implementation routes that I see, so let me explain so
> that others can join the discussion.
>
> 1. Putting all data in one table. This makes DROP SYSTEM VERSIONING
> effectively impossible. It requires access to the table to be
> rewritten to add in historical quals for non-historical access and it
> requires some push-ups around indexes. (The current patch adds the
> historic quals by kludging the parser, which is wrong place, since it
> doesn't work for joins etc.. However, given that issue, the rest seems
> to follow on naturally).
>
> 2. Putting data in a side table. This makes DROP SYSTEM VERSIONING
> fairly trivial, but it complicates many DDL commands (please make a
> list?) and requires the optimizer to know about this and cater to it,
> possibly complicating plans. Neither issue is insurmountable, but it
> becomes more intrusive.
>
> The current patch could go in either of the first 2 directions with
> further work.
>
> 3. Let the Table Access Method handle it. I call this out separately
> since it avoids making changes to the rest of Postgres, which might be
> a good thing, with the right TAM implementation.
>
> My preferred approach would be to do this "for free" in the table
> access method, but we're a long way from this in terms of actual
> implementation. When Corey  suggested earlier that we just put the
> syntax in there, this was the direction I was thinking.
>
> After waiting a day since I wrote the above, I think we should go with
> (2) as Corey suggests, at least for now, and we can always add (3)
> later.
>
> --
> Simon Riggs                http://www.EnterpriseDB.com/
>
>



Re: WIP: System Versioned Temporal Table

От
Corey Huinker
Дата:
Thanks for giving this a lot of thought. When you asked the question
the first time you hadn't discussed how that might work, but now we
have something to discuss.

My ultimate goal is to unify this effort with the application period effort. Step 1 in that was to understand what each was doing and why they were doing it. If you check out the other thread, you'll see a highly similar message that I sent over there.
 
There are 3 implementation routes that I see, so let me explain so
that others can join the discussion.

1. Putting all data in one table. This makes DROP SYSTEM VERSIONING
effectively impossible. It requires access to the table to be
rewritten to add in historical quals for non-historical access and it
requires some push-ups around indexes. (The current patch adds the
historic quals by kludging the parser, which is wrong place, since it
doesn't work for joins etc.. However, given that issue, the rest seems
to follow on naturally).

2. Putting data in a side table. This makes DROP SYSTEM VERSIONING
fairly trivial, but it complicates many DDL commands (please make a
list?) and requires the optimizer to know about this and cater to it,
possibly complicating plans. Neither issue is insurmountable, but it
becomes more intrusive.

The current patch could go in either of the first 2 directions with
further work.

3. Let the Table Access Method handle it. I call this out separately
since it avoids making changes to the rest of Postgres, which might be
a good thing, with the right TAM implementation.

I'd like to hear more about this idea number 3.

I could see value in allowing the history table to be a foreign table, perhaps writing to csv/parquet/whatever files, and that sort of setup could be persuasive to a regulator who wants extra-double-secret-proof that auditing cannot be tampered with. But with that we'd have to give up the relkind idea, which itself was going to be a cheap way to prevent updates outside of the system triggers.


Re: WIP: System Versioned Temporal Table

От
Corey Huinker
Дата:
On Sun, Sep 19, 2021 at 3:12 PM Hannu Krosing <hannuk@google.com> wrote:
A side table has the nice additional benefit that we can very easily
version the *table structure* so when we ALTER TABLE and the table
structure changes we just make a new side table with now-currents
structure.

It's true that would allow for perfect capture of changes to the table structure, but how would you query the thing?

If a system versioned table was created with a column foo that is type float, and then we dropped that column, how would we ever query what the value of foo was in the past?

Would the columns returned from SELECT * change based on the timeframe requested?

If we then later added another column that happened to also be named foo but now was type JSONB, would we change the datatype returned based on the time period being queried? 

Is the change in structure a system versioning which itself must be captured?
 
Also we may want different set of indexes on historic table(s) for
whatever reason

+1
 

And we may even want to partition history tables for speed, storage
cost  or just to drop very ancient history

+1 

Re: WIP: System Versioned Temporal Table

От
Hannu Krosing
Дата:
On Mon, Sep 20, 2021 at 7:09 AM Corey Huinker <corey.huinker@gmail.com> wrote:
>
> On Sun, Sep 19, 2021 at 3:12 PM Hannu Krosing <hannuk@google.com> wrote:
>>
>> A side table has the nice additional benefit that we can very easily
>> version the *table structure* so when we ALTER TABLE and the table
>> structure changes we just make a new side table with now-currents
>> structure.
>
>
> It's true that would allow for perfect capture of changes to the table structure, but how would you query the thing?
>
> If a system versioned table was created with a column foo that is type float, and then we dropped that column, how
wouldwe ever query what the value of foo was in the past?
 


We can query that thing only in tables AS OF the time when the column
was still there.

We probably could get away with pretending the dropped columns to be
NULL in newer versions (and the versions before the column was added)

Even more tricky case would be changing the column data type.

>
> Would the columns returned from SELECT * change based on the timeframe requested?


If we want to emulate real table history, then it should.

But the * thing was not really specified well even for original
PostgreSQL inheritance.

Maybe we could do SELECT (* AS OF 'yesterday afternoon'::timestamp) FROM ... :)

> If we then later added another column that happened to also be named foo but now was type JSONB, would we change the
datatypereturned based on the time period being queried?
 

Many databases do allow returning multiple result sets, and actually
the PostgreSQL wire *protocol* also theoretically supports this, just
that it is not supported by any current client library.

With current libraries it would be possible to return a dynamic
version of  row_to_json(t.*) which changes based on actual historical
table structure

> Is the change in structure a system versioning which itself must be captured?

We do capture it (kind of) for logical decoding. That is, we decode
according to the structure in effect at the time of row creation,
though we currently miss the actual DDL itself.


So there is a lot to figure out and define, but at least storing the
history in a separate table gives a good foundation to build upon.



-----
Hannu Krosing
Google Cloud - We have a long list of planned contributions and we are hiring.
Contact me if interested.



Re: WIP: System Versioned Temporal Table

От
Daniel Gustafsson
Дата:
> On 19 Sep 2021, at 20:32, Simon Riggs <simon.riggs@enterprisedb.com> wrote:

> My preferred approach would be to do this "for free" in the table
> access method, but we're a long way from this in terms of actual
> implementation. When Corey  suggested earlier that we just put the
> syntax in there, this was the direction I was thinking.
> 
> After waiting a day since I wrote the above, I think we should go with
> (2) as Corey suggests, at least for now, and we can always add (3)
> later.

This patch no longer applies, are there plans on implementing the approaches
discussed above, or should we close this entry and open a new one when a
freshly baked pathed is ready?

--
Daniel Gustafsson        https://vmware.com/




Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 11/15/21 10:47 AM, Daniel Gustafsson wrote:
>> On 19 Sep 2021, at 20:32, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
> 
>> My preferred approach would be to do this "for free" in the table
>> access method, but we're a long way from this in terms of actual
>> implementation. When Corey  suggested earlier that we just put the
>> syntax in there, this was the direction I was thinking.
>>
>> After waiting a day since I wrote the above, I think we should go with
>> (2) as Corey suggests, at least for now, and we can always add (3)
>> later.
> 
> This patch no longer applies, are there plans on implementing the approaches
> discussed above, or should we close this entry and open a new one when a
> freshly baked pathed is ready?

I spent a lot of time a while ago trying to fix this patch (not just
rebase it), and I think it should just be rejected, unfortunately.

The design decisions are just too flawed, and it conflates system_time
periods with system versioning which is very wrong.
-- 
Vik Fearing



Re: WIP: System Versioned Temporal Table

От
Simon Riggs
Дата:
On Mon, 15 Nov 2021 at 09:47, Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 19 Sep 2021, at 20:32, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> > My preferred approach would be to do this "for free" in the table
> > access method, but we're a long way from this in terms of actual
> > implementation. When Corey  suggested earlier that we just put the
> > syntax in there, this was the direction I was thinking.
> >
> > After waiting a day since I wrote the above, I think we should go with
> > (2) as Corey suggests, at least for now, and we can always add (3)
> > later.
>
> This patch no longer applies, are there plans on implementing the approaches
> discussed above, or should we close this entry and open a new one when a
> freshly baked pathed is ready?

As I mentioned upthread, there are at least 2 different ways forward
(1) and (2), both of which need further work. I don't think that
additional work is impossible, but it is weeks of work, not days and
it needs to be done in collaboration with other thoughts on other
threads Corey refers to.

I have no plans on taking this patch further, but will give some help
to anyone that wishes to do that.

I suggest we Return with Feedback.

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: WIP: System Versioned Temporal Table

От
Daniel Gustafsson
Дата:
> On 15 Nov 2021, at 11:50, Simon Riggs <simon.riggs@enterprisedb.com> wrote:

> I have no plans on taking this patch further, but will give some help
> to anyone that wishes to do that.
> 
> I suggest we Return with Feedback.

Fair enough, done that way.

--
Daniel Gustafsson        https://vmware.com/




Re: WIP: System Versioned Temporal Table

От
Trevor Gross
Дата:
Chiming in as a user, not so much a developer - I've been using system
versioned tables in MariaDB for about half a year now, would just like
to add some feedback about what they did right and wrong and how PG
could learn from their mistakes & successes.

> 2. Putting data in a side table. This makes DROP SYSTEM VERSIONING
> fairly trivial, but it complicates many DDL commands (please make a
> list?) and requires the optimizer to know about this and cater to it,
> possibly complicating plans. Neither issue is insurmountable, but it
> becomes more intrusive.

I'd vouch for this being the way to go; you completely sidestep issues
like partitioning, unique constraints, optimization, etc. Especially
true when 90% of the time, SELECTs will only be looking at
currently-active data. MDB seems to have gone with the single-table
approach (unless you partition) and I've run into a bug where I can't
add a unique constraint because historical data fails.

#### System versioning & Application versioning
I saw that there is an intent to harmonize system versioning with
application versioning. Haven't read the AV thread so not positive if
that meant intending to split tables by application versioning and
system versioning both: to me it seems like maybe it would be good to
use a separate table for SV, but keep AV in the same table. Reasons
include:

- ISO states only one AV config per table, but there's no reason this
always has to be the case; maybe you're storing products that are
active for a period of time, EOL for a period of time, and obsolete
for a period of time. If ISO sometime decides >1 AV config is OK,
there would be a mess trying to split that into tables.
- DB users who are allowed to change AV items likely won't be allowed
to rewrite history by changing SV items. My proposed schema would keep
these separate.
- Table schemas change, and all (SV active) AV items would logically
need to fit the active schema or be updated to do so. Different story
for SV, nothing there should ever need to be changed.
- Partitioning for AV tables isn't as clear as with SV and is likely
better to be user-defined

Sorry for acronyms, SV=system versioning, AV=application versioning

In general, I think AV should be treated literally as extra rows in
the main DB, plus the extra PK element and shortcut functions. SV
though, needs to have a lot more nuance.

#### ALTER TABLE
On to ideas about how ALTER TABLE could work. I don't think the
question was ever answered "Do schema changes need to be tracked?" I'm
generally in favor of saying that it should be possible to recreate
the table exactly as it was, schema and all, at a specific period of
time (perhaps for a view) using a fancy combination of SELECT ... AS
and such - but it doesn't need to be straightforward. In any case, no
data should ever be deleted by ALTER TABLE. As someone pointed out
earlier, speed and storage space of ALTER TABLE are likely low
considerations for system versioned tables.

- ADD COLUMN easy, add the column to both the current and historical
table, all null in historical
- DROP COLUMN delete the column from the current table. Historical is
difficult, because what happens if a new column with the same name is
added? Maybe `DROP COLUMN col1` would rename col1 to _col1_1642929683
(epoch time) in the historical table or something like that.
- RENAME COLUMN is a bit tricky too - from a usability standpoint, the
historical table should be renamed as well. A quick thought is maybe
`RENAME col1 TO new_name` would perform the rename in the historical
table, but also create _col1_1642929683 as an alias to new_name to
track that there was a change. I don't think there would be any name
violations in the history table because there would never be a column
name in history that isn't in current (because of the rename described
with DROP).
- Changing column data type: ouch. This needs to be mainly planned for
cases where data types are incompatible, possibly optimized for times
when they are compatible. Seems like another _col1_1642929683 rename
would be in order, and a new col1 created with the new datatype, and a
historical SELECT would automatically merge the two. Possible
optimization: if the old type fits into the new type, just change the
data type in history and make _col1_1642929683 an alias to it.
- Change defaults, nullability, constraints, etc: I think these can
safely be done for the current table only. Realistically, historical
tables could probably skip all checks, always (except their tuple PK),
since trying to enforce them would just be opening the door to bugs.
Trying to think of any times this isn't true.
- FKs: I'm generally in the same boat as above, thinking that these
don't need to affect historical tables. Section 2.5 in the paper I
link below discusses period joins, but I don't think any special
behavior is needed for now. Perhaps references could be kept in
history but not enforced
- Changing PK / adding/removing more columns to PK: Annoying and not
easily dealt with. Maybe just disallow
- Triggers: no affect on historical
- DROP TABLE bye bye, history & all

Things like row level security add extra complication but can probably
be disregarded. Maybe just have a `select history` permission or
similar.

An interesting idea could be to automatically add system versioning to
information_schema whenever it is added to a table. This would provide
a way to easily query historical DDL. It would also help solve how to
keep historical FKs. This would make it possible to perfectly recreate
system versioned parts of your database at any period of time, schema
and data both.

#### Partitioning
Allowing for partitioning and automatic rotation seems like a good
idea, should be possible with current syntax but maybe worth adding
some shortcuts like maria has.

#### Permissions
- MDB has the new 'delete history' schema privilege that defines who
can delete historical data before a certain time or drop system
versioning, seems like a good idea to implement. They also require
`@@system_versioning_alter_history=keep;` to be set before doing
anything ALTER TABLE; doesn't do much outside of serving as a reminder
that changing system versioned tables can be dangerous.¯\_(ツ)_/¯
- This part sucks and goes against everything ISO is going for, but
IMO there needs to be a way to insert/update/delete historical data.
Maybe there needs to be a new superduperuser role to do it and you
need to type the table name backwards to verify you want to insert,
but situations like data migration, fixing incorrectly stored data, or
removing accidental sensitive information demand it. This isn't a
priority though, and basic system versioning can be shipped without
it.

#### Misc
- Seems like a good idea to include MDB's option to exclude columns
from versioning (`WITHOUT SYSTEM VERSIONING` as a column argument).
This is relatively nuanced and I'm not sure if it's officially part of
ISO, but probably helpful for frequently updating small data in rows
with BLOBs. Easy enough to implement, just forget the column in the
historical table.
- I thought I saw somewhere that somebody was discussing adding both
row_start and row_end to the PK. Why would this be? Row_end should be
all that's needed to keep unique, but maybe I misread.

#### Links
- I haven't seen it linked here yet but this paper does a phenomenal
deep dive into SV and AV
https://sigmodrecord.org/publications/sigmodRecord/1209/pdfs/07.industry.kulkarni.pdf
- It's not perfect, but MDB's system versioning is pretty well thought
out. You get a good idea of their thought process going through this
page, worth a read
https://mariadb.com/kb/en/system-versioned-tables/#excluding-columns-from-versioning

#### Finally, the end
There's a heck of a lot of thought that could go into this thing,
probably worth making sure there's a formal agreement on what to be
done before coding starts (PGEP for postgres enhancement proposal,
like PEP? Not sure if something like that exists but it probably
should.). Large parts of the existing patch could likely be reused for
whatever is decided.

Best,
Trevor


On Sun, Jan 23, 2022 at 2:47 AM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 15 Nov 2021, at 11:50, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> > I have no plans on taking this patch further, but will give some help
> > to anyone that wishes to do that.
> >
> > I suggest we Return with Feedback.
>
> Fair enough, done that way.
>
> --
> Daniel Gustafsson               https://vmware.com/
>
>
>
>
>



Re: WIP: System Versioned Temporal Table

От
Corey Huinker
Дата:

> 2. Putting data in a side table. This makes DROP SYSTEM VERSIONING
> fairly trivial, but it complicates many DDL commands (please make a
> list?) and requires the optimizer to know about this and cater to it,
> possibly complicating plans. Neither issue is insurmountable, but it
> becomes more intrusive.

I'd vouch for this being the way to go; you completely sidestep issues
like partitioning, unique constraints, optimization, etc. Especially
true when 90% of the time, SELECTs will only be looking at
currently-active data. MDB seems to have gone with the single-table
approach (unless you partition) and I've run into a bug where I can't
add a unique constraint because historical data fails.

#### System versioning & Application versioning
I saw that there is an intent to harmonize system versioning with
application versioning. Haven't read the AV thread so not positive if
that meant intending to split tables by application versioning and
system versioning both: to me it seems like maybe it would be good to
use a separate table for SV, but keep AV in the same table. Reasons
include:

The proposed AV uses just one table.
 
- ISO states only one AV config per table, but there's no reason this
always has to be the case; maybe you're storing products that are
active for a period of time, EOL for a period of time, and obsolete
for a period of time. If ISO sometime decides >1 AV config is OK,
there would be a mess trying to split that into tables.

The proposed AV (so far) allows for that.
 
- DB users who are allowed to change AV items likely won't be allowed
to rewrite history by changing SV items. My proposed schema would keep
these separate.
- Table schemas change, and all (SV active) AV items would logically
need to fit the active schema or be updated to do so. Different story
for SV, nothing there should ever need to be changed.

Yeah, there's a mess (which you state below) about what happens if you create a table and then rename a column, or drop a column and add a same-named column back of another type at a later date, etc. In theory, this means that the valid set of columns and their types changes according to the time range specified. I may not be remembering correctly, but Vik stated that the SQL spec seemed to imply that you had to track all those things.
 
- Partitioning for AV tables isn't as clear as with SV and is likely
better to be user-defined

So this was something I was asking various parties about at PgConf NYC just a few weeks ago. I am supposing that the main reason for SV is a regulatory concern, what tolerance to regulators have for the ability to manipulate the SV side-table? Is it possible to directly insert rows into one? If not, then moving rows into a new partition becomes impossible, and you'd be stuck with the partitioning strategy (if any) that you defined at SV creation time.

The feedback I got was "well, you're already a superuser, if a regulator had a problem with that then they would have required that the SV table's storage be outside the server, either a foreign table, a csv foreign data wrapper of some sort, or a trigger writing to a non-db storage (which wouldn't even need SV).

From that, I concluded that every single AV partition would have it's own SV table, which could in turn be partitioned. In a sense, it might be helpful to think of the SV tables as partitions of the main table, and the period definition would effectively be the constraint that prunes the SV partition.
 

Sorry for acronyms, SV=system versioning, AV=application versioning

In general, I think AV should be treated literally as extra rows in
the main DB, plus the extra PK element and shortcut functions. SV
though, needs to have a lot more nuance.

#### ALTER TABLE
On to ideas about how ALTER TABLE could work. I don't think the
question was ever answered "Do schema changes need to be tracked?" I'm
generally in favor of saying that it should be possible to recreate
the table exactly as it was, schema and all, at a specific period of
time (perhaps for a view) using a fancy combination of SELECT ... AS
and such - but it doesn't need to be straightforward. In any case, no
data should ever be deleted by ALTER TABLE. As someone pointed out
earlier, speed and storage space of ALTER TABLE are likely low
considerations for system versioned tables.

- ADD COLUMN easy, add the column to both the current and historical
table, all null in historical
- DROP COLUMN delete the column from the current table. Historical is
difficult, because what happens if a new column with the same name is
added? Maybe `DROP COLUMN col1` would rename col1 to _col1_1642929683
(epoch time) in the historical table or something like that.
- RENAME COLUMN is a bit tricky too - from a usability standpoint, the
historical table should be renamed as well. A quick thought is maybe
`RENAME col1 TO new_name` would perform the rename in the historical
table, but also create _col1_1642929683 as an alias to new_name to
track that there was a change. I don't think there would be any name
violations in the history table because there would never be a column
name in history that isn't in current (because of the rename described
with DROP).
- Changing column data type: ouch. This needs to be mainly planned for
cases where data types are incompatible, possibly optimized for times
when they are compatible. Seems like another _col1_1642929683 rename
would be in order, and a new col1 created with the new datatype, and a
historical SELECT would automatically merge the two. Possible
optimization: if the old type fits into the new type, just change the
data type in history and make _col1_1642929683 an alias to it.
- Change defaults, nullability, constraints, etc: I think these can
safely be done for the current table only. Realistically, historical
tables could probably skip all checks, always (except their tuple PK),
since trying to enforce them would just be opening the door to bugs.
Trying to think of any times this isn't true.
- FKs: I'm generally in the same boat as above, thinking that these
don't need to affect historical tables. Section 2.5 in the paper I
link below discusses period joins, but I don't think any special
behavior is needed for now. Perhaps references could be kept in
history but not enforced
- Changing PK / adding/removing more columns to PK: Annoying and not
easily dealt with. Maybe just disallow
- Triggers: no affect on historical
- DROP TABLE bye bye, history & all

You seem to have covered all the bases, and the only way I can think to sensibly track all of those things is to allow for multiple SV tables, and every time the main table is altered, you simply start fresh with a new, empty SV table. You'd probably also slap a constraint on the previous SV table to reflect the fact that no rows newer than X will ever be entered there, which would further aid constraint exclusion.
 
Things like row level security add extra complication but can probably
be disregarded. Maybe just have a `select history` permission or
similar.

+1 
 
An interesting idea could be to automatically add system versioning to
information_schema whenever it is added to a table. This would provide
a way to easily query historical DDL. It would also help solve how to
keep historical FKs. This would make it possible to perfectly recreate
system versioned parts of your database at any period of time, schema
and data both.

Interesting...
 
#### Misc
- Seems like a good idea to include MDB's option to exclude columns
from versioning (`WITHOUT SYSTEM VERSIONING` as a column argument).
This is relatively nuanced and I'm not sure if it's officially part of
ISO, but probably helpful for frequently updating small data in rows
with BLOBs. Easy enough to implement, just forget the column in the
historical table.

First I've heard of it. Others will know more.
 
- I thought I saw somewhere that somebody was discussing adding both
row_start and row_end to the PK. Why would this be? Row_end should be
all that's needed to keep unique, but maybe I misread.

I don't think they need to be part of the PK at all. The main table has the PK that it knows of, and the SV tables are indexed independently.

In fact, I don't think row_end needs to be an actual stored value in the main table, because it will never be anything other than null. How we represent such an attribute is another question, but the answer to that possibly ties into how we implement the virtual side of GENERATED ALWAYS AS...
 

#### Links
- I haven't seen it linked here yet but this paper does a phenomenal
deep dive into SV and AV
https://sigmodrecord.org/publications/sigmodRecord/1209/pdfs/07.industry.kulkarni.pdf

My link is different but this seems to be the same PDF that has been cited earlier for both SV and AV.
 
- It's not perfect, but MDB's system versioning is pretty well thought
out. You get a good idea of their thought process going through this
page, worth a read
https://mariadb.com/kb/en/system-versioned-tables/#excluding-columns-from-versioning

#### Finally, the end
There's a heck of a lot of thought that could go into this thing,
probably worth making sure there's a formal agreement on what to be
done before coding starts (PGEP for postgres enhancement proposal,
like PEP? Not sure if something like that exists but it probably
should.). Large parts of the existing patch could likely be reused for
whatever is decided.

Thanks for the input, it helps us get some momentum on this.

 

Re: WIP: System Versioned Temporal Table

От
Vik Fearing
Дата:
On 1/24/22 00:16, Corey Huinker wrote:
>> - Table schemas change, and all (SV active) AV items would logically
>> need to fit the active schema or be updated to do so. Different story
>> for SV, nothing there should ever need to be changed.
>>
> Yeah, there's a mess (which you state below) about what happens if you
> create a table and then rename a column, or drop a column and add a
> same-named column back of another type at a later date, etc. In theory,
> this means that the valid set of columns and their types changes according
> to the time range specified. I may not be remembering correctly, but Vik
> stated that the SQL spec seemed to imply that you had to track all those
> things.

The spec does not allow schema changes at all on a a system versioned 
table, except to change the system versioning itself.
-- 
Vik Fearing



Re: WIP: System Versioned Temporal Table

От
Jean Baro
Дата:
Would these best practices be applicable by PostgreSQL to help avoid breaking changes for temporal tables?


Thanks

On Tue, Feb 15, 2022 at 5:08 PM Vik Fearing <vik@postgresfriends.org> wrote:
On 1/24/22 00:16, Corey Huinker wrote:
>> - Table schemas change, and all (SV active) AV items would logically
>> need to fit the active schema or be updated to do so. Different story
>> for SV, nothing there should ever need to be changed.
>>
> Yeah, there's a mess (which you state below) about what happens if you
> create a table and then rename a column, or drop a column and add a
> same-named column back of another type at a later date, etc. In theory,
> this means that the valid set of columns and their types changes according
> to the time range specified. I may not be remembering correctly, but Vik
> stated that the SQL spec seemed to imply that you had to track all those
> things.

The spec does not allow schema changes at all on a a system versioned
table, except to change the system versioning itself.
--
Vik Fearing




Re: WIP: System Versioned Temporal Table

От
Corey Huinker
Дата:

The spec does not allow schema changes at all on a a system versioned
table, except to change the system versioning itself.


That would greatly simplify things!