WIP: Deferrable unique constraints

From: Dean Rasheed
Subject: WIP: Deferrable unique constraints
Msg-id: 8e2dbb700907071138y4ebe75cw81879aa513cf82a3@mail.gmail.com
Responses: Re: WIP: Deferrable unique constraints  (Jeff Davis <pgsql@j-davis.com>)
Re: WIP: Deferrable unique constraints  (Jeff Davis <pgsql@j-davis.com>)
Re: WIP: Deferrable unique constraints  (Dean Rasheed <dean.a.rasheed@googlemail.com>)
Re: WIP: Deferrable unique constraints  (Jeff Davis <pgsql@j-davis.com>)
List: pgsql-hackers
This is a feature I would find useful, and I wanted to learn more
about the database internals, so I thought I would give it a go as an
exercise.

I added two new index insert modes to support deferrable unique
constraints. During the initial index insert, if the constraint is
deferrable, it does a "partial" check of uniqueness. This is a
non-blocking check which may let duplicates in, but which can
distinguish between definitely unique and potentially non-unique. For
potential uniqueness violations, a deferred trigger is queued to do a
full check at the end of the statement or transaction, or when SET
CONSTRAINTS is called. The trigger then replays the insert in a "fake
insert" mode, which doesn't actually insert anything - it just checks
that what is already there is unique, waiting for other transactions
if necessary.
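
To illustrate the two-phase idea, here is a self-contained toy model
(this is not the patch itself; the "index" is a plain array,
visibility and locking are ignored completely, and all names are
invented for illustration):

/* Toy model of the two-phase uniqueness check described above. */
#include <stdio.h>
#include <stdlib.h>

#define MAX_ENTRIES 100

static int index_keys[MAX_ENTRIES];     /* the "unique index" */
static int n_keys = 0;

static int recheck_queue[MAX_ENTRIES];  /* the deferred-trigger queue */
static int n_queued = 0;

/* Phase 1: initial insert with the "partial" uniqueness check.  It is
 * non-blocking and may let duplicates in; it only classifies the key
 * as definitely unique or potentially non-unique. */
static void
insert_partial_check(int key)
{
    int maybe_dup = 0;

    for (int i = 0; i < n_keys; i++)
        if (index_keys[i] == key)
            maybe_dup = 1;

    index_keys[n_keys++] = key;          /* insert unconditionally */

    if (maybe_dup)
        recheck_queue[n_queued++] = key; /* queue a deferred recheck */
}

/* Phase 2: the deferred trigger replays each flagged insert in "fake
 * insert" mode.  Nothing is inserted; we only verify that what is
 * already in the index is unique.  (The real patch would also wait
 * here for other transactions, if necessary.) */
static void
deferred_full_check(void)
{
    for (int q = 0; q < n_queued; q++)
    {
        int count = 0;

        for (int i = 0; i < n_keys; i++)
            if (index_keys[i] == recheck_queue[q])
                count++;
        if (count > 1)
        {
            fprintf(stderr, "duplicate key value: %d\n", recheck_queue[q]);
            exit(1);
        }
    }
    n_queued = 0;
}

int
main(void)
{
    insert_partial_check(1);
    insert_partial_check(2);
    insert_partial_check(1);    /* flagged as potentially non-unique */
    deferred_full_check();      /* raises the violation at "commit" */
    return 0;
}

The key property is that phase 1 never waits and never errors, so the
statement itself runs at full speed; all the blocking and error
reporting is concentrated on the (hopefully few) flagged keys.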

This approach works well if the number of potential conflicts is
small. For example, if uniqueness is not actually violated, there is
no significant penalty for the deferred check, since no triggers are
queued:

pgdevel=# CREATE TABLE foo(a int UNIQUE DEFERRABLE INITIALLY DEFERRED);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "foo_a_key"
for table "foo"
CREATE TABLE
Time: 79.131 ms
pgdevel=# INSERT INTO foo (SELECT * FROM generate_series(0, 1999999, 2)); -- 1M rows
INSERT 0 1000000
Time: 11403.906 ms
pgdevel=# BEGIN;
BEGIN
Time: 0.145 ms
pgdevel=# UPDATE foo SET a=a+1; -- Uniqueness never violated
UPDATE 1000000
Time: 21258.245 ms
pgdevel=# COMMIT;
COMMIT
Time: 83.267 ms

compared with:

pgdevel=# CREATE TABLE foo(a int UNIQUE);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "foo_a_key"
for table "foo"
CREATE TABLE
Time: 110.277 ms
pgdevel=# INSERT INTO foo (SELECT * FROM generate_series(0, 1999999, 2)); -- 1M rows
INSERT 0 1000000
Time: 11196.600 ms
pgdevel=# BEGIN;
BEGIN
Time: 0.146 ms
pgdevel=# UPDATE foo SET a=a+1; -- Uniqueness never violated
UPDATE 1000000
Time: 21054.430 ms
pgdevel=# COMMIT;
COMMIT
Time: 186.174 ms

However, if there are lots of temporarily non-unique values, the
performance impact is much greater, since the unique check is repeated
for each row at commit time:

pgdevel=# CREATE TABLE foo(a int UNIQUE DEFERRABLE INITIALLY DEFERRED);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "foo_a_key"
for table "foo"
CREATE TABLE
Time: 64.244 ms
pgdevel=# INSERT INTO foo (SELECT * FROM generate_series(0, 1999999, 2)); -- 1M rows
INSERT 0 1000000
Time: 11621.438 ms
pgdevel=# BEGIN;
BEGIN
Time: 0.146 ms
pgdevel=# UPDATE foo SET a=a+2; -- Uniqueness temporarily violated
UPDATE 1000000
Time: 21807.128 ms
pgdevel=# COMMIT;
COMMIT
Time: 14974.916 ms

And similarly for an "immediate" check:

pgdevel=# CREATE TABLE foo(a int UNIQUE DEFERRABLE INITIALLY IMMEDIATE);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "foo_a_key"
for table "foo"
CREATE TABLE
Time: 44.477 ms
pgdevel=# INSERT INTO foo (SELECT * FROM generate_series(0, 1999999, 2)); -- 1M rows
INSERT 0 1000000
Time: 12083.089 ms
pgdevel=# UPDATE foo SET a=a+2;
UPDATE 1000000
Time: 37173.210 ms

Also, as with FKs and other queued triggers, this doesn't scale,
because the trigger queue is held in memory.

Curing the scalability problem by spooling the queue to disk shouldn't
be too hard to do, but that doesn't address the other problem: if a
significant proportion of the rows in the table need to be checked, it
is far quicker to scan the whole index once than to check row by row.

In fact, with this data on my machine, a single index scan takes
around 0.5 sec, while the row-by-row recheck above cost about 15 sec
for 1M rows (roughly 15 microseconds per row). On that basis a full
scan would be quicker once more than around 33,000 rows - about 3% of
the table - are flagged as potentially non-unique, although the
planner might be a better judge of that.

I can't see a way of coding something that can switch over to a full
scan without first recording all the potential violations, and even
then I'm not entirely sure how best to do the switch-over logic.
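
Continuing the toy model above, one possible shape for that
switch-over might look like this (the 3% threshold comes from the
timings above and is machine-specific; everything else is again just
an invented sketch, not a worked-out design):

/* Decide at "commit" time between the row-by-row recheck and one full
 * scan of the sorted index. */
#define FULL_SCAN_FRACTION 0.03

static int
cmp_keys(const void *a, const void *b)
{
    return (*(const int *) a > *(const int *) b) -
           (*(const int *) a < *(const int *) b);
}

static void
deferred_check_with_switchover(void)
{
    if (n_queued > (int) (FULL_SCAN_FRACTION * n_keys))
    {
        /* Full scan: a btree is already ordered, so the toy just sorts
         * and then looks for adjacent equal keys in a single pass. */
        qsort(index_keys, n_keys, sizeof(int), cmp_keys);
        for (int i = 1; i < n_keys; i++)
            if (index_keys[i] == index_keys[i - 1])
            {
                fprintf(stderr, "duplicate key value: %d\n",
                        index_keys[i]);
                exit(1);
            }
        n_queued = 0;
    }
    else
        deferred_full_check();  /* small queue: recheck row by row */
}

In a real implementation the full-scan path would presumably look more
like the duplicate detection that CREATE UNIQUE INDEX already does
while sorting, and, as noted above, the planner may be better placed
to pick the threshold than a hard-coded constant.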

- Dean
