WIP: Deferrable unique constraints

From: Dean Rasheed
Subject: WIP: Deferrable unique constraints
Msg-id: 8e2dbb700907071138y4ebe75cw81879aa513cf82a3@mail.gmail.com
Responses: Re: WIP: Deferrable unique constraints  (Jeff Davis <pgsql@j-davis.com>)
Re: WIP: Deferrable unique constraints  (Jeff Davis <pgsql@j-davis.com>)
Re: WIP: Deferrable unique constraints  (Dean Rasheed <dean.a.rasheed@googlemail.com>)
Re: WIP: Deferrable unique constraints  (Jeff Davis <pgsql@j-davis.com>)
List: pgsql-hackers
This is a feature I would find useful, and I wanted to learn more
about the database internals, so I thought I would give it a go as an
exercise.

I added two new index insert modes to support deferrable unique
constraints. During the initial index insert, if the constraint is
deferrable, it does a "partial" check of uniqueness. This is a
non-blocking check which may let duplicates in, but which can
distinguish between definitely unique and potentially non-unique. For
potential uniqueness violations, a deferred trigger is queued to do a
full check at the end of the statement or transaction, or when SET
CONSTRAINTS is called. The trigger then replays the insert in a "fake
insert" mode, which doesn't actually insert anything - it just checks
that what is already there is unique, waiting for other transactions
if necessary.
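
To illustrate the two-phase idea, here is a self-contained toy model
(this is not the patch itself; the "index" is a plain array,
visibility and locking are ignored completely, and all names are
invented for illustration):

/* Toy model of the two-phase uniqueness check described above. */
#include <stdio.h>
#include <stdlib.h>

#define MAX_ENTRIES 100

static int index_keys[MAX_ENTRIES];     /* the "unique index" */
static int n_keys = 0;

static int recheck_queue[MAX_ENTRIES];  /* the deferred-trigger queue */
static int n_queued = 0;

/* Phase 1: initial insert with the "partial" uniqueness check.  It is
 * non-blocking and may let duplicates in; it only classifies the key
 * as definitely unique or potentially non-unique. */
static void
insert_partial_check(int key)
{
    int maybe_dup = 0;

    for (int i = 0; i < n_keys; i++)
        if (index_keys[i] == key)
            maybe_dup = 1;

    index_keys[n_keys++] = key;          /* insert unconditionally */

    if (maybe_dup)
        recheck_queue[n_queued++] = key; /* queue a deferred recheck */
}

/* Phase 2: the deferred trigger replays each flagged insert in "fake
 * insert" mode.  Nothing is inserted; we only verify that what is
 * already in the index is unique.  (The real patch would also wait
 * here for other transactions, if necessary.) */
static void
deferred_full_check(void)
{
    for (int q = 0; q < n_queued; q++)
    {
        int count = 0;

        for (int i = 0; i < n_keys; i++)
            if (index_keys[i] == recheck_queue[q])
                count++;
        if (count > 1)
        {
            fprintf(stderr, "duplicate key value: %d\n", recheck_queue[q]);
            exit(1);
        }
    }
    n_queued = 0;
}

int
main(void)
{
    insert_partial_check(1);
    insert_partial_check(2);
    insert_partial_check(1);    /* flagged as potentially non-unique */
    deferred_full_check();      /* raises the violation at "commit" */
    return 0;
}

The key property is that phase 1 never waits and never errors, so the
statement itself runs at full speed; all the blocking and error
reporting is concentrated on the (hopefully few) flagged keys.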

This approach works well if the number of potential conflicts is
small. For example, if uniqueness is not actually violated, there is
no significant penalty for the deferred check, since no triggers are
queued:

pgdevel=# CREATE TABLE foo(a int UNIQUE DEFERRABLE INITIALLY DEFERRED);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "foo_a_key"
for table "foo"
CREATE TABLE
Time: 79.131 ms
pgdevel=# INSERT INTO foo (SELECT * FROM generate_series(0, 1999999, 2)); -- 1M rows
INSERT 0 1000000
Time: 11403.906 ms
pgdevel=# BEGIN;
BEGIN
Time: 0.145 ms
pgdevel=# UPDATE foo SET a=a+1; -- Uniqueness never violated
UPDATE 1000000
Time: 21258.245 ms
pgdevel=# COMMIT;
COMMIT
Time: 83.267 ms

compared with:

pgdevel=# CREATE TABLE foo(a int UNIQUE);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "foo_a_key"
for table "foo"
CREATE TABLE
Time: 110.277 ms
pgdevel=# INSERT INTO foo (SELECT * FROM generate_series(0, 1999999, 2)); -- 1M rows
INSERT 0 1000000
Time: 11196.600 ms
pgdevel=# BEGIN;
BEGIN
Time: 0.146 ms
pgdevel=# UPDATE foo SET a=a+1; -- Uniqueness never violated
UPDATE 1000000
Time: 21054.430 ms
pgdevel=# COMMIT;
COMMIT
Time: 186.174 ms

However, if there are lots of temporarily non-unique values, the
performance impact is much greater, since the unique check is repeated
for each row at commit time:

pgdevel=# CREATE TABLE foo(a int UNIQUE DEFERRABLE INITIALLY DEFERRED);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "foo_a_key"
for table "foo"
CREATE TABLE
Time: 64.244 ms
pgdevel=# INSERT INTO foo (SELECT * FROM generate_series(0, 1999999, 2)); -- 1M rows
INSERT 0 1000000
Time: 11621.438 ms
pgdevel=# BEGIN;
BEGIN
Time: 0.146 ms
pgdevel=# UPDATE foo SET a=a+2; -- Uniqueness temporarily violated
UPDATE 1000000
Time: 21807.128 ms
pgdevel=# COMMIT;
COMMIT
Time: 14974.916 ms

And similarly for an "immediate" check:

pgdevel=# CREATE TABLE foo(a int UNIQUE DEFERRABLE INITIALLY IMMEDIATE);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "foo_a_key"
for table "foo"
CREATE TABLE
Time: 44.477 ms
pgdevel=# INSERT INTO foo (SELECT * FROM generate_series(0, 1999999, 2)); -- 1M rows
INSERT 0 1000000
Time: 12083.089 ms
pgdevel=# UPDATE foo SET a=a+2;
UPDATE 1000000
Time: 37173.210 ms

Also, as with FKs and other queued triggers, this doesn't scale,
because the trigger queue is held in memory.

Curing the scalability problem by spooling the queue to disk shouldn't
be too hard to do, but that doesn't address the other problem: if a
significant proportion of the rows in the table need to be checked, it
is far quicker to scan the whole index once than to check row by row.

In fact, with this data on my machine, a single index scan takes
around 0.5 sec, while the row-by-row recheck above cost about 15 sec
for 1M rows (roughly 15 microseconds per row). On that basis a full
scan would be quicker once more than around 33,000 rows - about 3% of
the table - are flagged as potentially non-unique, although the
planner might be a better judge of that.

I can't see a way of coding something that can switch over to a full
scan without first recording all the potential violations, and even
then I'm not entirely sure how best to do the switch-over logic.
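
Continuing the toy model above, one possible shape for that
switch-over might look like this (the 3% threshold comes from the
timings above and is machine-specific; everything else is again just
an invented sketch, not a worked-out design):

/* Decide at "commit" time between the row-by-row recheck and one full
 * scan of the sorted index. */
#define FULL_SCAN_FRACTION 0.03

static int
cmp_keys(const void *a, const void *b)
{
    return (*(const int *) a > *(const int *) b) -
           (*(const int *) a < *(const int *) b);
}

static void
deferred_check_with_switchover(void)
{
    if (n_queued > (int) (FULL_SCAN_FRACTION * n_keys))
    {
        /* Full scan: a btree is already ordered, so the toy just sorts
         * and then looks for adjacent equal keys in a single pass. */
        qsort(index_keys, n_keys, sizeof(int), cmp_keys);
        for (int i = 1; i < n_keys; i++)
            if (index_keys[i] == index_keys[i - 1])
            {
                fprintf(stderr, "duplicate key value: %d\n",
                        index_keys[i]);
                exit(1);
            }
        n_queued = 0;
    }
    else
        deferred_full_check();  /* small queue: recheck row by row */
}

In a real implementation the full-scan path would presumably look more
like the duplicate detection that CREATE UNIQUE INDEX already does
while sorting, and, as noted above, the planner may be better placed
to pick the threshold than a hard-coded constant.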

- Dean
