Optimising Foreign Key checks

From: Simon Riggs
Subject: Optimising Foreign Key checks
Msg-id: CA+U5nMLM1DaHBC6JXtUMfcG6f7FgV5mPSpufO7GRnbFKkF2f7g@mail.gmail.com
Responses: Re: Optimising Foreign Key checks  (Noah Misch <noah@leadboat.com>)
           Re: Optimising Foreign Key checks  (Greg Stark <stark@mit.edu>)
           Re: Optimising Foreign Key checks  (Andres Freund <andres@2ndquadrant.com>)
List: pgsql-hackers
FK checks can be expensive, especially when loading large volumes of
data into an existing table or partition. A couple of ideas for
improving performance are discussed here:

1. Use Case: Bulk loading
COPY pgbench_accounts FROM stdin;
-- references pgbench_branches with many repeated values

Proposal: Transactions that need multiple checks against the same
referenced table can be optimised by simply LOCKing that table, once,
and then holding its keys in a hash table, much as we do with a Hash
Join (it's almost exactly the same mechanism). This works in two ways:
it speeds up the checks and it also reduces the locking overhead.

This would require the user's explicit permission, given via a new
storage parameter set on the referenced table:
 WITH (foreign_key_lock_level = row | table)

Setting this to "table" would lock out changes on that table, so it
would only suit read-mostly tables. But small, static "reference
tables" are exactly the tables most frequently targeted by FKs, so the
optimisation is appropriate in probably the majority of cases.

2. Use Case: Transactional repetition
BEGIN;
INSERT INTO order VALUES (ordid, ...);
INSERT INTO order_line VALUES (ordid, 1, ...);
INSERT INTO order_line VALUES (ordid, 2, ...);
INSERT INTO order_line VALUES (ordid, 3, ...);
INSERT INTO order_line VALUES (ordid, 4, ...);
...
COMMIT;
The inserts into order_line repeatedly execute checks against the same
ordid. Deferring and then de-duplicating the checks would optimise the
transaction.
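
Note that the deferral half is already expressible in standard SQL; a
sketch against the order/order_line schema above (constraint name and
integer ordid key are assumptions). This still runs one check per
inserted row at COMMIT; de-duplicating those queued checks is the new
part:

ALTER TABLE order_line
  ADD CONSTRAINT order_line_ordid_fkey
  FOREIGN KEY (ordid) REFERENCES "order" (ordid)  -- "order" needs quoting
  DEFERRABLE INITIALLY DEFERRED;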

Proposal: De-duplicate multiple checks against the same value. This
would be implemented by keeping a hash of the referenced keys we have
already inserted and/or locked as the transaction progresses, and
consulting that hash to avoid queuing redundant after triggers.

We could also use this technique to de-duplicate checks within a
single statement.
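
A minimal sketch of the single-statement case, assuming order_line has
columns (ordid, line_no), names illustrative only. All four rows carry
the same ordid, so one lookup could satisfy every check:

INSERT INTO order_line (ordid, line_no)
SELECT 1, n                        -- same referenced ordid every time
FROM generate_series(1, 4) AS n;   -- four rows, one FK value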

In both cases we build up a local hash table of referenced values and
then use it to avoid queuing constraint triggers, so the code would be
similar for both.

Thoughts?

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


