Delay locking partitions during query execution

Поиск
Список
Период
Сортировка
От David Rowley
Тема Delay locking partitions during query execution
Дата
Msg-id CAKJS1f_kfRQ3ZpjQyHC7=PK9vrhxiHBQFZ+hc0JCwwnRKkF3hg@mail.gmail.com
обсуждение исходный текст
Ответы Re: Delay locking partitions during query execution  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Re: Delay locking partitions during query execution  (David Rowley <david.rowley@2ndquadrant.com>)
Re: Delay locking partitions during query execution  (David Rowley <david.rowley@2ndquadrant.com>)
Список pgsql-hackers
Over on [1] I'm proposing to delay locking partitions of a partitioned
table that's the target of an INSERT or UPDATE command until we first
route a tuple to the partition.   Currently, we go and lock all
partitions, even if we just insert a single tuple to a single
partition. The patch in [1] improves the situation when there are many
partitions and only a few tuples to route to just to a few partitions.

Over here and along similar lines to the above, but this time I'd like
to take this even further and change things so we don't lock *any*
partitions during AcquireExecutorLocks() and instead just lock them
when we first access them with ExecGetRangeTableRelation().  This
improves the situation when many partitions get run-time pruned, as
we'll never bother locking those at all since we'll never call
ExecGetRangeTableRelation() on them. We'll of course still lock the
partitioned table so that plan invalidation works correctly.

This does make the locking order less well defined, but I'm already
proposing similar in [1] and over there I've mentioned that I can't
quite see any huge issues with doing that.  We already don't lock all
partitions inside AcquireExecutorLocks() during INSERT with VALUES
anyway.

The attached patch implements this delayed locking.  A quick benchmark
shows a pretty good performance improvement when there are a large
number of partitions and run-time pruning prunes almost all of them.

Setup:
create table hashp (a int) partition by hash(a);
select 'create table hashp'|| x::text || ' partition of hashp for
values with (modulus 10000, remainder ' || x ::text || ');' from
generate_Series(0,9999) x;
\gexec
insert into hashp select generate_Series(1,1000000)

bench.sql:
\set p_a 13315
select * from hashp where a = :p_a;

Master: 10000 parts

$ pgbench -n -f bench.sql -M prepared -T 60 postgres
tps = 108.882749 (excluding connections establishing)
tps = 108.245437 (excluding connections establishing)

delaylock: 10000 parts

$ pgbench -n -f bench.sql -M prepared -T 60 postgres
tps = 1068.289505 (excluding connections establishing)
tps = 1092.797015 (excluding connections establishing)

More could be done to make this quite a bit faster again, but that
mostly requires the range table coming directly from the planner as an
array and marking which array elements require locking with a
Bitmapset. This'll save having to loop over the entire large array
that mostly does not need anything locked. Similar can be done for
ExecCheckRTPerms(), but that's also for another day. With those
changes and some further tweaks done on the Append/MergeAppend code,
tps is about 22k on my machine, which is just slightly slower than
with an equivalent non-partitioned table.

I'll add the attached patch to the January commitfest

[1] https://www.postgresql.org/message-id/CAKJS1f-%3DFnMqmQP6qitkD%2BxEddxw22ySLP-0xFk3JAqUX2yfMw%40mail.gmail.com

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dave Cramer
Дата:
Сообщение: Re: Reviving the "Stopping logical replication protocol" patch fromVladimir Gordichuk
Следующее
От: Ildus Kurbangaliev
Дата:
Сообщение: Re: [HACKERS] Custom compression methods