generic plans and "initial" pruning

Поиск
Список
Период
Сортировка
От Amit Langote
Тема generic plans and "initial" pruning
Дата
Msg-id CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
обсуждение исходный текст
Ответы Re: generic plans and "initial" pruning  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Re: generic plans and "initial" pruning  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
Executing generic plans involving partitions is known to become slower
as partition count grows due to a number of bottlenecks, with
AcquireExecutorLocks() showing at the top in profiles.

Previous attempt at solving that problem was by David Rowley [1],
where he proposed delaying locking of *all* partitions appearing under
an Append/MergeAppend until "initial" pruning is done during the
executor initialization phase.  A problem with that approach that he
has described in [2] is that leaving partitions unlocked can lead to
race conditions where the Plan node belonging to a partition can be
invalidated when a concurrent session successfully alters the
partition between AcquireExecutorLocks() saying the plan is okay to
execute and then actually executing it.

However, using an idea that Robert suggested to me off-list a little
while back, it seems possible to determine the set of partitions that
we can safely skip locking.  The idea is to look at the "initial" or
"pre-execution" pruning instructions contained in a given Append or
MergeAppend node when AcquireExecutorLocks() is collecting the
relations to lock and consider relations from only those sub-nodes
that survive performing those instructions.   I've attempted
implementing that idea in the attached patch.

Note that "initial" pruning steps are now performed twice when
executing generic plans: once in AcquireExecutorLocks() to find
partitions to be locked, and a 2nd time in ExecInit[Merge]Append() to
determine the set of partition sub-nodes to be initialized for
execution, though I wasn't able to come up with a good idea to avoid
this duplication.

Using the following benchmark setup:

pgbench testdb -i --partitions=$nparts > /dev/null 2>&1
pgbench -n testdb -S -T 30 -Mprepared

And plan_cache_mode = force_generic_plan,

I get following numbers:

HEAD:

32      tps = 20561.776403 (without initial connection time)
64      tps = 12553.131423 (without initial connection time)
128     tps = 13330.365696 (without initial connection time)
256     tps = 8605.723120 (without initial connection time)
512     tps = 4435.951139 (without initial connection time)
1024    tps = 2346.902973 (without initial connection time)
2048    tps = 1334.680971 (without initial connection time)

Patched:

32      tps = 27554.156077 (without initial connection time)
64      tps = 27531.161310 (without initial connection time)
128     tps = 27138.305677 (without initial connection time)
256     tps = 25825.467724 (without initial connection time)
512     tps = 19864.386305 (without initial connection time)
1024    tps = 18742.668944 (without initial connection time)
2048    tps = 16312.412704 (without initial connection time)

-- 
Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/CAKJS1f_kfRQ3ZpjQyHC7=PK9vrhxiHBQFZ+hc0JCwwnRKkF3hg@mail.gmail.com

[2] https://www.postgresql.org/message-id/CAKJS1f99JNe%2Bsw5E3qWmS%2BHeLMFaAhehKO67J1Ym3pXv0XBsxw%40mail.gmail.com

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Proposal: sslmode=tls-only
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: row filtering for logical replication