Re: Speeding up creating UPDATE/DELETE generic plan for partitionedtable into a lot

Поиск
Список
Период
Сортировка
От David Rowley
Тема Re: Speeding up creating UPDATE/DELETE generic plan for partitionedtable into a lot
Дата
Msg-id CAKJS1f-y1HQK+VjG7=C==vGcLnzxjN8ysD5NmaN8Wh4=VsYipw@mail.gmail.com
обсуждение исходный текст
Ответ на RE: Speeding up creating UPDATE/DELETE generic plan for partitionedtable into a lot  ("Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com>)
Список pgsql-hackers
On Fri, 28 Dec 2018 at 20:36, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:
> Although I may say the same thing as you, I think a natural idea would be to create a generic plan gradually.  The
startingsimple question is "why do we have to touch all partitions at first?"  That is, can we behave like this:
 
>
> * PREPARE just creates an aggregation plan node (e.g. Append for SELECT, Update for UPDATE).  It doesn't create any
planfor particular partitions.  Say, call this a parent generic plan node.
 
> * EXECUTE creates a generic plan for specific partitions if they don't exist yet, and attach them to the parent
genericplan node.
 

I imagine the place to start looking would be around why planning is
so slow for that many partitions. There are many inefficient data
structures used in the planner that perform linear searches over
things like equivalence classes. Perhaps some profiling would
highlight just where the problems lie. Tom recently re-posted a query
[1] which involved a large number of joins which was taking about 14
seconds to plan on my laptop. After writing some test code to allow
faster lookups of equivalence classes matching a set of relations I
got this down to about 2.4 seconds [2]. Perhaps this also helps the
partitioned table case a little too.

Another possible interesting idea would be to, instead of creating
large Append/MergeAppend plans for partition scanning, invent some
"Partition Seq Scan" and "Partition Index Scan" nodes that are able to
build plans more similar to scanning a normal table. Likely such nodes
would need to be programmed with a list of Oids that they're to scan
during their execution. They'd also need to take care of their own
tuple mapping for when partitions had their columns in varying orders.
At first thought, such a design does not seem so unrealistic, if so,
it would likely solve the generic plan problem you describe
completely.

[1] https://www.postgresql.org/message-id/6970.1545327857%40sss.pgh.pa.us
[2] https://www.postgresql.org/message-id/CAKJS1f_BQvjetGKsjT65gLAVWXQyRYRJpuXE2eBKrE0o0EcWwA@mail.gmail.com

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Clean up some elog messages and comments for do_pg_stop_backupand do_pg_start_backup
Следующее
От: David Rowley
Дата:
Сообщение: Re: pg_dump multi VALUES INSERT