Re: Memory consumed by paths during partitionwise join planning

Поиск
Список
Период
Сортировка
От Andrei Lepikhov
Тема Re: Memory consumed by paths during partitionwise join planning
Дата
Msg-id bd68c396-5e17-4254-a2c3-992ab8e912cc@postgrespro.ru
обсуждение исходный текст
Ответ на Re: Memory consumed by paths during partitionwise join planning  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Ответы Re: Memory consumed by paths during partitionwise join planning  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Список pgsql-hackers
On 19/2/2024 19:25, Ashutosh Bapat wrote:
> On Fri, Feb 16, 2024 at 8:42 AM Andrei Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> Live example: right now, I am working on the code like MSSQL has - a
>> combination of NestLoop and HashJoin paths and switching between them in
>> real-time. It requires both paths in the path list at the moment when
>> extensions are coming. Even if one of them isn't referenced from the
>> upper pathlist, it may still be helpful for the extension.
> 
> There is no guarantee that every path presented to add_path will be
> preserved. Suboptimal paths are freed as and when add_path discovers
> that they are suboptimal. So I don't think an extension can rely on
> existence of a path. But having a refcount makes it easy to preserve
> the required paths by referencing them.
I don't insist, just provide my use case. It would be ideal if you would 
provide some external routines for extensions that allow for sticking 
the path in pathlist even when it has terrible cost estimation.
> 
>>
>>>> About partitioning. As I discovered planning issues connected to
>>>> partitions, the painful problem is a rule, according to which we are
>>>> trying to use all nomenclature of possible paths for each partition.
>>>> With indexes, it quickly increases optimization work. IMO, this can help
>>>> a 'symmetrical' approach, which could restrict the scope of possible
>>>> pathways for upcoming partitions if we filter some paths in a set of
>>>> previously planned partitions.
>>>
>>> filter or free?
>> Filter.
>> I meant that Postres tries to apply IndexScan, BitmapScan,
>> IndexOnlyScan, and other strategies, passing throughout the partition
>> indexes. The optimizer spends a lot of time doing that. So, why not
>> introduce a symmetrical strategy and give away from the search some
>> indexes of types of scan based on the pathifying experience of previous
>> partitions of the same table: if you have dozens of partitions, Is it
>> beneficial for the system to find a bit more optimal IndexScan on one
>> partition having SeqScans on 999 other?
>>
> IIUC, you are suggesting that instead of planning each
> partition/partitionwise join, we only create paths with the strategies
> which were found to be optimal with previous partitions. That's a good
> heuristic but it won't work if partition properties - statistics,
> indexes etc. differ between groups of partitions.
Sure, but the "Symmetry" strategy assumes that on the scope of a 
thousand partitions, especially with parallel append involved, it 
doesn't cause sensible performance degradation if we find a bit 
suboptimal path in a small subset of partitions. Does it make sense?
As I see, when people use 10-100 partitions for the table, they usually 
strive to keep indexes symmetrical for all partitions.

-- 
regards,
Andrei Lepikhov
Postgres Professional




В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Support boolcol IS [NOT] iso-8859-1 in partition pruning
Следующее
От: Masahiko Sawada
Дата:
Сообщение: Re: Synchronizing slots from primary to standby