Re: Performance issues with parallelism and LIMIT

From	Tomas Vondra
Subject	Re: Performance issues with parallelism and LIMIT
Date	18 November 22:37:40
Msg-id	09ca028e-2b50-42af-baed-6d582252f359@vondra.me
In reply to	Re: Performance issues with parallelism and LIMIT (David Geier <geidav.pg@gmail.com>)
Responses	Re: Performance issues with parallelism and LIMIT
List	pgsql-hackers

On 11/18/25 19:35, David Geier wrote:
> 
> On 18.11.2025 18:31, Tomas Vondra wrote:
>> On 11/18/25 17:51, Tom Lane wrote:
>>> David Geier <geidav.pg@gmail.com> writes:
>>>> On 18.11.2025 16:40, Tomas Vondra wrote:
>>>>> It'd need code in the parallel-aware scans, i.e. seqscan, bitmap, index.
>>>>> I don't think you'd need code in other plans, because all parallel plans
>>>>> have one "driving" table.
>>>
>>> You're assuming that the planner will insert Gather nodes at arbitrary
>>> places in the plan, which isn't true.  If it does generate plans that
>>> are problematic from this standpoint, maybe the answer is "don't
>>> parallelize in exactly that way".
>>>
>>
>> I think David has a point that nodes that "buffer" tuples (like Sort or
>> HashAgg) would break the approach making this the responsibility of the
>> parallel-aware scan. I don't see anything particularly wrong with such
>> plans - plans with partial aggregation often look like that.
>>
>> Maybe this should be the responsibility of execProcnode.c, not the
>> various nodes?
>>
> 
> I like that idea, even though it would still not work while a node is
> doing the crunching. That is after it has pulled all rows and before it
> can return the first row. During this time the node won't call
> ExecProcNode().
> 

True. Perhaps we could provide a function nodes could call in suitable
places to check whether to end?
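
To make the idea a bit more concrete, here is a rough standalone sketch of
such a check, assuming the leader sets a flag in shared state once it has
enough rows. Every name in it (SketchParallelState, sketch_request_stop,
sketch_should_stop, sketch_consume_input, SKETCH_CHECK_INTERVAL) is
hypothetical and not an existing PostgreSQL API; a real version would keep
the flag in dynamic shared memory and would probably be polled in the same
places as CHECK_FOR_INTERRUPTS().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical shared state. In PostgreSQL this would live in the dynamic
 * shared memory segment of the parallel query, not in a plain struct.
 */
typedef struct SketchParallelState
{
    atomic_bool stop_requested;     /* set once the leader needs no more rows */
} SketchParallelState;

/* Leader side: tell the workers they can stop producing rows. */
static inline void
sketch_request_stop(SketchParallelState *state)
{
    assert(state != NULL);
    atomic_store(&state->stop_requested, true);
}

/* Worker side: cheap check that a node can call in suitable places. */
static inline bool
sketch_should_stop(SketchParallelState *state)
{
    assert(state != NULL);
    return atomic_load(&state->stop_requested);
}

/* How many rows a node processes between two checks (arbitrary value). */
#define SKETCH_CHECK_INTERVAL 1024

/*
 * Stand-in for the input phase of a "buffering" node such as Sort or
 * HashAgg. It consumes up to ninput rows, checks the flag once every
 * SKETCH_CHECK_INTERVAL rows, and returns the number of rows consumed.
 */
static size_t
sketch_consume_input(SketchParallelState *state, size_t ninput)
{
    size_t      consumed = 0;

    while (consumed < ninput)
    {
        if (consumed % SKETCH_CHECK_INTERVAL == 0 && sketch_should_stop(state))
            break;              /* leader is done, give up early */

        consumed++;             /* a real node would fetch and buffer a tuple */
    }

    return consumed;
}
```

Polling only every SKETCH_CHECK_INTERVAL rows is just one way to keep the
check cheap inside a tight loop; the interval is a guess, not a measured
value.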

Actually, how does canceling queries with parallel workers work? Is that
done similarly to what your patch did?

> But that seems like an acceptable limitation. At least it keeps working
> above "buffer" nodes.
> 
> I'll give this idea a try. Then we can contrast this approach with the
> approach in my initial patch.
> 
>> It'd be nice to show this in EXPLAIN (that some of the workers were
>> terminated early, before processing all the data).
> 
> Inspectability on that end seems useful. Maybe only with VERBOSE,
> similarly to the extended per-worker information.
> 

Maybe, no opinion. But it probably needs to apply to all nodes in the
parallel worker, right? Or maybe it's even a per-worker detail.


regards

-- 
Tomas Vondra
