Re: Performance issues with parallelism and LIMIT

From: Tomas Vondra
Subject: Re: Performance issues with parallelism and LIMIT
Date:
Msg-id a371afc5-eb20-48d2-b6f3-01f976f9f922@vondra.me
In reply to: Re: Performance issues with parallelism and LIMIT  (David Geier <geidav.pg@gmail.com>)
List: pgsql-hackers
On 11/14/25 19:20, David Geier wrote:
> Hi Tomas!
> 
> On 13.11.2025 23:36, Tomas Vondra wrote:
>> ...
>>
>> Unfortunately, I don't think this patch is the way to go. When I apply
>> it, I get:
>>
>>    ERROR:  InstrEndLoop called on running node
>>    CONTEXT:  parallel worker
>>
> 
> Ooops. That can likely be fixed.
> 
>> And I very much doubt inventing a new ad hoc way to signal workers is
>> the right solution (even if there wasn't the InstrEndLoop issue).
>>
>> What I think we should do is much simpler - make the threshold in shm_mq
>> dynamic: start with a very low value and gradually ramp it up (up to 1/4
>> of the ring size). So we'd have
>>
>>    if (mqh->mqh_consume_pending > threshold)
>>
>> We might start with
>>
>>   threshold = (mq->mq_ring_size / 1024)
>>
>> or maybe some fixed value, like
>>
>>   threshold = 128
>>
>> And on every signal we'd double it, capping it to 1/4 of mq_ring_size.
>>
>>   threshold = Min(threshold * 2, mq->mq_ring_size / 4);
>>
>> This is very similar to other places doing this gradual ramp up, like in
>> the prefetching / read_stream, etc. It allows fast termination for low
>> LIMIT values, but quickly amortizes the cost for high LIMIT values.
> 
> That's a different problem though, isn't it? The original thread
> contained two problems: (1) signaling the queue too late and (2) workers
> stopping too late in the presence of LIMIT if they're not finding any
> rows in their portion of the data.
> 
> Changing the queue thresholds is a solution for (1) but not for (2). For
> (2) we need a mechanism to instruct the parallel workers to stop when we
> find that other parallel workers have already produced enough rows to
> answer the query.
> 

Good point, I completely forgot about (2).
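
To make (1) a bit more concrete, here's roughly the ramp-up I have in
mind (untested sketch; the mqh_consume_threshold field is made up, the
rest mirrors what shm_mq.c already does):

    /* new shm_mq_handle field (hypothetical), initialized to e.g. 128 */
    Size        mqh_consume_threshold;

    /* in shm_mq_receive(), replacing the fixed mq_ring_size / 4 test */
    if (mqh->mqh_consume_pending > mqh->mqh_consume_threshold)
    {
        /* report the consumed bytes to the sender (may set its latch) */
        shm_mq_inc_bytes_read(mq, mqh->mqh_consume_pending);
        mqh->mqh_consume_pending = 0;

        /* double the threshold, capped at 1/4 of the ring size */
        mqh->mqh_consume_threshold =
            Min(mqh->mqh_consume_threshold * 2, mq->mq_ring_size / 4);
    }

The first few tuples would get signaled almost immediately, while for
large result sets we'd quickly converge to the current behavior.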

> An alternative mechanism that might work is using some stop_worker
> boolean in shared memory that we check in CHECK_FOR_INTERRUPTS().
> stop_worker is set to true by the leader as soon as it has collected
> enough tuples. But then CHECK_FOR_INTERRUPTS() would have to have access
> to the parallel context, which might also be a bit ugly.
> 

Hmmm, yeah. We already have shared state for the parallel scan. Do you
think we could integrate this into that? The scan would then simply
appear "finished" to all the workers ...


regards

-- 
Tomas Vondra



