Re: Performance issues with parallelism and LIMIT

From: David Geier
Subject: Re: Performance issues with parallelism and LIMIT
Date:
Msg-id: 2d62033f-9475-1e4e-14e0-76e4b7b9b552@gmail.com
In response to: Re: Performance issues with parallelism and LIMIT (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses: Re: Performance issues with parallelism and LIMIT (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List: pgsql-hackers
Hi,

On 2/8/23 11:42, Tomas Vondra wrote:
> On 2/1/23 14:41, David Geier wrote:
>
> Yeah, this is a pretty annoying regression. We already can hit poor
> behavior when matching rows are not distributed uniformly in the tables
> (which is what LIMIT costing assumes), and this makes it more likely to
> hit similar issues. A bit like when doing many HTTP requests makes it
> more likely to hit at least one 99% outlier.
Are you talking about the use of ordering vs. filtering indexes in
queries that have both an ORDER BY and a filter (e.g. the planner picks
the ordering index, but all rows passing the filter are at the end of
the table, so the scan reads almost the whole table before the LIMIT is
satisfied)? If not, can you elaborate a bit more and maybe give an
example?
> No opinion on these options, but worth a try. Alternatively, we could
> try the usual doubling approach - start with a low threshold (and set
> the latch frequently), and then gradually increase it up to the 1/4.
>
> That should work both for queries expecting only a few rows and those
> producing a lot of data.

I was thinking about this variant as well. One more alternative would
be to set the leader's latch once a worker has produced 1/Nth of the
LIMIT, where N is the number of workers. Both variants still have
corner cases where the latch is set too late, but either would be much
better than what we have today.

I also did some profiling and - at least on my development laptop with
8 physical cores - the original example motivating the batching change
is slower with batching than with batching disabled by commenting out
the flush condition in shm_mq_sendv():

     if (force_flush || mqh->mqh_send_pending > (mq->mq_ring_size >> 2))

SET parallel_tuple_cost TO 0;
CREATE TABLE b (a int);
INSERT INTO b SELECT generate_series(1, 200000000);
ANALYZE b;
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM b;

 Gather  (cost=1000.00..1200284.61 rows=200375424 width=4) (actual rows=200000000 loops=1)
   Workers Planned: 7
   Workers Launched: 7
   ->  Parallel Seq Scan on b  (cost=0.00..1199284.61 rows=28625061 width=4) (actual rows=25000000 loops=8)

Always latch: 19055 ms
Batching:     19575 ms

If I find some time, I'll play around a bit more and maybe propose a patch.

>> ...
>>
>> We would need something similar to CHECK_FOR_INTERRUPTS() which returns
>> a NULL slot if a parallel worker is supposed to stop execution (we could
>> e.g. check if the queue got detached). Or could we amend
>> CHECK_FOR_INTERRUPTS() to just stop the worker gracefully if the queue
>> got detached?
>>
> That sounds reasonable, but I'm not very familiar with the
> leader-worker communication, so no opinion on how it should be done.

I think an extra macro that needs to be called from dozens of places to
check whether parallel execution is supposed to end is the least
preferable approach. I'll read up more on how CHECK_FOR_INTERRUPTS()
works and see whether we can instead actively signal the workers that
they should stop.
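
For reference, the macro variant discussed above could look roughly
like the sketch below. This is purely hypothetical: neither the macro
nor shm_mq_is_detached() exists today, and how the worker would then
unwind cleanly is exactly the open question.

    /*
     * Hypothetical: stop producing tuples once the leader has detached
     * from this worker's tuple queue. shm_mq_is_detached() would have
     * to be added; today a worker only notices the detach when it next
     * tries to send a tuple.
     */
    #define CHECK_FOR_QUEUE_DETACHED(mqh) \
        do { \
            if ((mqh) != NULL && shm_mq_is_detached(mqh)) \
                return NULL;    /* empty slot: end execution */ \
        } while (0)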

-- 
David Geier
(ServiceNow)
