Re: very slow queries when max_parallel_workers_per_gather is higherthan zero

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: very slow queries when max_parallel_workers_per_gather is higherthan zero
Дата
Msg-id 8f788568-cb24-9eaf-6772-3d6a7ce5e02b@2ndquadrant.com
обсуждение исходный текст
Ответ на very slow queries when max_parallel_workers_per_gather is higher than zero  (Pavel Stehule <pavel.stehule@gmail.com>)
Список pgsql-hackers
Thanks. So we now have a trivial query demonstrating the issue. IMHO
this is not really a costing issue, but due to a misestimate.

Essentially, the problem is that the two sides of the join mismatch,
causing this:

    ->  Bitmap Heap Scan on dwh_dm ... d  (... rows=7 width=4) (...)

    ->  Bitmap Heap Scan on f_ticketupdate_aad5jtwal0ayaax f
          (cost=1633.90..214617.67 rows=87472 width=4)
          (actual time=0.003..0.003 rows=0 loops=7)
        Recheck Cond: (dt_event_id = d.id)

        ->  Bitmap Index Scan on f_ticketupdate_aad5jtwal0ayaa ...
              (cost=0.00..1612.03 rows=87472 width=0)
              (actual time=0.003..0.003 rows=0 loops=7)
            Index Cond: (dt_event_id = d.id)

I.e. the database believes the bitmap index scan will match 87k rows.
But in fact it matches 0, which makes the bitmap heap scan entirely
unnecessary (thus costing nothing, because it's skipped).

Of course, the parallel plan is structured slightly differently, and
does not allow this skipping because it places the f_ table on the outer
side of the join (and scans it using sequential scan).

Now, try changing the parameters (particularly id_euweek) so that the
bitmap index scan actually matches something. I'm pretty sure that will
make the non-parallel case much more expensive.

Increasing the parallel_setup_cost makes the parallel plan a bit more
expensive, enough to switch to the non-parallel plan. But that's mostly
a fluke and not particularly principled way to fix this - if the cost
difference gets a bit larger (or if you increase the number of parallel
workers) it's probably going to use the parallel plan again.

Obviously, PostgreSQL 9.5 doesn't have parallel queries, so it does not
have a chance of making this mistake.

regards
-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: ON CONFLICT DO UPDATE for partitioned tables
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: Instability in partition_prune test?