Re: Parallel query execution introduces performance regressions

From Peter Geoghegan
Subject Re: Parallel query execution introduces performance regressions
Msg-id CAH2-WzkdkPH8C+v533a7-kz+pMvhmEvypux-cZhNkP+xzL1j+g@mail.gmail.com
In response to Re: Parallel query execution introduces performance regressions  (Andres Freund <andres@anarazel.de>)
List pgsql-bugs
On Mon, Apr 1, 2019 at 12:00 PM Andres Freund <andres@anarazel.de> wrote:
> > Nested Loop Semi Join  (cost=0.00..90020417940.08 rows=30005835 width=8)
> > (actual time=0.034..24981.895 rows=90017507 loops=1)
> >   Join Filter: (ref_0.ol_d_id <= ref_1.i_im_id)
> >   ->  Seq Scan on order_line ref_0  (cost=0.00..2011503.04 rows=90017504
> > width=12) (actual time=0.022..7145.811 rows=90017507 loops=1)
> >   ->  Materialize  (cost=0.00..2771.00 rows=100000 width=4) (actual
> > time=0.000..0.000 rows=1 loops=90017507)
> >       ->  Seq Scan on item ref_1  (cost=0.00..2271.00 rows=100000 width=4)
> > (actual time=0.006..0.006 rows=1 loops=1)
>
> note the estimated rows=100000 vs the actual rows=1 in the seqscan /
> materialize. That's what makes the planner think this is much more
> expensive than it is, which in turn triggers the use of a parallel scan.

Yeah, I just noticed that. The sequential scan on the inner side of
the nestloop join is a problem.
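
To make the mismatch Andres points out easier to spot, here is a minimal sketch that pulls the estimated and actual row counts out of EXPLAIN ANALYZE text and flags nodes where they differ by a large factor. The function name `row_misestimates` and the 10x threshold are my own choices for illustration, not anything PostgreSQL provides, and the regex assumes each plan node sits on one unwrapped line. Keep in mind that actual rows are reported per loop, and that a semi join can stop reading its inner side at the first match, so a low actual count on the inner side may reflect early termination rather than bad statistics.

```python
import re

# The plan quoted above, with the mail client's line wrapping undone.
PLAN = """\
Nested Loop Semi Join  (cost=0.00..90020417940.08 rows=30005835 width=8) (actual time=0.034..24981.895 rows=90017507 loops=1)
  Join Filter: (ref_0.ol_d_id <= ref_1.i_im_id)
  ->  Seq Scan on order_line ref_0  (cost=0.00..2011503.04 rows=90017504 width=12) (actual time=0.022..7145.811 rows=90017507 loops=1)
  ->  Materialize  (cost=0.00..2771.00 rows=100000 width=4) (actual time=0.000..0.000 rows=1 loops=90017507)
      ->  Seq Scan on item ref_1  (cost=0.00..2271.00 rows=100000 width=4) (actual time=0.006..0.006 rows=1 loops=1)
"""

# One plan node per line: name, estimated rows, actual rows (per loop), loops.
NODE_RE = re.compile(
    r"^[ \t]*(?:->[ \t]*)?(?P<node>.+?)[ \t]+"
    r"\(cost=\S+ rows=(?P<est>\d+) width=\d+\)[ \t]+"
    r"\(actual time=\S+ rows=(?P<act>\d+) loops=(?P<loops>\d+)\)",
    re.MULTILINE,
)


def row_misestimates(plan_text, threshold=10.0):
    """Return (node, estimated, actual, factor) for badly estimated nodes.

    factor is estimated / actual rows, with both clamped to at least 1
    so that a zero row count does not cause a division by zero.
    """
    flagged = []
    for m in NODE_RE.finditer(plan_text):
        est = int(m.group("est"))
        act = int(m.group("act"))
        factor = max(est, 1) / max(act, 1)
        # Flag both over-estimates and under-estimates.
        if factor >= threshold or factor <= 1.0 / threshold:
            flagged.append((m.group("node"), est, act, factor))
    return flagged


for node, est, act, factor in row_misestimates(PLAN):
    print(f"{node}: estimated {est}, actual {act} per loop, factor {factor:g}")
```

On this plan the sketch flags only the Materialize node and the Seq Scan on item beneath it, both estimated at 100000 rows against 1 actual row per loop. The outer scan and the join itself stay within the threshold.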

More generally, as somebody familiar with the TPC-C schema, I cannot
make sense of the query itself. Why would anybody want to join "Image
ID associated to Item" (i_im_id) from the item table to the district
column (ol_d_id) of the order_line table? It simply makes no sense.

-- 
Peter Geoghegan


