Re: BUG #13528: LATERAL vs. correlated scalar subquery

From: Maxim Boguk
Subject: Re: BUG #13528: LATERAL vs. correlated scalar subquery
Msg-id: CAK-MWwTOHzLEKVF+FahoMv+-r1E99AEQigCMYfUcHGjm9K170w@mail.gmail.com
In reply to: BUG #13528: LATERAL vs. correlated scalar subquery  (marko@joh.to)
Responses: Re: BUG #13528: LATERAL vs. correlated scalar subquery  (Marko Tiikkaja <marko@joh.to>)
List: pgsql-bugs
On Thu, Jul 30, 2015 at 6:53 PM, <marko@joh.to> wrote:

> The following bug has been logged on the website:
>
> Bug reference:      13528
> Logged by:          Marko Tiikkaja
> Email address:      marko@joh.to
> PostgreSQL version: 9.4.4
> Operating system:   Linux
> Description:
>
> Hi,
>
> Observe the following case:
>
> =# create table data(a int, b int, primary key(a,b));
> CREATE TABLE
> =# insert into data select i, random() * 100 from generate_series(1, 100000) i;
> INSERT 0 100000
> =# create view counts as select a, count(*) from data group by a;
> CREATE VIEW
> =# explain analyze select u.elem, x.count from unnest(array[1]) u(elem),
> lateral (select counts.count from counts where counts.a = u.elem) x;
>                                                       QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------
>  Hash Join  (cost=1867.28..1873.03 rows=100 width=12) (actual time=69.858..77.021 rows=1 loops=1)
>    Hash Cond: (data.a = u.elem)
>    ->  HashAggregate  (cost=1865.03..1867.03 rows=200 width=4) (actual time=44.528..70.394 rows=100000 loops=1)
>          Group Key: data.a
>          ->  Seq Scan on data  (cost=0.00..1391.02 rows=94802 width=4) (actual time=0.013..8.586 rows=100000 loops=1)
>    ->  Hash  (cost=1.00..1.00 rows=100 width=4) (actual time=0.012..0.012 rows=1 loops=1)
>          Buckets: 1024  Batches: 1  Memory Usage: 1kB
>          ->  Function Scan on unnest u  (cost=0.00..1.00 rows=100 width=4) (actual time=0.010..0.011 rows=1 loops=1)
>  Planning time: 0.142 ms
>  Execution time: 77.551 ms
> (10 rows)
>
> Tweaking any of the enable_* parameters doesn't get me to the desired query
> produced by the old way of LATERALizing:
>
> =# explain analyze select u.elem, (select counts.count from counts where
> counts.a = u.elem) from unnest(array[1]) u(elem);
>                                                                   QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------------------------
>  Function Scan on unnest u  (cost=0.00..125498.75 rows=100 width=4) (actual time=0.037..0.038 rows=1 loops=1)
>    SubPlan 1
>      ->  Subquery Scan on counts  (cost=0.29..1254.98 rows=1 width=8) (actual time=0.024..0.024 rows=1 loops=1)
>            ->  GroupAggregate  (cost=0.29..1254.97 rows=1 width=4) (actual time=0.023..0.023 rows=1 loops=1)
>                  Group Key: data.a
>                  ->  Index Only Scan using data_pkey on data  (cost=0.29..1252.59 rows=474 width=4) (actual time=0.017..0.018 rows=1 loops=1)
>                        Index Cond: (a = u.elem)
>                        Heap Fetches: 1
>  Planning time: 0.125 ms
>  Execution time: 0.073 ms
> (10 rows)
>
> Is there some fundamental issue here which prevents the planner from
> producing the same plan?
>

Hi Marko,

You can see that the new plan has a lower estimated total cost than the old plan
(cost=1867.28..1873.03 vs. cost=0.00..125498.75).
I think that is the primary reason it was selected: the planner could produce the
old plan, but the new plan wins on cost.
The fundamental issue is that unnest is expected to return 100 rows by default,
instead of the 1 row in your case
(Function Scan on unnest u  (cost=0.00..1.00 rows=100 width=4) vs. (actual
time=0.010..0.011 rows=1 loops=1)).
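
A minimal sketch of how to see that default estimate (plain EXPLAIN, so only the
planner's guess is shown; the exact cost numbers will differ on your installation):

    -- The function scan over unnest() is assumed to return 100 rows,
    -- no matter how many elements the constant array actually has.
    EXPLAIN SELECT * FROM unnest(array[1]) AS u(elem);
    --  Function Scan on unnest u  (cost=0.00..1.00 rows=100 width=4)
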
If you try the same test with a 100-element array (see the sketch below), I think
the new plan/lateral query will be faster than the old one.
It's a somewhat fundamental issue with planning queries over unnest(constant
array) (I have been burned by it many times).
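
A minimal sketch of that re-test, assuming the same data/counts schema as in your
report; ARRAY(SELECT generate_series(...)) is just one way to build a 100-element
constant array:

    -- With 100 elements the planner's default 100-row estimate for unnest()
    -- matches the actual row count, so the hash-join plan should pay off.
    EXPLAIN ANALYZE
    SELECT u.elem, x.count
    FROM unnest(ARRAY(SELECT generate_series(1, 100))) AS u(elem),
         LATERAL (SELECT counts.count
                  FROM counts
                  WHERE counts.a = u.elem) AS x;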



-- 
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/ <http://www.postgresql-consulting.com/>

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com
МойКруг: http://mboguk.moikrug.ru/

"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."
