Re: BUG #15001: planner cann't distinguish composite index?

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: BUG #15001: planner cann't distinguish composite index?
Дата
Msg-id CAA4eK1+LdJFwy5cOtmuz77anNg3CfeF0dS8=oXk5zEKOpJrXfw@mail.gmail.com
обсуждение исходный текст
Ответ на BUG #15001: planner cann't distinguish composite index?  (PG Bug reporting form <noreply@postgresql.org>)
Ответы Re: BUG #15001: planner cann't distinguish composite index?  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-bugs
On Tue, Jan 9, 2018 at 4:55 PM, PG Bug reporting form
<noreply@postgresql.org> wrote:
> The following bug has been logged on the website:
>
> Bug reference:      15001
> Logged by:          Zhou Digoal
> Email address:      digoal@126.com
> PostgreSQL version: 10.1
> Operating system:   centos 7.x x64
> Description:
>
> ```
> postgres=# create table tbl(c1 int, c2 int, c3 int);
> CREATE TABLE
> postgres=# create index idx_tbl on tbl (c1,c2);
> CREATE INDEX
> postgres=# insert into tbl select random()*100, random()*10 from
> generate_series(1,10000000);
> INSERT 0 10000000
>
> postgres=# explain select c1,c2 , count(*) from tbl group by c2,c1;
>                                                      QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------------
>  Finalize GroupAggregate  (cost=176259.41..176337.18 rows=1111 width=16)
>    Group Key: c2, c1
>    ->  Sort  (cost=176259.41..176276.08 rows=6666 width=16)
>          Sort Key: c2, c1
>          ->  Gather  (cost=175158.32..175836.03 rows=6666 width=16)
>                Workers Planned: 6
>                ->  Partial HashAggregate  (cost=174158.32..174169.43
> rows=1111 width=16)
>                      Group Key: c2, c1
>                      ->  Parallel Index Only Scan using idx_tbl on tbl
> (cost=0.43..161658.26 rows=1666675 width=8)
> (9 rows)
>
> postgres=# explain select c1,c2 , count(*) from tbl group by c1,c2;
>                                                   QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------
>  Finalize GroupAggregate  (cost=1000.53..176040.80 rows=1111 width=16)
>    Group Key: c1, c2
>    ->  Gather Merge  (cost=1000.53..175979.69 rows=6666 width=16)
>          Workers Planned: 6
>          ->  Partial GroupAggregate  (cost=0.43..174169.43 rows=1111
> width=16)
>                Group Key: c1, c2
>                ->  Parallel Index Only Scan using idx_tbl on tbl
> (cost=0.43..161658.26 rows=1666675 width=8)
> (7 rows)
> ```
>
> i need to set enable_sort=off, so planner can choose the same planner with
> c1,c2 and c2,c1 group by.
>
> ```
> postgres=# set enable_sort=off;
> SET
> postgres=# explain select c1,c2 , count(*) from tbl group by c2,c1;
>                                                   QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------
>  Finalize HashAggregate  (cost=175886.03..175897.14 rows=1111 width=16)
>    Group Key: c2, c1
>    ->  Gather  (cost=175158.32..175836.03 rows=6666 width=16)
>          Workers Planned: 6
>          ->  Partial HashAggregate  (cost=174158.32..174169.43 rows=1111
> width=16)
>                Group Key: c2, c1
>                ->  Parallel Index Only Scan using idx_tbl on tbl
> (cost=0.43..161658.26 rows=1666675 width=8)
> (7 rows)
> ```
>

I don't think the plan chosen is same.  If you notice, for c1,c2 the
plan chosen is Finalize GroupAggregate -> Gather Merge whereas for
c2,c1 (with sort off) it is Finalize HashAggregate -> Gather. There is
a lot of difference between both the plans.  The output by Gather
Merge is sorted, so you can directly use GroupAggregate whereas the
output of Gather is unsorted, so the final plan is HashAggregate.

Now, here one can wonder why the planner hasn't chosen the path
without Sort for c2,c1 even when enable_sort=on as the cost of that
plan is less.  If you see the cost difference of plans with sort
(total_cost - 176337.18) and without sort (total_cost - 175897.14), it
is marginal and planner thinks that they are fuzzily same.  The same
is true for startup costs as well.  Now, if both have same costs
(fuzzily), it gives preference to the sorted path.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


В списке pgsql-bugs по дате отправления:

Предыдущее
От: Devrim Gündüz
Дата:
Сообщение: Re: missing repodata forhttps://download.postgresql.org/pub/repos/yum/9.6/fedora/fedora-26-x86_64
Следующее
От: Andrey
Дата:
Сообщение: could not determine which collation to use for string comparison