Re: [HACKERS] Print correct startup cost for the group aggregate.

Поиск
Список
Период
Сортировка
От Ashutosh Bapat
Тема Re: [HACKERS] Print correct startup cost for the group aggregate.
Дата
Msg-id CAFjFpRfvjgxfTKux5q=xA3pj3a8W-1s0Kg=Oy7nNQvCg1QHgJA@mail.gmail.com
обсуждение исходный текст
Ответ на [HACKERS] Print correct startup cost for the group aggregate.  (Rushabh Lathia <rushabh.lathia@gmail.com>)
Ответы Re: [HACKERS] Print correct startup cost for the group aggregate.  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Re: [HACKERS] Print correct startup cost for the group aggregate.  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Thu, Mar 2, 2017 at 6:06 PM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
> Hi,
>
> While reading through the cost_agg() I found that startup cost for the
> group aggregate is not correctly assigned. Due to this explain plan is
> not printing the correct startup cost.
>
> Without patch:
>
> postgres=# explain select aid, sum(abalance) from pgbench_accounts where
> filler like '%foo%' group by aid;
>                                      QUERY PLAN
> -------------------------------------------------------------------------------------
>  GroupAggregate  (cost=81634.33..85102.04 rows=198155 width=12)
>    Group Key: aid
>    ->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
>          Sort Key: aid
>          ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89 rows=198155
> width=8)
>                Filter: (filler ~~ '%foo%'::text)
> (6 rows)
>
> With patch:
>
> postgres=# explain select aid, sum(abalance) from pgbench_accounts where
> filler like '%foo%' group by aid;
>                                      QUERY PLAN
> -------------------------------------------------------------------------------------
>  GroupAggregate  (cost=82129.72..85102.04 rows=198155 width=12)
>    Group Key: aid
>    ->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
>          Sort Key: aid
>          ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89 rows=198155
> width=8)
>                Filter: (filler ~~ '%foo%'::text)
> (6 rows)
>

The reason the reason why startup_cost = input_startup_cost and not
input_total_cost for aggregation by sorting is we don't need the whole
input before the Group/Agg plan can produce the first row. But I think
setting startup_cost = input_startup_cost is also not exactly correct.
Before the plan can produce one row, it has to transit through all the
rows belonging to the group to which the first row belongs. On an
average it has to scan (total number of rows)/(number of groups)
before producing the first aggregated row. startup_cost will be
input_startup_cost + cost to scan (total number of rows)/(number of
groups) rows + cost of transiting over those many rows. Total cost =
startup_cost + cost of scanning and transiting through the remaining
number of input rows.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: amul sul
Дата:
Сообщение: Re: [HACKERS] [POC] hash partitioning
Следующее
От: Robert Haas
Дата:
Сообщение: Re: [HACKERS] Parallel bitmap heap scan