Re: Optimizer on sort aggregate

Поиск

Список

Период

Сортировка

От	David Rowley
Тема	Re: Optimizer on sort aggregate
Дата	18 октября 2014 г. 02:53:38
Msg-id	CAApHDvqfi5QKc8HCNmrE5vr=0dsE6DwuPPYK3pZuiYf2yVPLTQ@mail.gmail.com обсуждение исходный текст
Ответ на	Re: Optimizer on sort aggregate (Tatsuo Ishii <ishii@postgresql.org>)
Список	pgsql-hackers

Дерево обсуждения

On Sat, Oct 18, 2014 at 12:35 PM, Tatsuo Ishii <ishii@postgresql.org> wrote:

> The query,
> select count(distinct j) from t group by t, i;
>
> runs for 35 seconds. However, if I change the query to,
> select count(distinct j) from t group by i, t; -- note switching the
> ordering
> select count(distinct j) from t group by decode(t, 'escape'), i; -- convert
> t to bytea
>
> Run times are just about 5 and 6.5 seconds. The reason is clear, compare a
> string with collation is slow, which is well understood by pg hackers.
> However, here, the sorting order is forced by the planner, not user.
> Planner can do the following optimizations,

Interesting. I got following result:

test=# explain analyze select count(distinct j) from t group by t, i;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=137519.84..157519.84 rows=1000000 width=22) (actual time=1332.937..2431.238 rows=1000000 loops=1)
Group Key: t, i
-> Sort (cost=137519.84..140019.84 rows=1000000 width=22) (actual time=1332.922..1507.413 rows=1000000 loops=1)
Sort Key: t, i
Sort Method: external merge Disk: 33232kB
-> Seq Scan on t (cost=0.00..17352.00 rows=1000000 width=22) (actual time=0.006..131.406 rows=1000000 loops=1)
Planning time: 0.031 ms
Execution time: 2484.271 ms
(8 rows)

Time: 2484.520 ms

test=# explain analyze select count(distinct j) from t group by i, t;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=137519.84..157519.84 rows=1000000 width=22) (actual time=602.510..1632.087 rows=1000000 loops=1)
Group Key: i, t
-> Sort (cost=137519.84..140019.84 rows=1000000 width=22) (actual time=602.493..703.274 rows=1000000 loops=1)
Sort Key: i, t
Sort Method: external sort Disk: 33240kB
-> Seq Scan on t (cost=0.00..17352.00 rows=1000000 width=22) (actual time=0.014..129.213 rows=1000000 loops=1)
Planning time: 0.176 ms
Execution time: 1685.575 ms
(8 rows)

Time: 1687.641 ms

Not so big difference here (maybe because I use SSD) but there is
still about 50% difference in execution time. Note that I disable
locale support.

I think this is more likely your locale settings, as if I do:

create table t(i int, j int, k int, t text collate "C");

The GROUP BY t,i runs about 25% faster.

I've not looked at it yet, but Peter G's patch here https://commitfest.postgresql.org/action/patch_view?id=1462 will quite likely narrow the performance gap between the 2 queries.

Regards

David Rowley

В списке pgsql-hackers по дате отправления:

Предыдущее

От: David Rowley
Дата: 18 октября 2014 г., 02:36:35
Сообщение: Re: Optimizer on sort aggregate

Следующее

От: Stephen Frost
Дата: 18 октября 2014 г., 03:03:25
Сообщение: Re: Materialized views don't show up in information_schema

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Optimizer on sort aggregate

Предыдущее

Следующее