Re: select count(distinct ...) is slower than select distinct in about 5x

Поиск

Список

Период

Сортировка

От	Jeff Janes
Тема	Re: select count(distinct ...) is slower than select distinct in about 5x
Дата	10 декабря 2013 г. 21:19:52
Msg-id	CAMkU=1wFdvb76NWVw=zLKF3H6-tn=+z9o_Nv2oioWfy6Z8Xo0A@mail.gmail.com обсуждение исходный текст
Ответ на	select count(distinct ...) is slower than select distinct in about 5x (jacket41142 <jacket41142@gmail.com>)
Список	pgsql-performance

Дерево обсуждения

On Tue, Dec 10, 2013 at 9:28 AM, jacket41142 <jacket41142@gmail.com> wrote:

test=> select distinct col_int from t1 group by col_int;
Time: 1177.936 ms

So the performance difference is not very large.
But when I do that:

test=> select count(distinct col_int) from t1;
count
-------
1025
(1 row)

Time: 7367.476 ms

count(distinct ...) always sorts, rather than using a hash, to do its work. I don't think that there is any fundamental reason that it could not be changed to allow it to use hashing, it just hasn't been done yet. It is complicated by the fact that you can have multiple count() expressions in the same query which demand sorting/grouping on different columns.

Cheers,

Jeff

В списке pgsql-performance по дате отправления:

Предыдущее

От: Kevin Grittner
Дата: 10 декабря 2013 г., 21:16:46
Сообщение: Re: select count(distinct ...) is slower than select distinct in about 5x

Следующее

От: Merlin Moncure
Дата: 10 декабря 2013 г., 23:03:16
Сообщение: Re: Parallel Select query performance and shared buffers

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: select count(distinct ...) is slower than select distinct in about 5x

Предыдущее

Следующее