Re: DISTINCT with btree skip scan

Поиск

Список

Период

Сортировка

От	David Rowley
Тема	Re: DISTINCT with btree skip scan
Дата	27 октября 2014 г. 10:24:30
Msg-id	CAApHDvow8-DokZExZwHB_rJ9euT9NGgxX7+Zuy0AKHpcueF1Qg@mail.gmail.com обсуждение исходный текст
Ответ на	DISTINCT with btree skip scan (Thomas Munro <munro@ip9.org>)
Ответы	Re: DISTINCT with btree skip scan
Список	pgsql-hackers

Дерево обсуждения

On Sat, Jul 5, 2014 at 12:17 PM, Thomas Munro <munro@ip9.org> wrote:

postgres=# set enable_hashagg = false;
SET
Time: 0.302 ms
postgres=# explain select distinct a from foo;
┌─────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
├─────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Index Only Scan for distinct prefix 1 using foo_pkey on foo (cost=0.43..263354.20 rows=10 width=4) │
│ Planning time: 0.063 ms │
└─────────────────────────────────────────────────────────────────────────────────────────────────────┘
(2 rows)

Time: 0.443 ms
postgres=# select distinct a from foo;
┌───┐
│ a │
├───┤
│ 0 │
│ 1 │
│ 2 │
│ 3 │
│ 4 │
│ 5 │
│ 6 │
│ 7 │
│ 8 │
│ 9 │
└───┘
(10 rows)

Time: 0.565 ms

Hi Thomas,

I've had a quick look at this and it seems like a great win! I'm quite surprised that we've not got this already. I think this technology could also really help performance of queries such as SELECT * from bigtable bt WHERE EXISTS(SELECT 1 FROM otherbigtable obt WHERE bt.somecol = obt.someIndexedButNonUniqueColumn); I know you're not proposing to improve that first off, but it could be done later once the benefits of this are more commonly realised.

I think our shortfalls in this area have not gone unnoticed. I was reading this post https://www.periscope.io/blog/count-distinct-in-mysql-postgres-sql-server-and-oracle.html about comparing performance of COUNT(DISTINCT col). I think this would give a big win for query 3 in that post. I'm trying to find some time make some changes to transform queries to allow the group by to happen before the joins when possible, so between that and your patch we'd be 2 steps closer to making query 1 in the link above perform a little better on PostgreSQL.

Do you think you'll manage to get time to look at this a bit more? I'd be keen to look into the costing side of it if you think that'll help you any?

Regards

David Rowley

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Ali Akbar
Дата: 27 октября 2014 г., 10:13:08
Сообщение: Re: Function array_agg(array)

Следующее

От: "Amit Langote"
Дата: 27 октября 2014 г., 11:00:01
Сообщение: Re: On partitioning

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: DISTINCT with btree skip scan

Предыдущее

Следующее