Re: DISTINCT with btree skip scan

Поиск
Список
Период
Сортировка
От Thomas Munro
Тема Re: DISTINCT with btree skip scan
Дата
Msg-id CAEepm=0n0rxGWQH13pr3qD4z5zFK=0RqnosjEaPGfcSq7EyxBA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: DISTINCT with btree skip scan  (Thomas Munro <munro@ip9.org>)
Ответы Re: DISTINCT with btree skip scan  (Thomas Munro <thomas.munro@enterprisedb.com>)
Список pgsql-hackers
On Sat, Nov 1, 2014 at 10:35 AM, Thomas Munro <munro@ip9.org> wrote:
> I am definitely interested in collaborating on a series of patches to
> implement various kinds of skip-based plans as seen in other RDBMSs if
> others think it could be useful.  I see skip-based DISTINCT as a good
> place to start.  (I suspect the semi-related skip scan plan (for the
> case when your WHERE clause doesn't have a qual for the leading
> column(s)) is much harder to plan and execute and I also suspect it's
> less generally useful).

There's an interesting thread about merge joins with a skip
optimisation[1] ("leapfrog trie-join", though I haven't read the paper
yet).  That reminded me to go and rebase this experimental patch which
is somewhat related.  Here it is, now split into two parts: one patch
to add an amskip operation, and another to consider using it to
implement DISTINCT when it was already has an index only scan path on
an index that supports skipping.  As you can see it's still sketchy
experiment-grade code, but it now has a first attempt at costing.

postgres=# select count(a) from foo;
┌─────────┐
│  count  │
├─────────┤
│ 5000000 │
└─────────┘
(1 row)

Time: 493.501 ms
postgres=# select distinct a from foo;
┌───┐
│ a │
├───┤
│ 0 │
│ 1 │
│ 2 │
│ 3 │
│ 4 │
│ 5 │
│ 6 │
│ 7 │
│ 8 │
│ 9 │
└───┘
(10 rows)

Time: 0.516 ms
postgres=# explain select distinct a from foo;

┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                     QUERY PLAN
                                               │

├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Index Only Scan for distinct values of leading 1 column(s) using
foo_pkey on foo  (cost=0.43..4.33 rows=10 width=4) │

└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(1 row)

Time: 0.378 ms

[1] https://www.postgresql.org/message-id/flat/CAFQtOwp4e1hU6gKi1R_nHgOLTzUJko8EcaC1FdAV0rQFo1ARzA%40mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Langote
Дата:
Сообщение: Re: int2vector and btree indexes
Следующее
От: Haribabu Kommi
Дата:
Сообщение: Re: New SQL counter statistics view (pg_stat_sql)