Re: Window functions and index usage

Поиск
Список
Период
Сортировка
От Robert Klemme
Тема Re: Window functions and index usage
Дата
Msg-id CAM9pMnPOkMLCH6b8ToHY9ui-6LFKmwXZNzPNyLoDEVRzB-fQZg@mail.gmail.com
обсуждение исходный текст
Ответ на Window functions and index usage  (Anssi Kääriäinen <anssi.kaariainen@thl.fi>)
Ответы Re: Window functions and index usage  (Anssi Kääriäinen <anssi.kaariainen@thl.fi>)
Список pgsql-performance
On Tue, Oct 4, 2011 at 11:39 AM, Anssi Kääriäinen
<anssi.kaariainen@thl.fi> wrote:
> I have the following setup:
>
> create table test(id integer, seq integer);
> insert into test select generate_series(0, 100), generate_series(0, 1000);
> create unique index test_idx on test(id, seq);
> analyze test;
>
> Now I try to fetch the latest 5 values per id, ordered by seq from the
> table:
>
> select * from (
> select id, seq, row_number() over (partition by id order by seq)
>  from test
>  where id in (1, 2, 3)
> ) where row_number() <= 5;

It seems this fetches the *first* 5 values per id - and not the
latest.  For that you would need to "order by seq desc" in the window
and probably also in the index.

> This does not use the index on id, seq for sorting the data. It uses a
> bitmap index scan, and sequential scan when issued SET enable_bitmapscan to
> false. Tested both on git head and 8.4.8. See end of post for plans. It
> seems it would be possible to fetch the first values per id using the index,
> or at least skip the sorting.

Just guessing: since row_number is an analytic function and it can be
combined with any type of windowing only in "rare" cases do the
ordering of index columns and the ordering in the window align.  So
while your particular use case could benefit from this optimization
the overall judgement might be that it's not worthwhile to make the
optimizer more complex to cover this case - or I fail to see the more
general pattern here. :-)

> Is there some other way to run the query so that it would use the index? Is
> there plans to support the index usage for the above query assuming that it
> is possible to use the index for that query?
>
> The real world use case would be to fetch latest 5 threads per discussion
> forum in one query. Or fetch 3 latest comments for all patches in given
> commit fest in single query.

Is it really that realistic that someone wants the latest n entries
for *all* threads / patches?  It seems since this can result in a very
large data set this is probably not the type of query which is done
often.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

В списке pgsql-performance по дате отправления:

Предыдущее
От: Scott Marlowe
Дата:
Сообщение: Re: pkey is not used on productive database
Следующее
От: Anssi Kääriäinen
Дата:
Сообщение: Re: Window functions and index usage