Обсуждение: Re: [GENERAL] question (or feature-request): over ( partition by ... order by LIMIT N)
Re: [GENERAL] question (or feature-request): over ( partition by ... order by LIMIT N)
От
David Fetter
Дата:
On Wed, Mar 24, 2010 at 06:31:59PM +0100, A. Kretschmer wrote: > Hello @all, > > I know, i can do: > > select * from (select ... row_number() over (...) ...) foo where > row_number < N > > to limit the rows per group, but the inner select has to retrieve > the whole set of records and in the outer select most of them > discarded. That sounds like the optimizer's falling down on the job. Would this be difficult to fix? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Thu, Mar 25, 2010 at 5:17 PM, David Fetter <david@fetter.org> wrote: > On Wed, Mar 24, 2010 at 06:31:59PM +0100, A. Kretschmer wrote: >> Hello @all, >> >> I know, i can do: >> >> select * from (select ... row_number() over (...) ...) foo where >> row_number < N >> >> to limit the rows per group, but the inner select has to retrieve >> the whole set of records and in the outer select most of them >> discarded. > > That sounds like the optimizer's falling down on the job. Would this > be difficult to fix? I may not be the best person to offer an opinion on this topic, but it sounds tricky to me. I think it would need some kind of extremely specific special-case logic. The planner would have to recognize row_number() < n, row_number() <= n, and row_number = n as special cases indicating that n-1, n, and n records respectively should be expected to be fetched from the partition. And you might also worry about n > row_number(), and n >= row_number(). It might be worth doing because I suspect that is actually going to be a fairly common type of query, but some thought needs to be given to how to do it without resorting to abject kludgery. ...Robert
Re: Re: [GENERAL] question (or feature-request): over ( partition by ... order by LIMIT N)
От
Hitoshi Harada
Дата:
2010/3/26 David Fetter <david@fetter.org>: > On Wed, Mar 24, 2010 at 06:31:59PM +0100, A. Kretschmer wrote: >> Hello @all, >> >> I know, i can do: >> >> select * from (select ... row_number() over (...) ...) foo where >> row_number < N >> >> to limit the rows per group, but the inner select has to retrieve >> the whole set of records and in the outer select most of them >> discarded. > > That sounds like the optimizer's falling down on the job. Would this > be difficult to fix? I believe this isn't the task of window functions. In fact, "over( ... LIMIT n)" or optimizer hack will fail on multiple window definitions. To take top N items of each group (I agree this is quite common job), I'd suggest syntax that is done by extending DISTINCT ON. SELECT DISTINCT n ON(key1, key2) ... where "n" means top "n" items on each "key1, key2" group. The current DISTINCT ON() syntax is equivalent to DISTINCT 1 ON() in this way. That'll be fairly easy to implement and you aren't be bothered by this like multiple window definitions. The cons of this is that it can be applied to only row_number logic. You may want to use rank, dense_rank, etc. sometimes. Regards, -- Hitoshi Harada