Re: Planning aggregates which require sorted or distinct

Поиск

Список

Период

Сортировка

От	Gavin Sherry
Тема	Re: Planning aggregates which require sorted or distinct
Дата	20 января 2007 г. 03:58:05
Msg-id	Pine.LNX.4.58.0701201543050.27649@linuxworld.com.au обсуждение исходный текст
Ответ на	Re: Planning aggregates which require sorted or distinct (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы	Re: Planning aggregates which require sorted or distinct (Tom Lane <tgl@sss.pgh.pa.us>) Re: Planning aggregates which require sorted or distinct ("Simon Riggs" <simon@2ndquadrant.com>)
Список	pgsql-hackers

Дерево обсуждения

On Fri, 19 Jan 2007, Tom Lane wrote:

> Gavin Sherry <swm@alcove.com.au> writes:
> > On Fri, 19 Jan 2007, Tom Lane wrote:
> >> Er, what primary key would that be exactly?  And even if you had a key,
> >> I wouldn't call joining on it trivial; I'd call it expensive ...
>
> > I should have used slightly different language. What I meant to say was,
> > both sets are primarily sorted by saledate so they can be merged back
> > together. This is why I said it was trivial.
>
> Ah, my misunderstanding.  Then isn't this basically isomorphic to what
> I was thinking of, ie, somewhat-smarter Aggref nodes attached to the
> existing GroupAggregate plan node?

Yep. I was thinking about this all morning. I think I've over engineered
the problem in my head. Window function input just looks like a slightly
more complex distinct aggregate input. I'll think on it more though.

To bring out a slightly different point -- and I know this is putting the
cart before the horse -- but window functions will (potentially) output
rows in the wrong order. I made a passing reference to this earlier. For
example, say we have a table employees with the following data:

empno | salary | age
====================
1     | 2000   | 50
2     | 6000   | 30
3     | 3000   | 20

We want to answer the following: for each employee: what is their rank in
terms of salary and what is their rank in terms of age. This query
answers that:

select empno, rank() over (order by salary) as srank, rank() over (order by age) as arank from employees order by
empno;

The result will be:

empno | salary | age
====================
1     | 1      | 3
2     | 3      | 2
3     | 2      | 1

Both window functions provide results based on the order of their input.
So, in terms of empno, srank will output in this order: empno = 1, 3, 2;
arank will output in this order: empno = 3, 2, 1. We need to glue these
back together and the only way I can think how is via a synthetic key.
Ideally, the planner would have some input on how to clue about how large
the result set will be and the orders from the window functions so that it
can decide whether to use nested loop, merge join or hash join to do it.

Can you think of a different approach?

Thanks,

Gavin

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Tom Lane
Дата: 20 января 2007 г., 03:40:44
Сообщение: Re: Planning aggregates which require sorted or distinct

Следующее

От: Dennis Bjorklund
Дата: 20 января 2007 г., 05:12:29
Сообщение: Re: savepoint improvements

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Planning aggregates which require sorted or distinct

Предыдущее

Следующее