Re: Optimizing "top queries" ...

From	Gregory Stark
Subject	Re: Optimizing "top queries" ...
Date	December 6, 2006, 11:34:22
Msg-id	87fybtkphx.fsf@enterprisedb.com
In reply to	Re: Optimizing "top queries" ... (Markus Schiltknecht <markus@bluegap.ch>)
Responses	Re: Optimizing "top queries" ... Re: Optimizing "top queries" ...
List	pgsql-hackers

"Markus Schiltknecht" <markus@bluegap.ch> writes:

> Hi,
>
> Hans-Juergen Schoenig wrote:
>> in fact, the sort step is not necessary here as we could add a node which
>> buffers the highest 10 records and replaces them whenever a higher value is
>> returned from the underlying node (in this case seq scan).
>> this query is a quite common scenario when it comes to some analysis related
>> issues.
>> saving the sort step is an especially good idea when the table is very large.
>
> That sounds very much like what's known as 'partial sort', which has been
> proposed by Oleg and Theodor. AFAIK they had a trivial patch sometime around
> version 7.1, without integration into the planner and optimizer. They were
> talking about libpsort, but I can't find that currently. See archives [1] and
> [2].

I actually implemented it again a few months ago during the feature freeze. I
had a few questions but since it was the middle of the feature freeze I expect
people had other things on their minds.

It is an important form of query since it crops up any time you have a UI
(read: web page) with a paged result set. Currently Postgres has to gather up
all the records in the result set and sort them, which makes it compare poorly
against other databases popular with web site authors...
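
To make the idea concrete, here is a minimal sketch of the bounded buffer
described in the quoted proposal, written in Python rather than as executor
code. The function name top_n and the use of plain integers as rows are
assumptions for illustration only; this is not the patch itself.

```python
import heapq
from typing import Iterable, List


def top_n(rows: Iterable[int], n: int) -> List[int]:
    """Return the n largest values in descending order, keeping at most n in memory."""
    if n <= 0:
        return []
    heap: List[int] = []  # min-heap: heap[0] is the smallest value currently kept
    for value in rows:
        if len(heap) < n:
            # Buffer not full yet: keep every row.
            heapq.heappush(heap, value)
        elif value > heap[0]:
            # New row beats the weakest kept row: swap it in.
            heapq.heapreplace(heap, value)
    # Only the n survivors are ever sorted, not the whole input.
    return sorted(heap, reverse=True)


if __name__ == "__main__":
    print(top_n([5, 1, 9, 3, 7, 9, 2], 3))
```

Memory use stays proportional to the limit instead of the table size, and each
input row costs at most one heap operation.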

The open question in my patch was how to communicate the limit down to
the sort node. I had implemented it by having ExecLimit peek into the SortNode
and set a field there.
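
The "peek and set a field" approach can be sketched roughly as follows. This
is a toy model in Python, not executor code: the classes SortNode and
LimitNode, the bound field, and the rows() method are all hypothetical names
chosen for illustration.

```python
import heapq
import itertools
from typing import Iterable, Iterator, Optional


class SortNode:
    """Toy sort node: sorts its input in descending order."""

    def __init__(self, child: Iterable[int]) -> None:
        self.child = child
        # Hypothetical field: set by a parent Limit node, None means "sort everything".
        self.bound: Optional[int] = None

    def rows(self) -> Iterator[int]:
        if self.bound is None:
            return iter(sorted(self.child, reverse=True))
        # With a known bound, only the top `bound` rows need to be retained.
        return iter(heapq.nlargest(self.bound, self.child))


class LimitNode:
    """Toy limit node: returns at most `count` rows from its child."""

    def __init__(self, child: SortNode, count: int) -> None:
        self.child = child
        self.count = count
        # The "peek": if the child is a sort, tell it how many rows are needed.
        if isinstance(child, SortNode):
            child.bound = count

    def rows(self) -> Iterator[int]:
        return itertools.islice(self.child.rows(), self.count)


if __name__ == "__main__":
    plan = LimitNode(SortNode([4, 8, 1, 6]), 2)
    print(list(plan.rows()))
```

The drawback visible even in this toy version is that the sort's behaviour
depends on a field another node wrote into it, which is hard to reflect in
cost estimates made before execution.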

This alternative of making a whole new plan node may have more promise though.
It would make it easier to come up with reasonable cost estimates.

One thing to keep in mind though is that I also wanted to cover the case of
Unique(Sort(...)) and Limit(Unique(Sort(...))) which can throw away duplicates
earlier. Do we want three different plan nodes? Are there other cases like
these that can benefit?
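
For the Limit(Unique(Sort(...))) case, a rough sketch of discarding duplicates
early might look like this. It is an illustration under assumptions (the name
top_n_distinct, integer rows, descending order), not the design of any actual
plan node.

```python
import heapq
from typing import Iterable, List, Set


def top_n_distinct(rows: Iterable[int], n: int) -> List[int]:
    """Return the n largest distinct values in descending order."""
    if n <= 0:
        return []
    heap: List[int] = []   # min-heap of the distinct values kept so far
    kept: Set[int] = set()  # same values, for O(1) duplicate checks
    for value in rows:
        if value in kept:
            continue  # duplicate of a kept value: throw it away immediately
        if len(heap) < n:
            heapq.heappush(heap, value)
            kept.add(value)
        elif value > heap[0]:
            # Replace the smallest kept value. It cannot re-enter later,
            # because the heap minimum only ever increases.
            evicted = heapq.heapreplace(heap, value)
            kept.discard(evicted)
            kept.add(value)
    return sorted(heap, reverse=True)


if __name__ == "__main__":
    print(top_n_distinct([5, 9, 5, 7, 9, 1, 7, 3], 3))
```

Duplicates never occupy a slot in the buffer, so the limit counts distinct
values rather than raw rows.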

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com
