Re: Evaluation of secondary sort key.

Поиск
Список
Период
Сортировка
От Jesper Krogh
Тема Re: Evaluation of secondary sort key.
Дата
Msg-id 4DABCB5B.4080606@krogh.cc
обсуждение исходный текст
Ответ на Re: Evaluation of secondary sort key.  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Evaluation of secondary sort key.  (Greg Stark <gsstark@mit.edu>)
Список pgsql-hackers
On 2011-04-09 18:54, Tom Lane wrote:
> I think that would be a positive disimprovement.  The current design
> guarantees that volatile sort expressions are evaluated exactly once,
> in the order the rows are read from the data source.  There would be no
> guarantees at all, either as to the number of evaluations or the order
> in which they happen, if we tried to do evaluation only during the
> actual sort.
If I could "only evaluate" the rows needed, then that would also
be a huge win, below case shifts what "definitely shouldn't be a seqscan"
into one due to a secondary sort key.

> Another small problem is that any such thing would require carrying
> along some kind of closure (ie, the expression and all its input
> values), not just the final sort key value, in tuples being sorted.
> The ensuing complexity, speed penalty, and increase in data volume
> to be sorted would be paid by everybody, making this probably a net
> performance loss when considered across all applications.
Getting the value for the first sortkey and carrying on a closure
for the rest would mostly (very often) be "optimal" ?

It would also enable a select that has to sortkeys to utilize an
index that only contains the primary sortkey, which is a huge
negative effect of what's being done today.

2011-04-18 07:12:43.931 testdb=# explain select id from testdb.testtable 
order by id limit 500;                                             QUERY PLAN
---------------------------------------------------------------------------------------------------- Limit
(cost=0.00..262.75rows=500 width=4)   ->  Index Scan using testtable_pkey on testtable  
 
(cost=0.00..15015567.84 rows=28573400 width=4)
(2 rows)

Time: 1.363 ms
2011-04-18 07:12:52.498 testdb=# explain select id from testdb.testtable 
order by id,name limit 500;                                    QUERY PLAN
----------------------------------------------------------------------------------- Limit  (cost=5165456.70..5165457.95
rows=500width=35)   ->  Sort  (cost=5165456.70..5236890.20 rows=28573400 width=35)         Sort Key: id, name
-> Seq Scan on testtable  (cost=0.00..3741675.00 
 
rows=28573400 width=35)
(4 rows)

Time: 1.420 ms

Enabling any users to sort using multiple keys, without ending in Seq 
Scans somewhere down
the line seems to require multi dimension indexes on all combinations of 
colums

-- 
Jesper


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Greg Smith
Дата:
Сообщение: Re: Formatting Curmudgeons WAS: MMAP Buffers
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Formatting Curmudgeons WAS: MMAP Buffers