Re: limit clause produces wrong query plan

Поиск
Список
Период
Сортировка
От Scott Carey
Тема Re: limit clause produces wrong query plan
Дата
Msg-id BDFBB77C9E07BE4A984DAAE981D19F961ACA1F1765@EXVMBX018-1.exch018.msoutlookonline.net
обсуждение исходный текст
Ответ на Re: limit clause produces wrong query plan  ("Scott Marlowe" <scott.marlowe@gmail.com>)
Список pgsql-performance
I believe his earlier (original) complaint was that it was slower with the LIMIT than with no limit.  As in, the select
(*)query was faster to get the whole thing than apply the limit.  Wherever that is the case, it is broken. 

Certainly a complex join makes this more difficult, but one would agree that with a LIMIT it should never be slower
thanwithout, right? 

Any LIMIT combined with an ORDER by at the last stage can't be completed without fetching every row and sorting.  If
thereis an index on the column being sorted at each table in the join it is possible to short-circuit the plan after
enoughresult rows are found, but this would have to iterate on each arm of the join until there were enough matches,
andpostgres does not have any such iterative query strategy that I'm aware of. 
For a single large, indexed table with no joins it certainly makes sense to terminate the index scan early.

Often, one is better off with explicit subselects for the join branches with explicit LIMIT on each of those (oversized
tocompensate for the likelihood of a match) but you have to be willing to get less than the LIMIT ammount if there are
notenough matches, and that is usually bad for 'paging' style queries rather than 'report on top X' queries where the
prematuretruncation is probably ok and is enough to make the query fast.  But if you have very large datasets on the
querybranch arms, asking for only 10K of 300K rows to sort and then match against 10K in another arm, and find 500
items,can be a huge gain versus doing the entire thing. 


-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Scott Marlowe
Sent: Monday, November 24, 2008 11:23 AM
To: Andrus
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] limit clause produces wrong query plan

On Mon, Nov 24, 2008 at 10:26 AM, Andrus <kobruleht2@hot.ee> wrote:
>> it was veery fast. To be honest I do not know what is happening?!
>
> This is really weird.
> It seems that PostgreSql OFFSET / LIMIT are not optimized and thus typical
> paging queries

And how exactly should it be optimized?  If a query is even moderately
interesting, with a few joins and a where clause, postgresql HAS to
create the rows that come before your offset in order to assure that
it's giving you the right rows.

> SELECT ... FROM bigtable ORDER BY intprimarykey OFFSET pageno*100 LIMIT 100

This will get progressively slower as pageno goes up.

> SELECT ... FROM bigtable ORDER BY intprimarykey OFFSET 0 LIMIT 100

That should be plenty fast.

> cannot be used in PostgreSql at all for big tables.

Can't be used in any real database with any real complexity to its query either.

> Do you have any idea how to fix this ?

A standard workaround is to use some kind of sequential, or nearly so,
id field, and then use between on that field.

select * from table where idfield between x and x+100;

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

В списке pgsql-performance по дате отправления:

Предыдущее
От: PFC
Дата:
Сообщение: Re: limit clause produces wrong query plan
Следующее
От: Greg Smith
Дата:
Сообщение: Re: Monitoring buffercache...