Re: Query optimization using order by and limit

Поиск
Список
Период
Сортировка
От Stephen Frost
Тема Re: Query optimization using order by and limit
Дата
Msg-id 20110922145319.GJ12765@tamriel.snowman.net
обсуждение исходный текст
Ответ на Re: Query optimization using order by and limit  (Michael Viscuso <michael.viscuso@getcarbonblack.com>)
Ответы Re: Query optimization using order by and limit  (Michael Viscuso <michael.viscuso@getcarbonblack.com>)
Re: Query optimization using order by and limit  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
* Michael Viscuso (michael.viscuso@getcarbonblack.com) wrote:
> Adding the final condition hosts_guid = '2007075705813916178' is what
> ultimately kills it http://explain.depesz.com/s/8zy.  By adding the
> host_guid, it spends considerably more time in the older tables than
> without this condition and I'm not sure why.

What I think is happening here is that PG is pushing down that filter
(not typically a bad thing..), but with that condition, it's going to
scan the index until it finds a match for that filter before returning
back up only to have that result cut out due to the limit.  Having it as
numerics isn't helping here, but the bigger issue is having to check all
those tuples for a match to the filter.

Mike, the filter has to be applied before the order by/limit, since
those clauses come after the filter has been applied (you wouldn't want
a 'where x = 2 limit 10' to return early just because it found 10
records where x didn't equal 2).

What would be great is if PG would realize that the CHECK constraints
prevent earlier records from being in these earlier tables, so it
shouldn't need to consider them at all once the records from the
'latest' table has been found and the limit reached (reverse all this
for an 'ascending' query, of course), which we can do when there's no
order by.  I don't believe we have that kind of logic or that
information available at this late stage- the CHECK constraints are used
to eliminate the impossible-to-match tables, but that's it.

One option, which isn't great of course, would be to implement your own
'nested loop' construct (something I typically despise..) in the
application which just walks backwards from the latest and pulls
whatever records it can from each day and then stops once it hits the
limit.

    Thanks,

        Stephen

Вложения

В списке pgsql-performance по дате отправления:

Предыдущее
От: Michael Viscuso
Дата:
Сообщение: Re: Query optimization using order by and limit
Следующее
От: Jonathan Bartlett
Дата:
Сообщение: Optimizing Trigram searches in PG 9.1