Re: SELECT MIN, MAX took longer time than SELECT

От: K C Lau
Тема: Re: SELECT MIN, MAX took longer time than SELECT
Дата: ,
Msg-id: 6.2.1.2.0.20060121211310.08cc91c8@localhost
(см: обсуждение, исходный текст)
Ответ на: Re: SELECT MIN, MAX took longer time than SELECT COUNT, MIN, MAX  (Tom Lane)
Список: pgsql-performance

Скрыть дерево обсуждения

SELECT MIN, MAX took longer time than SELECT COUNT, MIN, MAX  (K C Lau, )
 Re: SELECT MIN, MAX took longer time than SELECT COUNT, MIN, MAX  ("Jim C. Nasby", )
  Re: SELECT MIN, MAX took longer time than SELECT COUNT, MIN, MAX  (Tom Lane, )
   Re: Stored procedures  (Rikard Pavelic, )
   Re: SELECT MIN, MAX took longer time than SELECT  (K C Lau, )
  Re: SELECT MIN, MAX took longer time than SELECT  (K C Lau, )

I have worked round the issue by using 2 separate queries with the LIMIT
construct.

LogSN and create_time are indeed directly correlated, both monotonously
increasing, occasionally with multiple LogSN's having the same create_time.

What puzzles me is why the query with COUNT, MIN, MAX uses idx_logtime for
the scan, but the query without the COUNT uses pk_log and takes much
longer. If it had chosen idx_logtime instead, then it should have returned
immediately for both MIN and MAX.

Best regards,
KC.

At 02:51 06/01/21, Tom Lane wrote:
>"Jim C. Nasby" <> writes:
> > On Fri, Jan 20, 2006 at 12:35:36PM +0800, K C Lau wrote:
> > Here's the problem... the estimate for the backwards index scan is *way*
> > off:
>
> >> ->  Limit  (cost=0.00..1.26 rows=1 width=4) (actual
> >> time=200032.928..200032.931 rows=1 loops=1)
> >> ->  Index Scan Backward using pk_log on
> >> log  (cost=0.00..108047.11 rows=86089 width=4) (actual
> >> time=200032.920..200032.920 rows=1 loops=1)
> >> Filter: (((create_time)::text < '2005/10/19'::text) AND
> >> (logsn IS NOT NULL))
> >> Total runtime: 200051.701 ms
>
>It's more subtle than you think.  The estimated rowcount is the
>estimated number of rows fetched if the indexscan were run to
>completion, which it isn't because the LIMIT cuts it off after the
>first returned row.  That estimate is not bad (we can see from the
>aggregate plan that the true value would have been 106708, assuming
>that the "logsn IS NOT NULL" condition isn't filtering anything).
>
>The real problem is that it's taking quite a long time for the scan
>to reach the first row with create_time < 2005/10/19, which is not
>too surprising if logsn is strongly correlated with create_time ...
>but in the absence of any cross-column statistics the planner has
>no very good way to know that.  (Hm ... but both of them probably
>also show a strong correlation to physical order ... we could look
>at that maybe ...)  The default assumption is that the two columns
>aren't correlated and so it should not take long to hit the first such
>row, which is why the planner likes the indexscan/limit plan.
>
>                         regards, tom lane



В списке pgsql-performance по дате сообщения:

От: Tom Lane
Дата:
Сообщение: Re: [GENERAL] Creation of tsearch2 index is very
От: Martijn van Oosterhout
Дата:
Сообщение: Re: [GENERAL] Creation of tsearch2 index is very slow