Re: Selecting max(pk) is slow on empty set

Поиск
Список
Период
Сортировка
От Richard Huxton
Тема Re: Selecting max(pk) is slow on empty set
Дата
Msg-id 4795FA5B.8030003@archonet.com
обсуждение исходный текст
Ответ на Re: Selecting max(pk) is slow on empty set  ("Alexander Staubo" <alex@purefiction.net>)
Ответы Re: Selecting max(pk) is slow on empty set  ("Alexander Staubo" <alex@purefiction.net>)
Список pgsql-general
Alexander Staubo wrote:
> On 1/22/08, Richard Huxton <dev@archonet.com> wrote:
>> Then see if an ALTER TABLE SET
>> STATISTICS 100 makes a difference.
>
> So it does:
>
> # explain analyze select max(id) from user_messages where user_id = 13604;
>
> QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=1320.52..1320.53 rows=1 width=4) (actual
> time=13.640..13.640 rows=1 loops=1)
>    ->  Index Scan using user_messages_user on user_messages
> (cost=0.00..1319.62 rows=358 width=4) (actual time=13.631..13.631
> rows=0 loops=1)
>          Index Cond: (user_id = 13604)
>  Total runtime: 13.712 ms
>
> Thank you! That solves my performance problem, at least.

Although the row-estimate still seems quite high. You might want to
increase it even further (maximum is 1000). If this is a common query,
I'd look at an index on (user,id) rather than just (user) perhaps.

> But it's worrying that PostgreSQL should be so off in planning the
> query. Does this behaviour qualify as a bug, or is this -- that is,
> the need to tweak statistics parameters -- just your garden-variety
> application-specific optimization?

Well, it's data-specific rather than application specific I suppose. The
  issue is that there is a cost to tracking 100 values and you don't
want to pay that on every column in every table. If user 13604 isn't in
the list of most-common users then all it can really do is fix an upper
bound on how many matches it can have. Of course you and I can reason
outside of the data and guess that manu users won't have more than a
handful of messages, but that's not something PG can do.

In theory, PG could auto-tune itself for various parameters. The problem
  then is, do you:
1. Learn constantly, meaning you constantly pay the cost of checking
your decisions and never get consistent plans.
2. Learn once, in which case a change in data frequencies or usage
patterns renders your learning out of date.

You might find http://pgfoundry.org/ useful with the fouine / pqa
projects to analyse query logs.

--
   Richard Huxton
   Archonet Ltd

В списке pgsql-general по дате отправления:

Предыдущее
От: "Rhys Stewart"
Дата:
Сообщение: Re: (un)grouping question
Следующее
От: "Alexander Staubo"
Дата:
Сообщение: Re: Selecting max(pk) is slow on empty set