Re: Selecting max(pk) is slow on empty set

Поиск
Список
Период
Сортировка
От Alexander Staubo
Тема Re: Selecting max(pk) is slow on empty set
Дата
Msg-id 88daf38c0801220516o7e4d828djda659033824a8aca@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Selecting max(pk) is slow on empty set  (Richard Huxton <dev@archonet.com>)
Ответы Re: Selecting max(pk) is slow on empty set  (Richard Huxton <dev@archonet.com>)
Список pgsql-general
On 1/22/08, Richard Huxton <dev@archonet.com> wrote:
> Hmm, but with an estimated cost of 3646 (vs.633 for the max(*) which
> uses the wrong index). That explains why it's walking backwards through
> the pkey index, it thinks that it's 8 times cheaper.
[...]
> Have a look at most_common_vals,most_common_freqs in pg_stats for
> tbl=user_messages, att=user perhaps.

# select histogram_bounds from pg_stats where
tablename='user_messages' and attname='user_id';
                   histogram_bounds
-------------------------------------------------------
 {1,489,1097,1824,2555,3452,4488,5679,6879,8637,13448}

# select null_frac, n_distinct, most_common_vals, most_common_freqs
from pg_stats where tablename='user_messages' and attname='user_id';
 null_frac | n_distinct |                 most_common_vals
    |                                           most_common_freqs

-----------+------------+--------------------------------------------------+-------------------------------------------------------------------------------------------------------
         0 |       2652 |
{5826,484,1206,780,823,4085,4157,5852,1962,6453} |
{0.00933333,0.00766667,0.00666667,0.00633333,0.006,0.00566667,0.00566667,0.00533333,0.005,0.00466667}

> Then see if an ALTER TABLE SET
> STATISTICS 100 makes a difference.

So it does:

# explain analyze select max(id) from user_messages where user_id = 13604;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1320.52..1320.53 rows=1 width=4) (actual
time=13.640..13.640 rows=1 loops=1)
   ->  Index Scan using user_messages_user on user_messages
(cost=0.00..1319.62 rows=358 width=4) (actual time=13.631..13.631
rows=0 loops=1)
         Index Cond: (user_id = 13604)
 Total runtime: 13.712 ms

Thank you! That solves my performance problem, at least.

But it's worrying that PostgreSQL should be so off in planning the
query. Does this behaviour qualify as a bug, or is this -- that is,
the need to tweak statistics parameters -- just your garden-variety
application-specific optimization?

Alexander.

В списке pgsql-general по дате отправления:

Предыдущее
От: "Josh Harrison"
Дата:
Сообщение: Re: Online Oracle to Postgresql data migration
Следующее
От: "Rhys Stewart"
Дата:
Сообщение: Re: (un)grouping question