Обсуждение: Planner making poor choices?

Поиск
Список
Период
Сортировка

Planner making poor choices?

От
Mike Benoit
Дата:
Postgres (v7.3.1) doesn't seem to making use of indexes when it clearly
is the proper choice to make. I've ran in to this problem several times,
but "alter table statistics" has always solved the problem. However it
didn't seem to help in this case.

Any ideas why Postgres refuses to use the index?

Summary: (details follow)
=====================================================================

vacuum analyze verbose mail_aliases;

explain analyze select count(*) from mail_aliases where account_id =
13275;
Time: 90.21 ms

set enable_seqscan = off;

explain analyze select count(*) from mail_aliases where account_id =
13275;
Time: 6.16 ms

ALTER table mail_aliases alter account_id set statistics 25;

vacuum analyze verbose mail_aliases;

explain analyze select count(*) from mail_aliases where account_id =
13275;
Time: ~90 ms

ALTER table mail_aliases alter account_id set statistics 100;

vacuum analyze verbose mail_aliases;

explain analyze select count(*) from mail_aliases where account_id =
13275;
Time: ~90 ms

ALTER table mail_aliases alter account_id set statistics 1000;

vacuum analyze verbose mail_aliases;

explain analyze select count(*) from mail_aliases where account_id =
13275;
Time: ~90 ms


Details:
=====================================================================
                            version
---------------------------------------------------------------
 PostgreSQL 7.3.1 on i386-pc-linux-gnu, compiled by GCC 2.95.4
(1 row)

userservices=# select count(*) from mail_aliases;
 count
-------
 49766
(1 row)

Time: 107.89 ms

userservices=# select count(*) from mail_aliases where account_id =
13275;
 count
-------
   624
(1 row)

Time: 90.26 ms

userservices=# vacuum analyze verbose mail_aliases;
INFO:  --Relation public.mail_aliases--
INFO:  Pages 789: Changed 0, Empty 0; Tup 49756: Vac 0, Keep 0, UnUsed
21811.
        Total CPU 0.00s/0.02u sec elapsed 0.01 sec.
INFO:  Analyzing public.mail_aliases
VACUUM
Time: 219.23 ms

userservices=# explain analyze select count(*) from mail_aliases where
account_id = 13275;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1412.77..1412.77 rows=1 width=0) (actual
time=88.42..88.42 rows=1 loops=1)
   ->  Seq Scan on mail_aliases  (cost=0.00..1410.95 rows=730 width=0)
(actual time=1.87..87.60 rows=624 loops=1)
         Filter: (account_id = 13275)
 Total runtime: 88.54 msec
(4 rows)

Time: 90.21 ms

userservices=# set enable_seqscan = off;
SET
Time: 1.03 ms

userservices=# explain analyze select count(*) from mail_aliases where
account_id = 13275;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=2012.53..2012.53 rows=1 width=0) (actual
time=4.19..4.19 rows=1 loops=1)
   ->  Index Scan using account_id_mail_aliases_key on mail_aliases
(cost=0.00..2010.70 rows=730 width=0) (actual time=0.08..3.24 rows=624
loops=1)
         Index Cond: (account_id = 13275)
 Total runtime: 4.28 msec
(4 rows)

Time: 6.16 ms

userservices=# ALTER table mail_aliases alter account_id set statistics
25;
ALTER TABLE
Time: 36.83 ms
userservices=# vacuum analyze verbose mail_aliases;
INFO:  --Relation public.mail_aliases--
INFO:  Pages 789: Changed 0, Empty 0; Tup 49756: Vac 0, Keep 0, UnUsed
21811.
        Total CPU 0.00s/0.01u sec elapsed 0.01 sec.
INFO:  Analyzing public.mail_aliases
VACUUM
Time: 495.19 ms

userservices=# set enable_seqscan = on;
SET
Time: 1.28 ms

userservices=# explain analyze select count(*) from mail_aliases where
account_id = 13275;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1412.33..1412.33 rows=1 width=0) (actual
time=90.31..90.32 rows=1 loops=1)
   ->  Seq Scan on mail_aliases  (cost=0.00..1410.95 rows=551 width=0)
(actual time=2.02..89.47 rows=624 loops=1)
         Filter: (account_id = 13275)
 Total runtime: 90.46 msec
(4 rows)

Time: 92.99 ms

userservices=# ALTER table mail_aliases alter account_id set statistics
100;
ALTER TABLE
Time: 22.61 ms

userservices=# vacuum analyze verbose mail_aliases;
INFO:  --Relation public.mail_aliases--
INFO:  Index mail_aliases_pkey: Pages 395; Tuples 49758: Deleted 2.
        CPU 0.00s/0.02u sec elapsed 0.01 sec.
INFO:  Index account_id_mail_aliases_key: Pages 334; Tuples 49758:
Deleted 2.
        CPU 0.00s/0.01u sec elapsed 0.01 sec.
INFO:  Index alias_mail_aliases_key: Pages 375; Tuples 49758: Deleted 2.
        CPU 0.00s/0.02u sec elapsed 0.01 sec.
INFO:  Removed 2 tuples in 1 pages.
        CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  Pages 789: Changed 0, Empty 0; Tup 49758: Vac 2, Keep 2, UnUsed
21807.
        Total CPU 0.00s/0.07u sec elapsed 0.06 sec.
INFO:  Analyzing public.mail_aliases
VACUUM
Time: 1915.31 ms

userservices=# explain analyze select count(*) from mail_aliases where
account_id = 13275;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1412.53..1412.53 rows=1 width=0) (actual
time=87.98..87.98 rows=1 loops=1)
   ->  Seq Scan on mail_aliases  (cost=0.00..1410.97 rows=622 width=0)
(actual time=1.85..87.17 rows=624 loops=1)
         Filter: (account_id = 13275)
 Total runtime: 88.09 msec
(4 rows)

Time: 90.68 ms

userservices=# ALTER table mail_aliases alter account_id set statistics
1000;
ALTER TABLE
Time: 1.53 ms
userservices=# vacuum analyze verbose mail_aliases;
INFO:  --Relation public.mail_aliases--
INFO:  Index mail_aliases_pkey: Pages 395; Tuples 49756: Deleted 4.
        CPU 0.00s/0.02u sec elapsed 0.01 sec.
INFO:  Index account_id_mail_aliases_key: Pages 334; Tuples 49756:
Deleted 4.
        CPU 0.00s/0.01u sec elapsed 0.01 sec.
INFO:  Index alias_mail_aliases_key: Pages 375; Tuples 49756: Deleted 4.
        CPU 0.00s/0.02u sec elapsed 0.01 sec.
INFO:  Removed 4 tuples in 1 pages.
        CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  Pages 789: Changed 0, Empty 0; Tup 49756: Vac 4, Keep 0, UnUsed
21807.
        Total CPU 0.00s/0.07u sec elapsed 0.06 sec.
INFO:  Analyzing public.mail_aliases
VACUUM
Time: 2935.78 ms

userservices=# explain analyze select count(*) from mail_aliases where
account_id = 13275;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1412.51..1412.51 rows=1 width=0) (actual
time=88.00..88.00 rows=1 loops=1)
   ->  Seq Scan on mail_aliases  (cost=0.00..1410.95 rows=624 width=0)
(actual time=1.87..87.20 rows=624 loops=1)
         Filter: (account_id = 13275)
 Total runtime: 88.11 msec
(4 rows)

Time: 92.31 ms



Re: Planner making poor choices?

От
Tom Lane
Дата:
Mike Benoit <mikeb@netnation.com> writes:
>     Postgres (v7.3.1) doesn't seem to making use of indexes when it clearly
> is the proper choice to make. I've ran in to this problem several times,
> but "alter table statistics" has always solved the problem. However it
> didn't seem to help in this case.

It wouldn't, since the planner seems to be doing a fine job at
estimating the row count already.  I think it may be dropping the ball
on correlation: is this table pretty well clustered by account_id?
It's hard to see how the indexscan could be so cheap if there's not
any clustering, because it would probably have to hit most of the 789
pages in the table in order to retrieve 624 randomly-scattered rows.
It would be useful to look at the number of blocks actually read
(you could investigate that by turning on the statistics collector),
and to see what the correlation value is for account_id in pg_stats.

Another factor is that with such a small table (only about six Mb),
the whole table is probably sitting in kernel disk cache.  I'm not
sure if you really want to optimize the behavior for that case,
but if you do, try lowering random_page_cost.  For an all-in-RAM
scenario, random_page_cost = 1 is the most accurate setting.

            regards, tom lane

Re: Planner making poor choices?

От
Mike Benoit
Дата:
On Fri, 2003-01-10 at 16:48, Tom Lane wrote:
> Mike Benoit <mikeb@netnation.com> writes:
> >     Postgres (v7.3.1) doesn't seem to making use of indexes when it clearly
> > is the proper choice to make. I've ran in to this problem several times,
> > but "alter table statistics" has always solved the problem. However it
> > didn't seem to help in this case.
>
> It wouldn't, since the planner seems to be doing a fine job at
> estimating the row count already.  I think it may be dropping the ball
> on correlation: is this table pretty well clustered by account_id?
> It's hard to see how the indexscan could be so cheap if there's not
> any clustering, because it would probably have to hit most of the 789
> pages in the table in order to retrieve 624 randomly-scattered rows.
> It would be useful to look at the number of blocks actually read
> (you could investigate that by turning on the statistics collector),
> and to see what the correlation value is for account_id in pg_stats.

select * from pg_stats where tablename = 'mail_aliases' and
attname='account_id';

avg_width=4
n_distinct=1833
most_common_vals={13275,21845,11402,5535,27252,16878,54262,4027,55189,38627}
most_common_freqs={0.0156667,0.01,0.009,0.00766667,0.00733333,0.00666667,0.00666667,0.00633333,0.00633333,0.006}
histogram_bounds={302,9225,13797,19183,29119,41098,48174,54048,59123,65082,73292}
correlation=-0.0468637

>
> Another factor is that with such a small table (only about six Mb),
> the whole table is probably sitting in kernel disk cache.  I'm not
> sure if you really want to optimize the behavior for that case,
> but if you do, try lowering random_page_cost.  For an all-in-RAM
> scenario, random_page_cost = 1 is the most accurate setting.
>
>             regards, tom lane

I'll give that a try and see how things change.

Thanks Tom.