Обсуждение: Planner issue
I get the following output from explain analyze on a certain subset of
a large query I'm doing.
From the looks of it, I need to increase how often postgres uses an
index over a seq scan, but I'm not sure how to do that. I looked
through the run-time configuration docs on the website, but didn't see
anything pertaining to index selectivity.
Thanks,
Alex Turner
netEconomist
trendmls=# explain analyze select listnum from propmain where
listprice<=300000 and listprice>=220000;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Seq Scan on propmain (cost=0.00..15556.05 rows=6228 width=4) (actual
time=0.093..506.730 rows=5671 loops=1)
Filter: ((listprice <= 300000::numeric) AND (listprice >= 220000::numeric))
Total runtime: 510.482 ms
(3 rows)
trendmls=# explain analyze select listnum from propmain where
listprice<=300000 and listprice>=250000;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using propmain_listprice_i on propmain
(cost=0.00..12578.65 rows=3486 width=4) (actual time=0.103..16.418
rows=3440 loops=1)
Index Cond: ((listprice <= 300000::numeric) AND (listprice >=
250000::numeric))
Total runtime: 18.528 ms
(3 rows)
Alex Turner wrote: >I get the following output from explain analyze on a certain subset of >a large query I'm doing. > > > Try increases the statistics on the listprice column with alter table and then re-run analyze. alter table foo alter column set statistics <n> Sincerely, Joshua D. Drake >From the looks of it, I need to increase how often postgres uses an >index over a seq scan, but I'm not sure how to do that. I looked >through the run-time configuration docs on the website, but didn't see >anything pertaining to index selectivity. > >Thanks, > >Alex Turner >netEconomist > > >trendmls=# explain analyze select listnum from propmain where >listprice<=300000 and listprice>=220000; > QUERY PLAN >-------------------------------------------------------------------------------------------------------------- > Seq Scan on propmain (cost=0.00..15556.05 rows=6228 width=4) (actual >time=0.093..506.730 rows=5671 loops=1) > Filter: ((listprice <= 300000::numeric) AND (listprice >= 220000::numeric)) > Total runtime: 510.482 ms >(3 rows) > >trendmls=# explain analyze select listnum from propmain where >listprice<=300000 and listprice>=250000; > QUERY PLAN >------------------------------------------------------------------------------------------------------------------------------------------ > Index Scan using propmain_listprice_i on propmain >(cost=0.00..12578.65 rows=3486 width=4) (actual time=0.103..16.418 >rows=3440 loops=1) > Index Cond: ((listprice <= 300000::numeric) AND (listprice >= >250000::numeric)) > Total runtime: 18.528 ms >(3 rows) > >---------------------------(end of broadcast)--------------------------- >TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > > -- Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC Postgresql support, programming shared hosting and dedicated hosting. +1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com PostgreSQL Replicator -- production quality replication for PostgreSQL
Вложения
This helps a bit when I set it to 1000 - but it's still pretty bad:
I will use an index 220-300, but not 200-300.
Alex
trendmls=# explain analyze select listnum from propmain where
listprice<=300000 and listprice>=200000;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Seq Scan on propmain (cost=0.00..15517.56 rows=6842 width=4) (actual
time=0.039..239.760 rows=6847 loops=1)
Filter: ((listprice <= 300000::numeric) AND (listprice >= 200000::numeric))
Total runtime: 244.301 ms
(3 rows)
trendmls=# set enable_seqscan=off;
SET
trendmls=# explain analyze select listnum from propmain where
listprice<=300000 and listprice>=200000;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using propmain_listprice_i on propmain
(cost=0.00..22395.95 rows=6842 width=4) (actual time=0.084..25.751
rows=6847 loops=1)
Index Cond: ((listprice <= 300000::numeric) AND (listprice >=
200000::numeric))
Total runtime: 30.193 ms
(3 rows)
trendmls=#
On Tue, 22 Mar 2005 08:22:59 -0800, Joshua D. Drake
<jd@commandprompt.com> wrote:
> Alex Turner wrote:
>
> >I get the following output from explain analyze on a certain subset of
> >a large query I'm doing.
> >
> >
> >
> Try increases the statistics on the listprice column with alter
> table and then re-run analyze.
>
> alter table foo alter column set statistics <n>
>
> Sincerely,
>
> Joshua D. Drake
>
>
> >>From the looks of it, I need to increase how often postgres uses an
> >index over a seq scan, but I'm not sure how to do that. I looked
> >through the run-time configuration docs on the website, but didn't see
> >anything pertaining to index selectivity.
> >
> >Thanks,
> >
> >Alex Turner
> >netEconomist
> >
> >
> >trendmls=# explain analyze select listnum from propmain where
> >listprice<=300000 and listprice>=220000;
> > QUERY PLAN
> >--------------------------------------------------------------------------------------------------------------
> > Seq Scan on propmain (cost=0.00..15556.05 rows=6228 width=4) (actual
> >time=0.093..506.730 rows=5671 loops=1)
> > Filter: ((listprice <= 300000::numeric) AND (listprice >= 220000::numeric))
> > Total runtime: 510.482 ms
> >(3 rows)
> >
> >trendmls=# explain analyze select listnum from propmain where
> >listprice<=300000 and listprice>=250000;
> > QUERY PLAN
>
>------------------------------------------------------------------------------------------------------------------------------------------
> > Index Scan using propmain_listprice_i on propmain
> >(cost=0.00..12578.65 rows=3486 width=4) (actual time=0.103..16.418
> >rows=3440 loops=1)
> > Index Cond: ((listprice <= 300000::numeric) AND (listprice >=
> >250000::numeric))
> > Total runtime: 18.528 ms
> >(3 rows)
> >
> >---------------------------(end of broadcast)---------------------------
> >TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> >
> >
>
> --
> Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
> Postgresql support, programming shared hosting and dedicated hosting.
> +1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
> PostgreSQL Replicator -- production quality replication for PostgreSQL
>
>
>
On Tue, 2005-03-22 at 14:36 -0500, Alex Turner wrote: > I will use an index 220-300, but not 200-300. > ... > Seq Scan on propmain (cost=0.00..15517.56 rows=6842 width=4) (actual > time=0.039..239.760 rows=6847 loops=1) > ... > Index Scan using propmain_listprice_i on propmain > (cost=0.00..22395.95 rows=6842 width=4) (actual time=0.084..25.751 > rows=6847 loops=1) the rows estimates are accurate, so it is not a question of statistics anymore. first make sure effective_cache_size is correctly set, and then if that is not enough, you might try to lower random_page_cost a bit gnari
I'm guessing your data is actually more "clustered" than the "correlation" stastic thinks it is. Alex Turner wrote: > trendmls=# explain analyze select listnum from propmain where > listprice<=300000 and listprice>=200000; Is that a database of properties like land/houses? If your table is clustered geographically (by zip code, etc), the index scan might do quite well because all houses in a neighborhood may have similar prices (and therefore live on just a few disk pages). However since high-priced neighborhoods are scattered across the country, the optimizer would see a very low "correlation" and not notice this clustering. If this is the cause, one thing you could do is CLUSTER your table on propmain_listprice_i. I'm quite confident it'll fix this particular query - but might slow down other queries.
I'm guessing your data is actually more "clustered" than the "correlation" statistic thinks it is. Alex Turner wrote: > trendmls=# explain analyze select listnum from propmain where > listprice<=300000 and listprice>=200000; Is that a database of properties like land/houses? If your table is clustered geographically (by zip code, etc), the index scan might do quite well because all houses in a neighborhood may have similar prices (and therefore live on just a few disk pages). However since high-priced neighborhoods are scattered across the country, the optimizer would see a very low "correlation" and not notice this clustering. If this is the cause, one thing you could do is CLUSTER your table on propmain_listprice_i. I'm quite confident it'll fix this particular query - but might slow down other queries.