Обсуждение: weird query plan

Поиск

Список

Период

Сортировка

weird query plan

От

weiping

Дата:

06 июня 2007 г., 00:56:30

I have a table:
webdigest=# \d wd_urlusermaps
表 "public.wd_urlusermaps"
字段名 | 类型 | 修饰词
---------+-----------------------------+-------------------------------------------------------------
id | integer | not null default nextval('wd_urlusermaps_id_seq'::regclass)
urlid | integer | not null
tag | character varying(512) |
title | character varying(512) |
summary | character varying(1024) |
comment | character varying(1024) |
ctime | timestamp without time zone |
mtime | timestamp without time zone |
share | smallint |
userid | integer |
import | smallint | default 0
索引:
"wd_urlusermaps_pkey" PRIMARY KEY, btree (id) CLUSTER
"urlusermaps_urlid_userid" UNIQUE, btree (urlid, userid)
"urlusermaps_urlid" btree (urlid)
"urlusermaps_userid" btree (userid)
"wd_urlusermaps_ctime_idx" btree (ctime)
"wd_urlusermaps_share_idx" btree ("share")

and target statistic set to 1000, and two different query plan:

webdigest=# explain analyze select A.id as
fav_id,A.urlid,A.tag,A.title,A.summary,A.comment,A.ctime,A.share from
wd_urlusermaps A where share =1 and A.userid='219177' ORDER BY A.id DESC
limit 20 ;
QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..4932.56 rows=20 width=96) (actual
time=730.461..2374.435 rows=20 loops=1)
-> Index Scan Backward using wd_urlusermaps_pkey on wd_urlusermaps a
(cost=0.00..269810.77 rows=1094 width=96) (actual time=730.456..2374.367
rows=20 loops=1)
Filter: (("share" = 1) AND (userid = 219177))
Total runtime: 2374.513 ms
(4 rows)

webdigest=# explain analyze select A.id as
fav_id,A.urlid,A.tag,A.title,A.summary,A.comment,A.ctime,A.share from
wd_urlusermaps A where share =1 and A.userid='219177' ORDER BY A.id DESC
limit 40 ;
QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=6805.77..6805.87 rows=40 width=96) (actual time=5.731..5.905
rows=40 loops=1)
-> Sort (cost=6805.77..6808.50 rows=1094 width=96) (actual
time=5.726..5.785 rows=40 loops=1)
Sort Key: id
-> Index Scan using urlusermaps_userid on wd_urlusermaps a
(cost=0.00..6750.55 rows=1094 width=96) (actual time=0.544..5.616
rows=41 loops=1)
Index Cond: (userid = 219177)
Filter: ("share" = 1)
Total runtime: 6.013 ms
(7 rows)

the userid=219177 got 2000+ record and around 40 shared=1, why above 2 query
shows so much difference?

any hint would be greatly appreciated.

-laser

Re: weird query plan

От

weiping

Дата:

06 июня 2007 г., 00:59:23

sorry, forgot to mention our version, it's postgresql 8.2.3

-laser
> I have a table:
> webdigest=# \d wd_urlusermaps
> 表 "public.wd_urlusermaps"
> 字段名 | 类型 | 修饰词
> ---------+-----------------------------+-------------------------------------------------------------
> id | integer | not null default nextval('wd_urlusermaps_id_seq'::regclass)
> urlid | integer | not null
> tag | character varying(512) |
> title | character varying(512) |
> summary | character varying(1024) |
> comment | character varying(1024) |
> ctime | timestamp without time zone |
> mtime | timestamp without time zone |
> share | smallint |
> userid | integer |
> import | smallint | default 0
> 索引:
> "wd_urlusermaps_pkey" PRIMARY KEY, btree (id) CLUSTER
> "urlusermaps_urlid_userid" UNIQUE, btree (urlid, userid)
> "urlusermaps_urlid" btree (urlid)
> "urlusermaps_userid" btree (userid)
> "wd_urlusermaps_ctime_idx" btree (ctime)
> "wd_urlusermaps_share_idx" btree ("share")
>
> and target statistic set to 1000, and two different query plan:
>
> webdigest=# explain analyze select A.id as
> fav_id,A.urlid,A.tag,A.title,A.summary,A.comment,A.ctime,A.share from
> wd_urlusermaps A where share =1 and A.userid='219177' ORDER BY A.id DESC
> limit 20 ;
> QUERY PLAN
>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=0.00..4932.56 rows=20 width=96) (actual
> time=730.461..2374.435 rows=20 loops=1)
> -> Index Scan Backward using wd_urlusermaps_pkey on wd_urlusermaps a
> (cost=0.00..269810.77 rows=1094 width=96) (actual time=730.456..2374.367
> rows=20 loops=1)
> Filter: (("share" = 1) AND (userid = 219177))
> Total runtime: 2374.513 ms
> (4 rows)
>
> webdigest=# explain analyze select A.id as
> fav_id,A.urlid,A.tag,A.title,A.summary,A.comment,A.ctime,A.share from
> wd_urlusermaps A where share =1 and A.userid='219177' ORDER BY A.id DESC
> limit 40 ;
> QUERY PLAN
>
---------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=6805.77..6805.87 rows=40 width=96) (actual time=5.731..5.905
> rows=40 loops=1)
> -> Sort (cost=6805.77..6808.50 rows=1094 width=96) (actual
> time=5.726..5.785 rows=40 loops=1)
> Sort Key: id
> -> Index Scan using urlusermaps_userid on wd_urlusermaps a
> (cost=0.00..6750.55 rows=1094 width=96) (actual time=0.544..5.616
> rows=41 loops=1)
> Index Cond: (userid = 219177)
> Filter: ("share" = 1)
> Total runtime: 6.013 ms
> (7 rows)
>
> the userid=219177 got 2000+ record and around 40 shared=1, why above 2 query
> shows so much difference?
>
> any hint would be greatly appreciated.
>
> -laser
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly
>
>
>

different query plan because different limit # (Re: weird query plan)

От

weiping

Дата:

06 июня 2007 г., 04:24:38

I changed the query to :
EXPLAIN ANALYZE select id from wd_urlusermaps where id in (select id
from wd_urlusermaps where share =1 and userid='219177') order by id desc
limit 20;

and it's much better now (from real execute time), but the cost report
higher
then slower one above, may be I should do some tunning on planner
parameter or
is it a planner bug?
QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=16118.83..16118.88 rows=20 width=4) (actual
time=17.539..17.619 rows=20 loops=1)
-> Sort (cost=16118.83..16121.57 rows=1094 width=4) (actual
time=17.534..17.560 rows=20 loops=1)
Sort Key: public.wd_urlusermaps.id
-> Nested Loop (cost=6753.28..16063.61 rows=1094 width=4) (actual
time=16.739..17.439 rows=41 loops=1)
-> HashAggregate (cost=6753.28..6764.22 rows=1094 width=4) (actual
time=16.707..16.786 rows=41 loops=1)
-> Index Scan using urlusermaps_userid on wd_urlusermaps
(cost=0.00..6750.55 rows=1094 width=4) (actual time=1.478..16.563
rows=41 loops=1)
Index Cond: (userid = 219177)
Filter: ("share" = 1)
-> Index Scan using wd_urlusermaps_pkey on wd_urlusermaps
(cost=0.00..8.49 rows=1 width=4) (actual time=0.008..0.010 rows=1 loops=41)
Index Cond: (public.wd_urlusermaps.id = public.wd_urlusermaps.id)
Total runtime: 17.762 ms
(11 rows)

> sorry, forgot to mention our version, it's postgresql 8.2.3
>
> -laser
>
>> I have a table:
>> webdigest=# \d wd_urlusermaps
>> 表 "public.wd_urlusermaps"
>> 字段名 | 类型 | 修饰词
>> ---------+-----------------------------+-------------------------------------------------------------
>> id | integer | not null default nextval('wd_urlusermaps_id_seq'::regclass)
>> urlid | integer | not null
>> tag | character varying(512) |
>> title | character varying(512) |
>> summary | character varying(1024) |
>> comment | character varying(1024) |
>> ctime | timestamp without time zone |
>> mtime | timestamp without time zone |
>> share | smallint |
>> userid | integer |
>> import | smallint | default 0
>> 索引:
>> "wd_urlusermaps_pkey" PRIMARY KEY, btree (id) CLUSTER
>> "urlusermaps_urlid_userid" UNIQUE, btree (urlid, userid)
>> "urlusermaps_urlid" btree (urlid)
>> "urlusermaps_userid" btree (userid)
>> "wd_urlusermaps_ctime_idx" btree (ctime)
>> "wd_urlusermaps_share_idx" btree ("share")
>>
>> and target statistic set to 1000, and two different query plan:
>>
>> webdigest=# explain analyze select A.id as
>> fav_id,A.urlid,A.tag,A.title,A.summary,A.comment,A.ctime,A.share from
>> wd_urlusermaps A where share =1 and A.userid='219177' ORDER BY A.id DESC
>> limit 20 ;
>> QUERY PLAN
>>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> Limit (cost=0.00..4932.56 rows=20 width=96) (actual
>> time=730.461..2374.435 rows=20 loops=1)
>> -> Index Scan Backward using wd_urlusermaps_pkey on wd_urlusermaps a
>> (cost=0.00..269810.77 rows=1094 width=96) (actual time=730.456..2374.367
>> rows=20 loops=1)
>> Filter: (("share" = 1) AND (userid = 219177))
>> Total runtime: 2374.513 ms
>> (4 rows)
>>
>> webdigest=# explain analyze select A.id as
>> fav_id,A.urlid,A.tag,A.title,A.summary,A.comment,A.ctime,A.share from
>> wd_urlusermaps A where share =1 and A.userid='219177' ORDER BY A.id DESC
>> limit 40 ;
>> QUERY PLAN
>>
---------------------------------------------------------------------------------------------------------------------------------------------------------
>> Limit (cost=6805.77..6805.87 rows=40 width=96) (actual time=5.731..5.905
>> rows=40 loops=1)
>> -> Sort (cost=6805.77..6808.50 rows=1094 width=96) (actual
>> time=5.726..5.785 rows=40 loops=1)
>> Sort Key: id
>> -> Index Scan using urlusermaps_userid on wd_urlusermaps a
>> (cost=0.00..6750.55 rows=1094 width=96) (actual time=0.544..5.616
>> rows=41 loops=1)
>> Index Cond: (userid = 219177)
>> Filter: ("share" = 1)
>> Total runtime: 6.013 ms
>> (7 rows)
>>
>> the userid=219177 got 2000+ record and around 40 shared=1, why above 2 query
>> shows so much difference?
>>
>> any hint would be greatly appreciated.
>>
>> -laser
>>
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 1: if posting/reading through Usenet, please send an appropriate
>>        subscribe-nomail command to majordomo@postgresql.org so that your
>>        message can get through to the mailing list cleanly
>>
>>
>>
>>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>
>
>

Re: different query plan because different limit # (Re: weird query plan)

От

weiping

Дата:

06 июня 2007 г., 04:41:06

continue digging shows:
set cpu_tuple_cost to 0.1;
explain analyze select * from wd_urlusermaps where share =1 and
userid='219177' order by id desc limit 20;
SET
时间: 0.256 ms
QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=7063.98..7064.03 rows=20 width=110) (actual
time=6.047..6.130 rows=20 loops=1)
-> Sort (cost=7063.98..7066.71 rows=1094 width=110) (actual
time=6.043..6.070 rows=20 loops=1)
Sort Key: id
-> Index Scan using urlusermaps_userid on wd_urlusermaps
(cost=0.00..7008.76 rows=1094 width=110) (actual time=0.710..5.838
rows=41 loops=1)
Index Cond: (userid = 219177)
Filter: ("share" = 1)
Total runtime: 6.213 ms
(7 rows)

now it's what i need, which means we should increase cpu_tuple_cost for
large
RAM node (we got 16G RAN and the table only serveral hundred M) to avoid
sort
happened too early. is it true?

-laser
> I changed the query to :
> EXPLAIN ANALYZE select id from wd_urlusermaps where id in (select id
> from wd_urlusermaps where share =1 and userid='219177') order by id desc
> limit 20;
>
> and it's much better now (from real execute time), but the cost report
> higher
> then slower one above, may be I should do some tunning on planner
> parameter or
> is it a planner bug?
> QUERY PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=16118.83..16118.88 rows=20 width=4) (actual
> time=17.539..17.619 rows=20 loops=1)
> -> Sort (cost=16118.83..16121.57 rows=1094 width=4) (actual
> time=17.534..17.560 rows=20 loops=1)
> Sort Key: public.wd_urlusermaps.id
> -> Nested Loop (cost=6753.28..16063.61 rows=1094 width=4) (actual
> time=16.739..17.439 rows=41 loops=1)
> -> HashAggregate (cost=6753.28..6764.22 rows=1094 width=4) (actual
> time=16.707..16.786 rows=41 loops=1)
> -> Index Scan using urlusermaps_userid on wd_urlusermaps
> (cost=0.00..6750.55 rows=1094 width=4) (actual time=1.478..16.563
> rows=41 loops=1)
> Index Cond: (userid = 219177)
> Filter: ("share" = 1)
> -> Index Scan using wd_urlusermaps_pkey on wd_urlusermaps
> (cost=0.00..8.49 rows=1 width=4) (actual time=0.008..0.010 rows=1 loops=41)
> Index Cond: (public.wd_urlusermaps.id = public.wd_urlusermaps.id)
> Total runtime: 17.762 ms
> (11 rows)
>
>
>> sorry, forgot to mention our version, it's postgresql 8.2.3
>>
>> -laser
>>
>>
>>> I have a table:
>>> webdigest=# \d wd_urlusermaps
>>> 表 "public.wd_urlusermaps"
>>> 字段名 | 类型 | 修饰词
>>> ---------+-----------------------------+-------------------------------------------------------------
>>> id | integer | not null default nextval('wd_urlusermaps_id_seq'::regclass)
>>> urlid | integer | not null
>>> tag | character varying(512) |
>>> title | character varying(512) |
>>> summary | character varying(1024) |
>>> comment | character varying(1024) |
>>> ctime | timestamp without time zone |
>>> mtime | timestamp without time zone |
>>> share | smallint |
>>> userid | integer |
>>> import | smallint | default 0
>>> 索引:
>>> "wd_urlusermaps_pkey" PRIMARY KEY, btree (id) CLUSTER
>>> "urlusermaps_urlid_userid" UNIQUE, btree (urlid, userid)
>>> "urlusermaps_urlid" btree (urlid)
>>> "urlusermaps_userid" btree (userid)
>>> "wd_urlusermaps_ctime_idx" btree (ctime)
>>> "wd_urlusermaps_share_idx" btree ("share")
>>>
>>> and target statistic set to 1000, and two different query plan:
>>>
>>> webdigest=# explain analyze select A.id as
>>> fav_id,A.urlid,A.tag,A.title,A.summary,A.comment,A.ctime,A.share from
>>> wd_urlusermaps A where share =1 and A.userid='219177' ORDER BY A.id DESC
>>> limit 20 ;
>>> QUERY PLAN
>>>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>> Limit (cost=0.00..4932.56 rows=20 width=96) (actual
>>> time=730.461..2374.435 rows=20 loops=1)
>>> -> Index Scan Backward using wd_urlusermaps_pkey on wd_urlusermaps a
>>> (cost=0.00..269810.77 rows=1094 width=96) (actual time=730.456..2374.367
>>> rows=20 loops=1)
>>> Filter: (("share" = 1) AND (userid = 219177))
>>> Total runtime: 2374.513 ms
>>> (4 rows)
>>>
>>> webdigest=# explain analyze select A.id as
>>> fav_id,A.urlid,A.tag,A.title,A.summary,A.comment,A.ctime,A.share from
>>> wd_urlusermaps A where share =1 and A.userid='219177' ORDER BY A.id DESC
>>> limit 40 ;
>>> QUERY PLAN
>>>
---------------------------------------------------------------------------------------------------------------------------------------------------------
>>> Limit (cost=6805.77..6805.87 rows=40 width=96) (actual time=5.731..5.905
>>> rows=40 loops=1)
>>> -> Sort (cost=6805.77..6808.50 rows=1094 width=96) (actual
>>> time=5.726..5.785 rows=40 loops=1)
>>> Sort Key: id
>>> -> Index Scan using urlusermaps_userid on wd_urlusermaps a
>>> (cost=0.00..6750.55 rows=1094 width=96) (actual time=0.544..5.616
>>> rows=41 loops=1)
>>> Index Cond: (userid = 219177)
>>> Filter: ("share" = 1)
>>> Total runtime: 6.013 ms
>>> (7 rows)
>>>
>>> the userid=219177 got 2000+ record and around 40 shared=1, why above 2 query
>>> shows so much difference?
>>>
>>> any hint would be greatly appreciated.
>>>
>>> -laser
>>>
>>>
>>> ---------------------------(end of broadcast)---------------------------
>>> TIP 1: if posting/reading through Usenet, please send an appropriate
>>>        subscribe-nomail command to majordomo@postgresql.org so that your
>>>        message can get through to the mailing list cleanly
>>>
>>>
>>>
>>>
>>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 5: don't forget to increase your free space map settings
>>
>>
>>
>>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq
>
>
>

Re: different query plan because different limit # (Re: weird query plan)

От

Gregory Stark

Дата:

06 июня 2007 г., 06:22:13

"weiping" <laserlist@pgsqldb.com> writes:

> -> Index Scan using urlusermaps_userid on wd_urlusermaps
>    (cost=0.00..6750.55 rows=1094 width=4) (actual time=1.478..16.563 rows=41 loops=1)
>    Index Cond: (userid = 219177)
>    Filter: ("share" = 1)

It's estimating 1094 rows and getting 41 rows. You might considering raising
the statistics target for that table.

Does it get accurate estimates for the number of rows for each of these?

explain analyze select * from wd_urlusermaps where userid=219177
explain analyze select * from wd_urlusermaps where share=1

(the latter might take a while)

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: weird query plan

От

Tom Lane

Дата:

06 июня 2007 г., 11:40:57

weiping <laserlist@pgsqldb.com> writes:
> -> Index Scan using urlusermaps_userid on wd_urlusermaps a
> (cost=0.00..6750.55 rows=1094 width=96) (actual time=0.544..5.616
> rows=41 loops=1)
> Index Cond: (userid = 219177)
> Filter: ("share" = 1)

> the userid=219177 got 2000+ record and around 40 shared=1, why above 2 query
> shows so much difference?

Probably because the rowcount estimate is so far off (1094 vs 41).

Possibly boosting the statistics target would help.

            regards, tom lane

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: weird query plan

weird query plan

Re: weird query plan

different query plan because different limit # (Re: weird query plan)

Re: different query plan because different limit # (Re: weird query plan)

Re: different query plan because different limit # (Re: weird query plan)

Re: weird query plan