Re: performance help
От | Tom Lane |
---|---|
Тема | Re: performance help |
Дата | |
Msg-id | 581.964808242@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: performance help (Ernie <ernie.cline@ipgdirect.com>) |
Список | pgsql-general |
Ernie <ernie.cline@ipgdirect.com> writes: > This query is very fast. > > cw=# SELECT distinct n.news_id, headline, link, to_char(created, > 'mm-dd-yyyyhh24:mi'), > cw-# created FROM news_article_summary n, news_cat nc WHERE n.news_id = > nc.news_id AND > cw-# created > CURRENT_TIMESTAMP-30 AND nc.code_id > cw-# > in(14,227,326,321,327,4301,5179,5100,585,5175,567,5185,568,5381,5109,554,5621,5 > 462, > cw(# 597,5324,5117,569,5142,570,5327,571,5167,5481,5145) ORDER BY created desc; > > Unique (cost=60322.14..60559.66 rows=1900 width=48) > -> Sort (cost=60322.14..60322.14 rows=19001 width=48) > -> Nested Loop (cost=0.00..58651.80 rows=19001 width=48) > -> Seq Scan on news_article_summary n (cost=0.00..416.14 > rows=1898 width=36) > -> Index Scan using news_cat_news_id on news_cat nc > (cost=0.00..30.53 rows=10 width=12) > > And here is the query I'm having problems with (slow) > cw=# SELECT distinct n.news_id, headline, link, > cw-# to_char(created, 'mm-dd-yyyy hh24:mi'), > cw-# created FROM news_article_summary n, news_cat nc > cw-# WHERE n.news_id = nc.news_id AND created > CURRENT_TIMESTAMP-30 AND > cw-# nc.code_id in(4261,4182) ORDER BY created desc; > > Unique (cost=35162.58..35181.27 rows=150 width=48) > -> Sort (cost=35162.58..35162.58 rows=1496 width=48) > -> Nested Loop (cost=0.00..35083.71 rows=1496 width=48) > -> Index Scan using news_cat_code_id, news_cat_code_id on > news_cat nc (cost=0.00..55.31 rows=79 width=12) > -> Seq Scan on news_article_summary n (cost=0.00..416.14 > rows=1898 width=36) The difference evidently is which table is scanned as the outside of the nested loop. The first plan says "scan news_article_summary sequentially, and for each row that passes the WHERE clauses that mention only that table, probe into news_cat_news_id for the row(s) that match by news_id; then check the remaining WHERE clauses on the combined row(s)." The second plan says "probe into news_cat_news_id for the row(s) that have the requested code_id values, and for each one scan news_article_summary sequentially to locate matching rows." If there are a lot of matches for nc.code_id in(4261,4182) then the second plan would scan news_article_summary many times, which'd account for it being slow. Are news_id and code_id unique columns? I am guessing from context that at least news_id might be. The planner does not seem to know that, judging from its row-count guesses. Perhaps all you need is a VACUUM ANALYZE so that the planner becomes aware the column is unique. regards, tom lane
В списке pgsql-general по дате отправления: