Speeding up Aggregates
От | Dror Matalon |
---|---|
Тема | Speeding up Aggregates |
Дата | |
Msg-id | 20031003202120.GO87525@rlx11.zapatec.com обсуждение исходный текст |
Ответы |
Re: Speeding up Aggregates
(Josh Berkus <josh@agliodbs.com>)
Re: Speeding up Aggregates (Rod Taylor <rbt@rbt.ca>) |
Список | pgsql-performance |
Hi, I have a query that ran quite well initially, but slowed down quite a bit once I introduced an aggregate into the equation. The average execution time went up from around 15 msec to around 300 msec. The original query fetches a bunch of articles: select articlenumber, channel, description, title, link, dtstamp from items, my_channels where items.channel = '22222' and my_channels.id = '22222' and owner = 'drormata' and dtstamp > last_viewed and articlenumber not in (select item from viewed_items where channel ='22222' and owner = 'drormata'); I then added a call to a function: and (dtstamp = item_max_date(22222, link)) item_max_date() looks like this: select max(dtstamp) from items where channel = $1 and link = $2; This should eliminate duplicate articles and only show the most recent one. resulting in the following query select articlenumber, channel, description, title, link, dtstamp from items, my_channels where items.channel = '22222' and my_channels.id = '22222' and owner = 'drormata' and dtstamp > last_viewed and articlenumber not in (select item from viewed_items where channel ='22222' and owner = 'drormata') and (dtstamp = item_max_date(22222, link)); Any suggestions on optimizing the query/function? It makes sense that it slowed down, but I wonder if I can do better. I'm including index list as well as "explain analyze" of both versions. Indexes: "item_channel_link" btree (channel, link) "item_created" btree (dtstamp) "item_signature" btree (signature) "items_channel_article" btree (channel, articlenumber) explain analyze select articlenumber, channel, description, title, link, dtstamp from items, my_channels where items.channel= '22222' and my_channels.id = '22222' and owner = 'drormata' and dtstamp > last_viewed and articlenumber notin (select item from viewed_items where channel ='22222' and owner = 'drormata'); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=8.19..6982.58 rows=302 width=259) (actual time=16.95..17.16 rows=8 loops=1) Join Filter: ("inner".dtstamp > "outer".last_viewed) -> Seq Scan on my_channels (cost=0.00..3.23 rows=1 width=8) (actual time=0.36..0.38 rows=1 loops=1) Filter: ((id = 22222) AND (("owner")::text = 'drormata'::text)) -> Index Scan using items_channel_article on items (cost=8.19..6968.05 rows=904 width=259) (actual time=0.68..13.94rows=899 loops=1) Index Cond: (channel = 22222) Filter: (NOT (hashed subplan)) SubPlan -> Seq Scan on viewed_items (cost=0.00..8.19 rows=2 width=4) (actual time=0.48..0.48 rows=0 loops=1) Filter: ((channel = 22222) AND (("owner")::text = 'drormata'::text)) Total runtime: 17.42 msec (11 rows) explain analyze select articlenumber, channel, description, title, link, dtstamp from items, my_channels where items.channel= '22222' and my_channels.id = '22222' and owner = 'drormata' and dtstamp > last_viewed and articlenumber notin (select item from viewed_items where channel ='22222' and owner = 'drormata') and (dtstamp = item_max_date(22222, link)); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------ Nested Loop (cost=8.19..6980.33 rows=1 width=259) (actual time=262.94..265.14 rows=7 loops=1) Join Filter: ("outer".dtstamp > "inner".last_viewed) -> Index Scan using items_channel_article on items (cost=8.19..6977.08 rows=1 width=259) (actual time=1.94..150.55 rows=683loops=1) Index Cond: (channel = 22222) Filter: ((dtstamp = item_max_date(22222, link)) AND (NOT (hashed subplan))) SubPlan -> Seq Scan on viewed_items (cost=0.00..8.19 rows=2 width=4) (actual time=0.43..0.43 rows=0 loops=1) Filter: ((channel = 22222) AND (("owner")::text = 'drormata'::text)) -> Seq Scan on my_channels (cost=0.00..3.23 rows=1 width=8) (actual time=0.14..0.15 rows=1 loops=683) Filter: ((id = 22222) AND (("owner")::text = 'drormata'::text)) Total runtime: 265.39 msec -- Dror Matalon Zapatec Inc 1700 MLK Way Berkeley, CA 94709 http://www.zapatec.com
В списке pgsql-performance по дате отправления: