Discussion: Why so much time difference with the same query/plan?
Merry Xmas!

I have a query. It sometimes runs OK and sometimes horribly. Here is the result from EXPLAIN ANALYZE:

explain analyze
SELECT module, sum(c1) + sum(c2) + sum(c3) + sum(c4) + sum(c5) AS "count"
FROM xxx
WHERE created >= ('now'::timestamptz - '1 day'::interval)
  AND customer_id='158'
  AND domain='xyz.com'
GROUP BY module;

There is an index:
Indexes: xxx_idx btree (customer_id, created, "domain")

The table is regularly VACUUM FULLed and reindexed, and it has 3 million rows.

                                                          QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=139.53..141.65 rows=12 width=30) (actual time=17623.65..17623.65 rows=0 loops=1)
   ->  Group  (cost=139.53..140.14 rows=121 width=30) (actual time=17623.64..17623.64 rows=0 loops=1)
         ->  Sort  (cost=139.53..139.83 rows=121 width=30) (actual time=17623.63..17623.63 rows=0 loops=1)
               Sort Key: module
               ->  Index Scan using xxx_idx on xxx  (cost=0.00..135.33 rows=121 width=30) (actual time=17622.95..17622.95 rows=0 loops=1)
                     Index Cond: ((customer_id = 158) AND (created >= '2004-12-02 11:26:22.596656-05'::timestamp with time zone) AND ("domain" = 'xyz.com'::character varying))
 Total runtime: 17624.05 msec
(7 rows)

                                                          QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=142.05..144.21 rows=12 width=30) (actual time=1314931.09..1314931.09 rows=0 loops=1)
   ->  Group  (cost=142.05..142.66 rows=124 width=30) (actual time=1314931.08..1314931.08 rows=0 loops=1)
         ->  Sort  (cost=142.05..142.36 rows=124 width=30) (actual time=1314931.08..1314931.08 rows=0 loops=1)
               Sort Key: module
               ->  Index Scan using xxx_idx on xxx  (cost=0.00..137.74 rows=124 width=30) (actual time=1314930.72..1314930.72 rows=0 loops=1)
                     Index Cond: ((customer_id = 158) AND (created >= '2004-12-01 15:21:51.785526-05'::timestamp with time zone) AND ("domain" = 'xyz.com'::character varying))
 Total runtime: 1314933.16 msec
(7 rows)

What can I try?

Thanks,
Does the order of columns in the index matter, since more than 50% of the rows have customer_id = 158?

I think it does not in Oracle.

Will the performance be better if I change index xxx_idx to ("domain", customer_id, created)? I will test it myself when possible.

Thanks,

--- Litao Wu <litaowu@yahoo.com> wrote:
> Merry Xmas!
>
> I have a query. It sometimes runs OK and sometimes
> horribly. Here is the result from EXPLAIN ANALYZE:
>
> [query and EXPLAIN ANALYZE output snipped]
>
> What can I try?
>
> Thanks,
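One way to run that test side by side, sketched from the query in the thread (the index name xxx_idx2 is made up for the example):

```sql
-- Build the reordered index alongside the existing one, refresh
-- statistics, then compare EXPLAIN ANALYZE output for the same query.
CREATE INDEX xxx_idx2 ON xxx ("domain", customer_id, created);
ANALYZE xxx;

EXPLAIN ANALYZE
SELECT module,
       sum(c1) + sum(c2) + sum(c3) + sum(c4) + sum(c5) AS "count"
FROM xxx
WHERE created >= ('now'::timestamptz - '1 day'::interval)
  AND customer_id = '158'
  AND "domain" = 'xyz.com'
GROUP BY module;

-- If the planner prefers xxx_idx2 and timings improve, the old
-- index can be dropped:
-- DROP INDEX xxx_idx;
```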
Hi,

On Wed, Dec 22, 2004 at 01:52:40PM -0800, Litao Wu wrote:
> Does the order of columns in the index matter since
> more than 50% customer_id = 158?
>
> I think it does not in Oracle.
>
> Will the performance be better if I change index
> xxx_idx to ("domain", customer_id, created)?

Well, in Oracle this would of course matter. Oracle decides whether it can use an index by matching the index's attributes from left to right against the query. If a leading attribute is missing from the query's WHERE clause, Oracle will not use the index, regardless of whether only that one attribute is missing or all of the others are too. Normally you would create the index with the most frequently parametrized attributes first, then the next most frequent, and so on. If the attributes are used about equally often, put the most selective attribute first, followed by the next most selective, and so on.

Regards,
Yann
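To illustrate that left-to-right rule with the thread's index (customer_id, created, "domain") — hypothetical queries, not from the original post:

```sql
-- Leading attribute constrained: the index is usable.
SELECT * FROM xxx WHERE customer_id = 158;

-- Leading prefix (customer_id, created) constrained: also usable;
-- the scan starts at customer_id and narrows by the created range.
SELECT * FROM xxx
WHERE customer_id = 158
  AND created >= now() - interval '1 day';

-- Leading attribute missing: under the left-to-right rule described
-- above, the index cannot be used for this query (absent a skip scan).
SELECT * FROM xxx WHERE "domain" = 'xyz.com';
```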
Yann Michel <yann-postgresql@spline.de> writes:
> On Wed, Dec 22, 2004 at 01:52:40PM -0800, Litao Wu wrote:
>> Does the order of columns in the index matter since
>> more than 50% customer_id = 158?
>>
>> Will the performance be better if I change index
>> xxx_idx to ("domain", customer_id, created)?
>
> Well, in Oracle this would of course matter. Oracle decides whether
> it can use an index by matching the index's attributes from left to
> right against the query. [...]

This depends on the version of Oracle you're using. Oracle 9i introduced Index Skip Scans:

http://www.oracle.com/technology//products/oracle9i/daily/apr22.html

I don't know whether pg has something similar?
On Sun, Dec 26, 2004 at 13:30:15 +0100, Karl Vogel <karl.vogel@telenet.be> wrote:
> This depends on the version of Oracle you're using. Oracle 9i
> introduced Index Skip Scans:
>
> http://www.oracle.com/technology//products/oracle9i/daily/apr22.html
>
> I don't know whether pg has something similar?

Postgres doesn't currently do this. There was some discussion about it not too long ago, but I don't think anyone indicated that they were going to work on it for 8.1. Postgres can use the leading part of a multikey index to start a scan, but it will just do a normal index scan with a filter.
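A hypothetical illustration of that last point, using the thread's index (customer_id, created, "domain"):

```sql
-- Only the leading column and the third column are constrained.
EXPLAIN
SELECT * FROM xxx
WHERE customer_id = 158
  AND "domain" = 'xyz.com';
-- Postgres can start the index scan at customer_id = 158, but with no
-- condition on created it cannot skip ahead within that range; every
-- index entry for that customer is visited and the "domain" condition
-- is simply checked against each one as a filter.
```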