Discussion: Wildly inaccurate query plan
I'm using PostgreSQL 9.0 beta 1.  I've got the following table definition:

# \d parts_2576
                                 Table "public.parts_2576"
   Column   |          Type          |                         Modifiers
------------+------------------------+-----------------------------------------------------------
 ID         | bigint                 | not null default nextval('"parts_2576_ID_seq"'::regclass)
 binaryID   | character varying(32)  | not null default ''::character varying
 messageID  | character varying(255) | not null default ''::character varying
 subject    | character varying(512) | not null default ''::character varying
 fromname   | character varying(512) | not null default ''::character varying
 date       | bigint                 | default 0::bigint
 partnumber | bigint                 | not null default 0::bigint
 size       | bigint                 | not null default 0::bigint
Indexes:
    "parts_2576_pkey" PRIMARY KEY, btree ("ID")
    "binaryID_2576_idx" btree ("binaryID")
    "date_2576_idx" btree (date)
    "parts_2576_binaryID_idx" btree ("binaryID")

If I run this:

EXPLAIN ANALYZE
SELECT SUM("size") AS totalsize, "binaryID", COUNT(*) AS parttotal,
       MAX("subject") AS subject, MAX("fromname") AS fromname,
       MIN("date") AS mindate
FROM parts_2576
WHERE "binaryID" > '1082fa89fe499741b8271f9c92136f44'
GROUP BY "binaryID"
ORDER BY "binaryID"
LIMIT 400;

I get this:

 Limit  (cost=0.00..316895.11 rows=400 width=211) (actual time=3.880..1368.936 rows=400 loops=1)
   ->  GroupAggregate  (cost=0.00..41843621.95 rows=52817 width=211) (actual time=3.872..1367.048 rows=400 loops=1)
         ->  Index Scan using "binaryID_2576_idx" on parts_2576  (cost=0.00..41683754.21 rows=10578624 width=211) (actual time=0.284..130.756 rows=19954 loops=1)
               Index Cond: (("binaryID")::text > '1082fa89fe499741b8271f9c92136f44'::text)
 Total runtime: 1370.140 ms

The first thing which strikes me is how the GroupAggregate step shows
it got the 400 rows which matches the limit, but it estimated 52,817
rows.  Shouldn't it have already known it would be 400?

I've got an index on "binaryID" (actually, I appear to have two), but I
suspect it's not being used as effectively as it could be, since the
condition is a range comparison (every value greater than the one
given) rather than a simple equality test.  Is there a way to optimise
this, like using a functional index or something?

Obviously this isn't my design (duplicate indexes and mixed-case column
names?), but I'd like to see if I can get things running faster.

Thanks

Thom
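[A side note on the schema above, independent of the plan question:
"binaryID_2576_idx" and "parts_2576_binaryID_idx" are identical btree
indexes on the same column, so one of them is pure overhead on every
insert and update.  A minimal cleanup sketch, assuming nothing refers
to the dropped index by name:

    -- The two indexes are interchangeable; keep one, drop the other.
    DROP INDEX "parts_2576_binaryID_idx";

    -- Refresh planner statistics on the table afterwards.
    ANALYZE parts_2576;
]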
Thom Brown <thombrown@gmail.com> writes:
> I get this:

> Limit  (cost=0.00..316895.11 rows=400 width=211) (actual
> time=3.880..1368.936 rows=400 loops=1)
>   ->  GroupAggregate  (cost=0.00..41843621.95 rows=52817 width=211)
> (actual time=3.872..1367.048 rows=400 loops=1)
>         ->  Index Scan using "binaryID_2576_idx" on parts_2576
> (cost=0.00..41683754.21 rows=10578624 width=211) (actual
> time=0.284..130.756 rows=19954 loops=1)
>               Index Cond: (("binaryID")::text >
> '1082fa89fe499741b8271f9c92136f44'::text)
> Total runtime: 1370.140 ms

> The first thing which strikes me is how the GroupAggregate step shows
> it got the 400 rows which matches the limit, but it estimated 52,817
> rows.  Shouldn't it have already known it would be 400?

No.  Rowcount estimates are always in terms of what the node would emit
if allowed to run to completion.  Likewise cost.  In this case both the
indexscan and the groupagg are terminated early once they satisfy the
limit.  The planner is expecting this which is why the estimated cost
for the limit node is way less than those for its inputs.

That looks like a perfectly reasonable plan from here, though it would
probably not get chosen with a larger limit or no limit at all, since
the ultimate costs are pretty large.  Essentially this is a fast-start
plan rather than a lowest-total-cost plan, and that looks like the
best bet for a small limit value.

			regards, tom lane
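[To make that concrete with the numbers from the plan — this arithmetic
is implied by the explanation, not quoted from the thread: the Limit
node's cost is its input's total cost scaled by the fraction of the
input's estimated rows the limit is expected to consume.  The simple
scaling works here because the GroupAggregate's startup cost is 0.00:

    -- GroupAggregate total cost * (limit rows / estimated group rows)
    SELECT 41843621.95 * 400 / 52817;
    -- => ~316895.11, exactly the total cost shown on the Limit node
]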
On 28 May 2010 19:54, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thom Brown <thombrown@gmail.com> writes:
>> I get this:
>
>> Limit  (cost=0.00..316895.11 rows=400 width=211) (actual
>> time=3.880..1368.936 rows=400 loops=1)
>>   ->  GroupAggregate  (cost=0.00..41843621.95 rows=52817 width=211)
>> (actual time=3.872..1367.048 rows=400 loops=1)
>>         ->  Index Scan using "binaryID_2576_idx" on parts_2576
>> (cost=0.00..41683754.21 rows=10578624 width=211) (actual
>> time=0.284..130.756 rows=19954 loops=1)
>>               Index Cond: (("binaryID")::text >
>> '1082fa89fe499741b8271f9c92136f44'::text)
>> Total runtime: 1370.140 ms
>
>> The first thing which strikes me is how the GroupAggregate step shows
>> it got the 400 rows which matches the limit, but it estimated 52,817
>> rows.  Shouldn't it have already known it would be 400?
>
> No.  Rowcount estimates are always in terms of what the node would emit
> if allowed to run to completion.  Likewise cost.  In this case both the
> indexscan and the groupagg are terminated early once they satisfy the
> limit.  The planner is expecting this which is why the estimated cost
> for the limit node is way less than those for its inputs.
>
> That looks like a perfectly reasonable plan from here, though it would
> probably not get chosen with a larger limit or no limit at all, since
> the ultimate costs are pretty large.  Essentially this is a fast-start
> plan rather than a lowest-total-cost plan, and that looks like the
> best bet for a small limit value.
>
>			regards, tom lane

You're absolutely right; it's not chosen when there's no limit.  I see
what you mean about terminating once it has enough rows.  It's a shame
I can't optimise it, though, as the real case that runs is with a limit
of 4000, which takes a long time to complete.

Thanks

Thom
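[One way to check Tom's prediction that the fast-start plan loses out
at a larger limit would be to run the real case under EXPLAIN ANALYZE
and compare plans — a sketch, using the limit of 4000 Thom mentions:

    EXPLAIN ANALYZE
    SELECT SUM("size") AS totalsize, "binaryID", COUNT(*) AS parttotal,
           MAX("subject") AS subject, MAX("fromname") AS fromname,
           MIN("date") AS mindate
    FROM parts_2576
    WHERE "binaryID" > '1082fa89fe499741b8271f9c92136f44'
    GROUP BY "binaryID"
    ORDER BY "binaryID"
    LIMIT 4000;

If the planner switches to a sort- or seqscan-based plan at that limit,
that plan and its actual row counts are what would need tuning.]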
Thom Brown wrote:

> It's a shame I can't optimise it, though, as the real case that runs
> is with a limit of 4000, which takes a long time to complete.

Perhaps you should post the real case.

-Kevin