Обсуждение: Strange query plan
Tom, I have a problem with slow query execution (postgresql 7.1.2): There are 2 tables - idx, msg_prt: bug=# \dt List of relations Name | Type | Owner ---------+-------+--------idx | table | megeramsg_prt | table | megera (2 rows) bug=# \d idx Table "idx"Attribute | Type | Modifier -----------+---------+----------tid | integer |lid | integer |did | integer | Index: idxidx bug=# \d msg_prt Table "msg_prt"Attribute | Type | Modifier -----------+---------+----------tid | integer | Index: mprt_tid Also there are 2 indexes - idxidx, mprt_tid bug=# \d idxidx Index "idxidx"Attribute | Type -----------+---------lid | integerdid | integertid | integer unique btree bug=# \d mprt_tid Index "mprt_tid"Attribute | Type -----------+---------tid | integer unique btree Query is: select msg_prt.tid as mid from msg_prt where exists (select idx.tid from idx where msg_prt.tid=idx.tid and idx.did=1 and idx.lid in (1207,59587) ) Plan for this query looks very ineffective and query is very slow: select msg_prt.tid as mid from msg_prtwhere exists (select idx.tid from idx where msg_prt.tid=idx.tid and idx.did=1and idx.lid in (1207,59587) ) NOTICE: QUERY PLAN: Seq Scan on msg_prt (cost=0.00..119090807.13 rows=69505 width=4) SubPlan -> Index Scan using idxidx, idxidx on idx (cost=0.00..1713.40rows=1 width=4) total: 6.80 sec; number: 1; for one: 6.796 sec; Statistics on tables: idx - 103651 rows msg_prt - 69505 rows There are only 16 rows in 'idx' table satisfied subselect condition. I did vacuum analyze. Adding another index 'create index tididx on idx (tid);' helps: select msg_prt.tid as mid from msg_prtwhere exists (select idx.tid from idx where msg_prt.tid=idx.tid and idx.did=1and idx.lid in (1207,59587) ) NOTICE: QUERY PLAN: Seq Scan on msg_prt (cost=0.00..1134474.94 rows=69505 width=4) SubPlan -> Index Scan using tididx on idx (cost=0.00..16.31rows=1 width=4) total: 1.71 sec; number: 1; for one: 1.711 sec; but still plan looks ineffective. The best plan I've got eliminating IN predicate: select msg_prt.tid as mid from msg_prtwhere exists (select idx.tid from idx where msg_prt.tid=idx.tid and idx.did=1and idx.lid = 1207 and idx.lid=59587 ) NOTICE: QUERY PLAN: Seq Scan on msg_prt (cost=0.00..167368.47 rows=69505 width=4) SubPlan -> Index Scan using idxidx on idx (cost=0.00..2.39rows=1 width=4) total: 0.54 sec; number: 1; for one: 0.541 sec; Unfortunately I can't use this way in general case. Does it's a known problem ? data+schema is available from http://www.sai.msu.su/~megera/postgres/data/bug.dump.gz It's about 500Kb ! Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov <oleg@sai.msu.su> writes:
> should be
> select msg_prt.tid as mid from msg_prt
> where exists (select idx.tid from idx where msg_prt.tid=idx.tid
> and idx.did=1 and ( idx.lid = 1207 or idx.lid=59587 ));
> but this is not a big win.
Shouldn't be any win at all: the IN expression-list notation will get
translated to exactly that form.
> Anyway, what's about original query ?
IN/EXISTS subqueries suck. This has been true for a long time and is
going to be true for a while longer, unless someone else fixes it before
I have a chance to look at it. See if you can't rewrite your query as
a plain join.
regards, tom lane
On Tue, 5 Jun 2001, Tom Lane wrote: > Oleg Bartunov <oleg@sai.msu.su> writes: > > should be > > select msg_prt.tid as mid from msg_prt > > where exists (select idx.tid from idx where msg_prt.tid=idx.tid > > and idx.did=1 and ( idx.lid = 1207 or idx.lid=59587 )); > > but this is not a big win. > > Shouldn't be any win at all: the IN expression-list notation will get > translated to exactly that form. Sure > > > Anyway, what's about original query ? > > IN/EXISTS subqueries suck. This has been true for a long time and is > going to be true for a while longer, unless someone else fixes it before > I have a chance to look at it. See if you can't rewrite your query as > a plain join. > That's why we've moved to GiST and feel fine :-) That query we used in our old project, now swithing to GiST. > regards, tom lane > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov <oleg@sai.msu.su> writes:
> select msg_prt.tid as mid from msg_prt
> where exists (select idx.tid from idx where msg_prt.tid=idx.tid
> and idx.did=1 and idx.lid in (1207,59587) )
> NOTICE: QUERY PLAN:
> Seq Scan on msg_prt (cost=0.00..119090807.13 rows=69505 width=4)
> SubPlan
> -> Index Scan using idxidx, idxidx on idx (cost=0.00..1713.40 rows=1 width=4)
Actually, this example does reveal an unnecessary inefficiency: the
planner is only using the "idx.lid in (1207,59587)" clause for the
indexscan, ignoring the fact that the did and tid clauses match the
additional columns of your three-column index. The attached patch
should improve matters.
regards, tom lane
*** src/backend/optimizer/path/indxpath.c.orig Sun May 20 16:28:18 2001
--- src/backend/optimizer/path/indxpath.c Tue Jun 5 12:38:21 2001
***************
*** 397,403 **** clause, false); }
! /* * Given an OR subclause that has previously been determined to match * the specified index, extract a list of
specificopclauses that can be * used as indexquals.
--- 397,403 ---- clause, false); }
! /*---------- * Given an OR subclause that has previously been determined to match * the specified index, extract a
listof specific opclauses that can be * used as indexquals.
***************
*** 406,415 **** * given opclause. However, if the OR subclause is an AND, we have to * scan it to find the
opclause(s)that match the index. (There should * be at least one, if match_or_subclause_to_indexkey succeeded, but
there
! * could be more.) Also, we apply expand_indexqual_conditions() to convert
! * any special matching opclauses to indexable operators. * * The passed-in clause is not changed. */ List *
extract_or_indexqual_conditions(RelOptInfo*rel,
--- 406,430 ---- * given opclause. However, if the OR subclause is an AND, we have to * scan it to find the
opclause(s)that match the index. (There should * be at least one, if match_or_subclause_to_indexkey succeeded, but
there
! * could be more.)
! *
! * Also, we can look at other restriction clauses of the rel to discover
! * additional candidate indexquals: for example, consider
! * ... where (a = 11 or a = 12) and b = 42;
! * If we are dealing with an index on (a,b) then we can include the clause
! * b = 42 in the indexqual list generated for each of the OR subclauses.
! * Essentially, we are making an index-specific transformation from CNF to
! * DNF. (NOTE: when we do this, we end up with a slightly inefficient plan
! * because create_indexscan_plan is not very bright about figuring out which
! * restriction clauses are implied by the generated indexqual condition.
! * Currently we'll end up rechecking both the OR clause and the transferred
! * restriction clause as qpquals. FIXME someday.)
! *
! * Also, we apply expand_indexqual_conditions() to convert any special
! * matching opclauses to indexable operators. * * The passed-in clause is not changed.
+ *---------- */ List * extract_or_indexqual_conditions(RelOptInfo *rel,
***************
*** 417,470 **** Expr *orsubclause) { List *quals = NIL;
! if (and_clause((Node *) orsubclause)) {
! /*
! * Extract relevant sub-subclauses in indexkey order. This is
! * just like group_clauses_by_indexkey() except that the input and
! * output are lists of bare clauses, not of RestrictInfo nodes.
! */
! int *indexkeys = index->indexkeys;
! Oid *classes = index->classlist;
! do {
! int curIndxKey = indexkeys[0];
! Oid curClass = classes[0];
! List *clausegroup = NIL;
! List *item;
! foreach(item, orsubclause->args) { if (match_clause_to_indexkey(rel, index,
curIndxKey, curClass,
! lfirst(item), false))
! clausegroup = lappend(clausegroup, lfirst(item)); }
! /*
! * If no clauses match this key, we're done; we don't want to
! * look at keys to its right.
! */
! if (clausegroup == NIL)
! break;
!
! quals = nconc(quals, clausegroup);
!
! indexkeys++;
! classes++;
! } while (!DoneMatchingIndexKeys(indexkeys, index));
!
! if (quals == NIL)
! elog(ERROR, "extract_or_indexqual_conditions: no matching clause");
! }
! else
! {
! /* we assume the caller passed a valid indexable qual */
! quals = makeList1(orsubclause);
! } return expand_indexqual_conditions(quals); }
--- 432,503 ---- Expr *orsubclause) { List *quals = NIL;
+ int *indexkeys = index->indexkeys;
+ Oid *classes = index->classlist;
! /*
! * Extract relevant indexclauses in indexkey order. This is essentially
! * just like group_clauses_by_indexkey() except that the input and
! * output are lists of bare clauses, not of RestrictInfo nodes.
! */
! do {
+ int curIndxKey = indexkeys[0];
+ Oid curClass = classes[0];
+ List *clausegroup = NIL;
+ List *item;
! if (and_clause((Node *) orsubclause))
! {
! foreach(item, orsubclause->args)
! {
! Expr *subsubclause = (Expr *) lfirst(item);
! if (match_clause_to_indexkey(rel, index,
! curIndxKey, curClass,
! subsubclause, false))
! clausegroup = lappend(clausegroup, subsubclause);
! }
! }
! else if (match_clause_to_indexkey(rel, index,
! curIndxKey, curClass,
! orsubclause, false)) {
! clausegroup = makeList1(orsubclause);
! }
! /*
! * If we found no clauses for this indexkey in the OR subclause
! * itself, try looking in the rel's top-level restriction list.
! */
! if (clausegroup == NIL)
! {
! foreach(item, rel->baserestrictinfo) {
+ RestrictInfo *rinfo = (RestrictInfo *) lfirst(item);
+ if (match_clause_to_indexkey(rel, index, curIndxKey,
curClass,
! rinfo->clause, false))
! clausegroup = lappend(clausegroup, rinfo->clause); }
+ }
! /*
! * If still no clauses match this key, we're done; we don't want to
! * look at keys to its right.
! */
! if (clausegroup == NIL)
! break;
!
! quals = nconc(quals, clausegroup);
!
! indexkeys++;
! classes++;
! } while (!DoneMatchingIndexKeys(indexkeys, index));
!
! if (quals == NIL)
! elog(ERROR, "extract_or_indexqual_conditions: no matching clause"); return
expand_indexqual_conditions(quals);}
On Tue, 5 Jun 2001, Tom Lane wrote: > Oleg Bartunov <oleg@sai.msu.su> writes: > > select msg_prt.tid as mid from msg_prt > > where exists (select idx.tid from idx where msg_prt.tid=idx.tid > > and idx.did=1 and idx.lid in (1207,59587) ) > > NOTICE: QUERY PLAN: > > > Seq Scan on msg_prt (cost=0.00..119090807.13 rows=69505 width=4) > > SubPlan > > -> Index Scan using idxidx, idxidx on idx (cost=0.00..1713.40 rows=1 width=4) > > Actually, this example does reveal an unnecessary inefficiency: the > planner is only using the "idx.lid in (1207,59587)" clause for the > indexscan, ignoring the fact that the did and tid clauses match the > additional columns of your three-column index. The attached patch > should improve matters. > > regards, tom lane Cool. Looks better select msg_prt.tid as mid from msg_prt where exists (select idx.tid from idx where msg_prt.tid=idx.tid and idx.did=1 andidx.lid in ( 1207, 59587) ) NOTICE: QUERY PLAN: Seq Scan on msg_prt (cost=0.00..333700.88 rows=69505 width=4) SubPlan -> Index Scan using idxidx, idxidx on idx (cost=0.00..4.79rows=1 width=4) total: 3.15 sec; number: 1; for one: 3.153 sec; interesting that droping index 'idxidx' and creating simple create index tididx on idx (tid); behaves better, while plan looks worse. Notice, index on tididx estimates cost better (16). select msg_prt.tid as mid from msg_prt where exists (select idx.tid from idx where msg_prt.tid=idx.tid and idx.did=1 andidx.lid in ( 1207, 59587) ) NOTICE: QUERY PLAN: Seq Scan on msg_prt (cost=0.00..1134474.94 rows=69505 width=4) SubPlan -> Index Scan using tididx on idx (cost=0.00..16.31rows=1 width=4) total: 1.70 sec; number: 1; for one: 1.703 sec; Interesting that earlier if I have 2 indexes - idxidx and tididx optimizer choose tididx, while now (after patching) optimizer always choose idxidx. Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83