Обсуждение: BUG #4926: too few pathkeys for mergeclauses
The following bug has been logged online: Bug reference: 4926 Logged by: Roman Kononov Email address: kononov@ftml.net PostgreSQL version: 8.4.0 Operating system: Linux x86_64 Description: too few pathkeys for mergeclauses Details: test=# create table junk(i int); CREATE TABLE test=# select * from junk left outer join (select coalesce(i,1) as x, coalesce(i,2) as y from junk) t on coalesce(i,3)=x and coalesce(i,4)=y and coalesce(i,5)=x; ERROR: too few pathkeys for mergeclauses test=# select * from junk left outer join (select coalesce(i,1) as x, coalesce(i,2) as y from junk) t on coalesce(i,3)=x and coalesce(i,4)=y and coalesce(i,5)=y; i | x | y ---+---+--- (0 rows) I think, the first query should be handled the same way second is.
On Thu, Jul 16, 2009 at 9:07 PM, Roman Kononov<kononov@ftml.net> wrote: > > test=3D# create table junk(i int); > CREATE TABLE > test=3D# select * from junk left outer join (select coalesce(i,1) as x, > coalesce(i,2) as y from junk) t on coalesce(i,3)=3Dx and coalesce(i,4)=3D= y and > coalesce(i,5)=3Dx; > ERROR: =A0too few pathkeys for mergeclauses Thanks for the bug report. That's definitely not supposed to be happening. It's always nice when it's easy to reproduce the problem like this. --=20 greg http://mit.edu/~gsstark/resume.pdf
Greg Stark wrote: > On Thu, Jul 16, 2009 at 9:07 PM, Roman Kononov<kononov@ftml.net> wrote: >> test=# create table junk(i int); >> CREATE TABLE >> test=# select * from junk left outer join (select coalesce(i,1) as x, >> coalesce(i,2) as y from junk) t on coalesce(i,3)=x and coalesce(i,4)=y and >> coalesce(i,5)=x; >> ERROR: too few pathkeys for mergeclauses > > Thanks for the bug report. That's definitely not supposed to be > happening. It's always nice when it's easy to reproduce the problem > like this. Yep. This can be further reduced into this: CREATE TABLE a (i integer); CREATE TABLE b (x integer, y integer); select * from a left outer join b on i=x and i=y and i=x; The planner is choosing a merge join, where the outer side (table a) is sorted by (i), and the inner side is sorted by (x, y). But that doesn't work with the merge condition (i=x AND i=y AND i=x). Version 8.3 has the same bug, apparently introduced along with the equivalence classes. In 8.2, the merge condition is reduced into (i=x AND i=y), IOW the planner eliminates the duplicate condition. I believe 8.2 would otherwise have the same problem as well. I can see two different things that you could say is at fault here: 1. We no longer eliminate the duplicate condition, but the find_mergeclauses_for_pathkeys() + make_inner_pathkeys_for_merge() combination relies on there being no duplicates. We should try harder to eliminate duplicates in left join clauses. 2. make_inner_pathkeys_for_merge() should have created sort order (x, y, x) for the inner side. The first solution is what we probably want, to avoid unnecessary work at execution time. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote: > 2. make_inner_pathkeys_for_merge() should have created sort order (x, y, > x) for the inner side. On further thought, that would make no sense. Sort order (x, y) is always equivalent to (x, y, x). -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > Version 8.3 has the same bug, apparently introduced along with the > equivalence classes. In 8.2, the merge condition is reduced into (i=x > AND i=y), IOW the planner eliminates the duplicate condition. I believe > 8.2 would otherwise have the same problem as well. The fact that 8.2 eliminates the redundant condition is more or less accidental, I think. It's using equal() to detect duplicate RestrictInfos coming up from the two input relations for the join, while later versions rely on pointer equality for that. You can fool 8.2 by commuting the duplicate condition, but it still doesn't fail: regression=# explain select * from a left outer join b on i=x and i=y and x=i; QUERY PLAN ----------------------------------------------------------------- Merge Left Join (cost=285.12..325.93 rows=2140 width=12) Merge Cond: ((a.i = b.x) AND (a.i = b.y) AND (a.i = b.x)) -> Sort (cost=149.78..155.13 rows=2140 width=4) Sort Key: a.i -> Seq Scan on a (cost=0.00..31.40 rows=2140 width=4) -> Sort (cost=135.34..140.19 rows=1940 width=8) Sort Key: b.x, b.y -> Seq Scan on b (cost=0.00..29.40 rows=1940 width=8) (8 rows) I think what this case may show is simply that the consistency checking I added to create_mergejoin_plan in 8.3 is too strict. Not quite convinced yet though. Another possible solution for this particular case is to allow the equivclass code to deduce x=y as an equivalence class, that is the plan should enforce that check at the scan of b and then just have one sort key for the merge. Not sure how complicated that is, however, and in any case it may not fix every possible failure case for create_mergejoin_plan. regards, tom lane
I wrote: > I think what this case may show is simply that the consistency > checking I added to create_mergejoin_plan in 8.3 is too strict. > Not quite convinced yet though. After further review I think that is the correct approach to take. The proximate cause of the problem is that find_mergeclauses_for_pathkeys is selecting an order for the merge clauses that corresponds to a noncanonical pathkey list for the inner relation (to wit, x, y, x). While it would be possible in this particular example to put the clauses in x, x, y order instead, I don't think that is necessarily possible in every case. The clause ordering is constrained by the outer pathkeys and what we have here is a demonstration that the inner pathkeys needn't match the outer ones one-to-one. So you could have a clause that references an inner pathkey that is also referenced by some earlier clause that matches a different outer pathkey, and there won't be any way to make them adjacent. By the time the plan gets to create_mergejoin_plan, the inner pathkey list has been reduced to canonical form (x, y), but *this does not represent any actual change in sort order*. (Which is why there's no actual bug in 8.2 and before, which blithely generate plans that involve such "incorrect" mergeclause orderings.) So I think we should just weaken the checks in create_mergejoin_plan to allow such cases, ie, each mergeclause should be allowed to match any already-used inner pathkey. The other approach we could possibly take is to have find_mergeclauses_for_pathkeys reject candidate mergeclauses that produce out-of-order inner pathkeys, but that would break at least this Assert at joinpath.c:272: /* Should have used them all... */ Assert(list_length(cur_mergeclauses) == list_length(mergeclause_list)); and it'd be rather expensive to test for anyway. regards, tom lane
"Roman Kononov" <kononov@ftml.net> writes: > Description: too few pathkeys for mergeclauses I've applied a patch for this. Thanks for the report. regards, tom lane