Обсуждение: [HACKERS] propose to pushdown qual into EXCEPT
Recently, we find PG fails to generate an effective plan for following SQL:
select * from (select * from table1 execpt select * from table2) as foo where foo.a > 0;
Because PG does not pushdown qual to the none of the subquery. And I check the source code, find some comments in src/backend/optimizer/path/allpaths.c, which says "If the subquery contains EXCEPT or EXCEPT ALL set ops we cannot push quals into it, because that could change the results".
However, for this case, I think we can pushdown qual to the left most subquery of EXCEPT, just like other database does. And we can get an more effective plan such as:
postgres=# explain select * from (select * from table1 except select * from table2) as foo where foo.a > 0;
QUERY PLAN
----------------------------------------------------------------------------------------
Subquery Scan on foo (cost=0.00..118.27 rows=222 width=8)
-> HashSetOp Except (cost=0.00..116.05 rows=222 width=12)
-> Append (cost=0.00..100.98 rows=3013 width=12)
-> Subquery Scan on "*SELECT* 1" (cost=0.00..45.78 rows=753 width=12)
-> Seq Scan on table1 (cost=0.00..38.25 rows=753 width=8)
Filter: (a > 0)
-> Subquery Scan on "*SELECT* 2" (cost=0.00..55.20 rows=2260 width=12)
-> Seq Scan on table2 (cost=0.00..32.60 rows=2260 width=8)
(8 rows)
And the attached patch is a draft, it works for this case.
------------------
Jerry Yu
https://github.com/scarbrofair
Вложения
"Armor" <yupengstone@qq.com> writes: > Because PG does not pushdown qual to the none of the subquery. And I check the source code, find some comments in src/backend/optimizer/path/allpaths.c,which says "If the subquery contains EXCEPT or EXCEPT ALL set ops we cannot pushquals into it, because that could change the results". > However, for this case, I think we can pushdown qual to the left most subquery of EXCEPT, just like other databasedoes. That is not an adequate argument for such a change being okay. Postgres, with its extensible set of datatypes, has to be much more careful about the semantic soundness of optimizations than some other DBs do. The existing behavior here dates to commit 0201dac1c319599a, which was inspired by this thread: https://www.postgresql.org/message-id/flat/46C15C39FEB2C44BA555E356FBCD6FA48879A6%40m0114.s-mxs.net (BTW, several of the concrete examples discussed in that thread no longer apply because of later changes. But the problem still exists.) We had convinced ourselves that pushing down a qual into UNION and INTERSECT cases is okay even if the qual can distinguish rows that the setop comparisons see as equal, because you would get results consistent with the setop having chosen some legitimate set of representative row(s) for each group of "duplicate" rows. However that did not seem to apply to EXCEPT, or at least I wasn't convinced enough to risk it. I'm still not, without a worked-out argument as to why it's okay. In particular I'd like to see a proof that establishes (a) why it is or is not okay to push into the right-hand side of EXCEPT, and (b) whether ALL does or does not make a difference. Now you could argue that we threw all these fine points out the window with respect to window functions in commit d222585a9f7a18f2, so maybe it's okay to do it with respect to EXCEPT as well. But that would lead to deciding it's okay to push into both sides of EXCEPT, which is still not what this patch does. Anyway I'm not very pleased by the idea that we'd hold EXCEPT to a weaker semantic standard than UNION and INTERSECT. regards, tom lane
Tom Lane wrote: > "Armor" <yupengstone@qq.com> writes: > > Because PG does not pushdown qual to the none of the subquery. And I check the source code, find some comments insrc/backend/optimizer/path/allpaths.c, which says "If the subquery contains EXCEPT or EXCEPT ALL set ops we cannotpush quals into it, because that could change the results". > > However, for this case, I think we can pushdown qual to the left most subquery of EXCEPT, just like other databasedoes. > > That is not an adequate argument for such a change being okay. Postgres, > with its extensible set of datatypes, has to be much more careful about > the semantic soundness of optimizations than some other DBs do. Can we use the record_image_ops mechanism here? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Tom Lane wrote: >> That is not an adequate argument for such a change being okay. Postgres, >> with its extensible set of datatypes, has to be much more careful about >> the semantic soundness of optimizations than some other DBs do. > Can we use the record_image_ops mechanism here? Don't see how that helps? But I went through the logic and think I convinced myself that pushdown to LHS only is okay, and no more questionable than what happens for INTERSECT. Argue thus: Assume a group of non-distinct-according-to-the-setop rows includes m rows from left-hand side, n rows from right-hand side, of which the upper qual would pass m' <= m and n' <= n rows respectively. UNION: unoptimized setop produces 1 row (since we may assume the group is nonempty). That row might or might not pass the upper qual, so we get 0 or 1 row out. User can be certain about what will happen only if m'=m and n'=n (row must pass qual) or m'=n'=0 (row must not pass qual). After optimizing by pushing qual down: we get one row if m' > 0 or n' > 0. This is equivalent to the unoptimized behavior if we assume that the setop always magically selects a representative row passing the qual if possible, so OK. INTERSECT: unoptimized setop produces 1 row if m>0 and n>0, which again might or might not pass the upper qual, so 0 or 1 row out. Optimized case: we get 1 row if m' > 0 and n' > 0. This is equivalent to unoptimized behavior if you allow that the representative row could have been taken from either input. (For example, if m' = m, n > 0, and n' = 0, producing 0 rows is legit only if the hypothetical representative row was taken from RHS.) That's legal per SQL, though I'm unsure offhand if our implementation actually would do that. INTERSECT ALL: unoptimized setop produces min(m,n) rows, which might or might not pass the upper qual, so anywhere from 0 to min(m,n) rows out. (It is probably possible to set a tighter upper bound when m', n' are small, but I've not bothered to reason that out.) Optimized case: we get min(m', n') rows out, which seems fine; again it could be produced by the unoptimized implementation selecting representative row(s) that happen to have that many members passing the upper qual. But actually, our implementation doesn't produce some random collection of min(m,n) elements of the row group. What you get is min(m,n) copies of a single representative row; therefore, assuming the upper qual is nonvolatile, either they all pass or none do, so the actual output is either 0 or min(m,n) duplicates. (The spec says "Let R be a row that is a duplicate of some row in ET1 or of some row in ET2 or both. Let m be the number of duplicates of R in ET1 and let n be the number of duplicates of R in ET2, where m >= 0 and n >= 0. [For INTERSECT ALL] the number of duplicates of R that T contains is the minimum of m and n." So depending on what you want to assume that "duplicates of R" means, you could argue that the spec actually requires our implementation's behavior here. It certainly doesn't appear to forbid it.) The optimized output is therefore distinguishable from unoptimized, because it could produce a number of rows that the unoptimized implementation cannot. But nobody's complained about it, and it's hard to see there being a legitimate beef. EXCEPT: unoptimized setop produces 1 row if m>0 and n=0, which again might or might not pass the upper qual, so 0 or 1 row out. If we were to push qual into both sides: we'd get one row if m' > 0 and n' = 0. So if the qual passes some of LHS and none of RHS, we'd get a row, breaking SQL semantics. If we push into LHS only: we get one row if m' > 0 and n = 0. Certainly then m > 0. So this is equivalent to behavior where the setop magically selects a representative LHS row passing the qual. OTOH, if m' = 0, then the setop wouldn't produce a row, but if it had that row would have failed the upper qual, so behavior is still equivalent. EXCEPT ALL: unoptimized setop produces max(m-n,0) rows, which might or might not pass the upper qual, so anywhere from 0 to max(m-n,0) rows out. (Again it is probably possible to set a tighter upper bound when m', n' are small.) Again, pushing into RHS would allow n to be reduced, perhaps allowing more rows to be returned than SQL would allow, so we can't do that. If we push into LHS only: we get max(m'-n,0) rows. Again, it would be possible for the unoptimized setop to select its representative rows in such a way that this happens (it just has to prefer LHS rows passing the qual over those that don't). Again, this doesn't quite match reality of the current setop implementation, since it returns max(m-n,0) identical rows, not that many random members of the current group. But it's still hard to believe anyone would have a legitimate beef about it. tl;dr: I now think what the patch proposes to do is legit. There's a heck of a lot of work to be done on the comments it's falsified, though. regards, tom lane
I wrote: > tl;dr: I now think what the patch proposes to do is legit. There's a heck > of a lot of work to be done on the comments it's falsified, though. Hmm, wait a minute. I mentioned before that what nodeSetOp.c actually returns is N copies of the same representative tuple. That in itself doesn't break the proposed optimization, or at least so I argued --- but the real question is which representative tuple does it pick? The answer, as noted in the file header comment, is * Note that SetOp does no qual checking nor projection. The delivered* output tuples are just copies of the first-to-arrivetuple in each* input group. In HASHED mode, the first-to-arrive tuple must be from the lefthand input, which would mean that it's passed the pushed-down qual, so all is well. (If no LHS tuples exist in a given group, then EXCEPT won't output anything, so the fact that it could have collected a representative tuple from the RHS doesn't matter.) However, in SORTED mode, I don't see that there's anything particularly guaranteeing the order in which tuples arrive within a sort group. If the sort isn't stable, and I don't think all our sorting paths are, it would be possible to return an RHS tuple as the representative one. This breaks the proposed optimization because it would become possible to return a tuple that doesn't pass the pushed-down qual at all. There are at least two ways this could be dealt with. We could add the flag column as a low-order sort column so that it's still guaranteed that LHS tuples arrive before RHS ones within a group. (This'd complicate matters in generate_nonunion_path because now the sort keys would be different from the setop node's own grouping keys, but it's certainly possible.) Or we could fix it at runtime by complicating setop_retrieve_direct so that it replaces the representative tuple with the first LHS tuple when that arrives. Either way, though, more work is needed than just hacking the qual pushdown logic. regards, tom lane