Обсуждение: Improving planner variable handling
I've been thinking about how to improve the planner's poor handling of variables in outer-join situations. Here are some past examples for motivation: http://archives.postgresql.org/pgsql-hackers/2006-02/msg00154.php http://archives.postgresql.org/pgsql-general/2008-03/msg01440.php The reason why the planner acts so stupidly in these examples is that we're still using a kluge solution for this old bug: http://archives.postgresql.org/pgsql-bugs/2001-04/msg00223.php The root of the problem is that the planner does not worry about computing output expressions until the top of a plan tree. All lower-level join nodes are made to output simple lists of Vars referencing columns of the base relations of the join. We handle outer-join cases by forcing the values of the Vars of the nullable side to null at the level of the join, whenever there's no matching row in the nullable side. If one of the base relations of the join is a sub-SELECT whose output list includes expressions that don't certainly go to null when the input variables are forced to null, then we can't flatten that sub-SELECT, because flattening the sub-SELECT means that the expression evaluations bubble to the top of the plan tree and can produce non-null results when they shouldn't (as happened in the above bug, before we realized that we had to prevent flattening in this case). Another problem with this approach is that depending on what level of the plan tree you are thinking about, a Var nominally referencing tab.col might really mean the value of tab.col, or it might mean "either tab.col or NULL depending on what happened at some lower level of outer join". Since we can't readily tell the difference, we have estimation errors arising from failure to expect some NULLs (there have been recent complaints about this), and we need some pretty ugly kluges in places like EquivalenceClass processing to handle the risk that apparently identical expressions might not really be equal. I think the basic solution for this is that upper levels of the plan tree should refer to the nullable output columns of an outer join using "alias Vars" that name the join rel, not the underlying base relation, even if there is a simple base-relation Var that the alias is tracking. In the case involving a sub-SELECT, the alias Var would stand for whatever output expression appears in the sub-SELECT. We already have the concept of these alias Vars, in fact --- that's exactly the representation emitted by the parser. But historically the planner has smashed aliases down to their base Vars as early as possible (see flatten_join_alias_vars). That has some advantages but I'm thinking it's outweighed by the disadvantages. I'd like to try leaving alias Vars as aliases all the way through the planner, in any case where they might be semantically different from their referent (ie, whenever there's a possible force-to-null involved). To make this work, we'd need to have the constructed plan tree compute the alias Var from its referent expression at the lowest outer-join that can null the alias Var. The trick for the executor is to know when to force the value to null instead of computing the expression. I first thought about marking entries of the join node's targetlist as to be forced to null if left or right input row is null. However, that fails if we want the join node to compute some projection expressions on top of the raw join output (as would certainly happen if it were the top node of the tree, for example). That could be handled by inserting another level of plan node (ie, a Result) to do the projection, but that seems a pretty ugly and inefficient solution. What I have in mind instead is to insert a new kind of expression node "ForceToNull" atop the referent expression, with this node able to look at the EState (in the same way a regular Var node would) to see if it should return a null instead of computing its child expression. Then expansion of an alias Var into a ForceToNull and the underlying expression would work. I'm envisioning keeping track of active alias Vars and their expansions in a new list attached to the PlannerInfo "root" node. This would provide a place to record important information like which level of the join tree such a Var needs to be evaluated at. This is all pretty handwavy yet, but I don't think I'll be able to fill in many more details until I try to code it. I thought I'd put up this summary to see if anyone can shoot holes in it at this level of detail ... Comments? regards, tom lane
I wonder if this would help to clean up the equivalence class hacks in Greg's ordered append patch? Tom Lane wrote: > I've been thinking about how to improve the planner's poor handling of > variables in outer-join situations. Here are some past examples for > motivation: > > http://archives.postgresql.org/pgsql-hackers/2006-02/msg00154.php > http://archives.postgresql.org/pgsql-general/2008-03/msg01440.php > > The reason why the planner acts so stupidly in these examples is that > we're still using a kluge solution for this old bug: > http://archives.postgresql.org/pgsql-bugs/2001-04/msg00223.php > > The root of the problem is that the planner does not worry about computing > output expressions until the top of a plan tree. All lower-level join > nodes are made to output simple lists of Vars referencing columns of the > base relations of the join. We handle outer-join cases by forcing the > values of the Vars of the nullable side to null at the level of the join, > whenever there's no matching row in the nullable side. If one of the base > relations of the join is a sub-SELECT whose output list includes > expressions that don't certainly go to null when the input variables are > forced to null, then we can't flatten that sub-SELECT, because flattening > the sub-SELECT means that the expression evaluations bubble to the top of > the plan tree and can produce non-null results when they shouldn't > (as happened in the above bug, before we realized that we had to prevent > flattening in this case). > > Another problem with this approach is that depending on what level of the > plan tree you are thinking about, a Var nominally referencing tab.col > might really mean the value of tab.col, or it might mean "either tab.col > or NULL depending on what happened at some lower level of outer join". > Since we can't readily tell the difference, we have estimation errors > arising from failure to expect some NULLs (there have been recent > complaints about this), and we need some pretty ugly kluges in places like > EquivalenceClass processing to handle the risk that apparently identical > expressions might not really be equal. > > I think the basic solution for this is that upper levels of the plan tree > should refer to the nullable output columns of an outer join using > "alias Vars" that name the join rel, not the underlying base relation, > even if there is a simple base-relation Var that the alias is tracking. > In the case involving a sub-SELECT, the alias Var would stand for whatever > output expression appears in the sub-SELECT. We already have the concept > of these alias Vars, in fact --- that's exactly the representation emitted > by the parser. But historically the planner has smashed aliases down to > their base Vars as early as possible (see flatten_join_alias_vars). > That has some advantages but I'm thinking it's outweighed by the > disadvantages. I'd like to try leaving alias Vars as aliases all the > way through the planner, in any case where they might be semantically > different from their referent (ie, whenever there's a possible > force-to-null involved). > > To make this work, we'd need to have the constructed plan tree compute the > alias Var from its referent expression at the lowest outer-join that can > null the alias Var. The trick for the executor is to know when to force > the value to null instead of computing the expression. I first thought > about marking entries of the join node's targetlist as to be forced to > null if left or right input row is null. However, that fails if we want > the join node to compute some projection expressions on top of the raw > join output (as would certainly happen if it were the top node of the > tree, for example). That could be handled by inserting another level of > plan node (ie, a Result) to do the projection, but that seems a pretty > ugly and inefficient solution. What I have in mind instead is to insert a > new kind of expression node "ForceToNull" atop the referent expression, > with this node able to look at the EState (in the same way a regular Var > node would) to see if it should return a null instead of computing its > child expression. Then expansion of an alias Var into a ForceToNull and > the underlying expression would work. > > I'm envisioning keeping track of active alias Vars and their expansions in > a new list attached to the PlannerInfo "root" node. This would provide a > place to record important information like which level of the join tree > such a Var needs to be evaluated at. > > This is all pretty handwavy yet, but I don't think I'll be able to fill in > many more details until I try to code it. I thought I'd put up this > summary to see if anyone can shoot holes in it at this level of detail ... > Comments? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki@enterprisedb.com> writes: > I wonder if this would help to clean up the equivalence class hacks in > Greg's ordered append patch? Pretty hard to say at this early stage, but the more I think about it the more I like the idea of never using the same Var to represent two values that aren't guaranteed equal. The maybe-its-null business has been a thorn in the side of any sort of close semantic analysis for a long time, but somehow I never perceived what a bogus idea that was. I'm almost more excited about that than about fixing the original problem ... regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > Heikki Linnakangas <heikki@enterprisedb.com> writes: >> I wonder if this would help to clean up the equivalence class hacks in >> Greg's ordered append patch? > > Pretty hard to say at this early stage, I've started to have some doubts myself about stuffing all these vars into the same equivalence class. My concern here is actually performance. I'm concerned that if you use a partitioned table within a larger query once we've generated the paths for the partitioned table we don't want to then carry that baggage of hundreds or possibly thousands of variables for planning the whole rest of the query above that rel. They'll never be relevant again after that. The more these ideas start fitting together in my head the more I think Tom's original instinct would perform better. I was concerned that pulling up and pushing down pathkey lists sounded like a lot of extra work. But at least you would only have to pay that for that one level, not carry it along and pay for it for the rest of the planning. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB'sPostgreSQL training!
I wrote: > I've been thinking about how to improve the planner's poor handling of > variables in outer-join situations. > ... > I think the basic solution for this is that upper levels of the plan tree > should refer to the nullable output columns of an outer join using > "alias Vars" that name the join rel, not the underlying base relation, After further thought about this, and about the new "ForceToNull" expression node that I was suggesting, I have a more radical proposal in mind: let's get rid of join alias variables, instead of expanding their use. What I'm now envisioning is that the parser would generate Vars that always refer to base relations, never to join nodes; and if the reference appears above an outer join that can null the variable, plaster a ForceToNull node atop the Var. ForceToNull would carry the relid (rangetable index) of the RTE_JOIN RTE for the outer join, so that it expresses exactly which outer join has done the nulling. Actually, rather than just an index of one join RTE, ForceToNull should carry a bitmapset of relids of multiple outer joins, in case we are looking down through multiple outer joins to the base relation. The advantage of doing it that way instead of stacking several ForceToNull nodes is that the representation doesn't change if we change the order of application of the outer joins. In this approach ForceToNull is carried all the way through the parsing/planning process, rather than being inserted on-the-fly in some late stage of the planner. At the last stage of the planner (setrefs.c) we could mark it or modify it in join plan nodes to let the executor know which side of the join needs to be checked to decide whether to null at execution time. This representation has the nice property that two Var-and-optional- ForceToNull expression trees are equal() if and only if they are semantically equivalent --- a property we don't have right now, either before or after smashing join alias vars to base vars (although it's worse without doing that, which is why the planner is doing it currently). So we don't need flatten_join_alias_vars anymore. There is one corner case that doesn't fit into this nicely, which is merged output columns from FULL JOIN USING. Currently we represent those as join alias Vars initially, and expand them into "COALESCE(left-side-var, right-side-var)" during flatten_join_alias_vars. We could keep doing that, since the planner has no great intelligence about FULL JOIN anyway. But I was hoping to get rid of the flatten_join_alias_vars pass altogether. Perhaps it is worth adding a special parsetree representation for these things --- I'm imagining something roughly like ForceToNull but with two inputs not one. I think the only reason we need a special representation at all is so that ruleutils.c can decompile it as a Var reference rather than COALESCE(). This representation makes ruleutils.c's decompilation job harder, since it's no longer clear from inspection of a Var node which RTE entry it should be displayed with reference to. (If the base relation is underneath an aliased JOIN then we *must* reference the JOIN instead, not the base rel.) But it's clearly possible, and I'm happy to push complexity into decompilation if it means savings in the main parse/plan code path. Another nice thing is we won't need to widen AttrNumber; in fact, I don't think we need to permanently store any per-Var data in join RTEs, except maybe for those darn FULL JOIN USING vars. Variables' varattnos only refer to base relations and their range doesn't increase in a join nest. Comments? regards, tom lane
I wrote: > After further thought about this, and about the new "ForceToNull" > expression node that I was suggesting, I have a more radical proposal > in mind: let's get rid of join alias variables, instead of expanding > their use. I've been studying this idea more, and it seems workable and useful, but as always there are a few rough edges. One tricky area is involved when we rearrange outer joins using the third outer-join identity: (A leftjoin B on (Pab)) leftjoin C on (Pbc) is equivalent toA leftjoin (B leftjoin C on (Pbc)) on (Pab) if Pbc is strictfor at least one column of B The problem here is that in form 2, Pbc's references to columns of B would presumably be plain Vars, since there can be no need to force them to null before Pbc is evaluated. But in form 1 there had better be ForceToNull nodes referencing the A/B join. Conversely, if form 1 was what was originally entered, the parser would emit ForceToNull nodes atop the B Vars in Pbc, but these are unnecessary if we implement it as form 2. I'm not too worried about wasting a few cycles to fall through a useless ForceToNull node at runtime, but there's a bigger problem here: if expressions that are really semantically equivalent might or might not contain ForceToNull nodes, that's likely to get in the way of planner optimization activities. One of the things I was hoping to get out of this was that Var-plus-ForceToNull trees would be equal() if and only if semantically equivalent, and that property seems to be slipping away. Another strange thing that's happening here is that after a transformation from form 1 to form 2, Pbc would contain ForceToNull references to a join that's actually above its evaluation point in the tree. We could presumably deal with that by decreeing such a thing to be a no-op, but it seems mighty ugly, as well as being a rule that would prevent detection of erroneous rearrangements. And again we are faced with the realization that apparently equal() trees might not mean the same thing, depending on where in the query you are looking. So I'm feeling a bit dissatisfied and wondering whether there isn't a better way to do this. Any thoughts out there? regards, tom lane
Back in April I posted some musings about improving the planner's handling of variables that could be forced to null by outer joins: http://archives.postgresql.org/pgsql-hackers/2008-04/msg01063.php I haven't gotten anything further done on that, and time grows short for 8.4. I am still interested in the concept of fixing the parsetree representation so that equal() Vars must denote the same value. But it seems we're still shy an idea or two needed to make that happen in a clean fashion --- and if it's not clean, there's not much point, because the main argument for changing that is to make things cleaner. However, the main practical problem that was to be solved was to fix things so that we could flatten sub-selects that have non-nullable targetlist entries, even when they're below an outer join. I think that this could still get done for 8.4, along the following lines: * When we have a non-nullable expression in a sub-select's targetlist, and it's below an outer join, replace the expression byCASE WHEN flag_var THEN original_expression ELSE NULL END and then flatten as normal. * The "flag_var" will be a boolean variable that is always computed as "true" at the semantic level of the original sub-select. It then bubbles up through joins like any other variable. If an upper outer join emits a null-extended row, this variable will be nulled just like any other. Thus, when we get to a point where the sub-select output expression is needed, the flag_var will be TRUE if it's okay to compute the expression, and NULL if we should force it to null. Which is exactly what the above CASE does. * Potentially we need a flag_var for each subselect that gets flattened. I'm inclined to represent these as Vars with varno equal to the sub-select's relid and varattno equal to a "system attribute number" that isn't used anywhere else. Since a sub-select that's been flattened out of the query is not going to be one of the "base rels" to be joined, this will require a little bit of klugery in a couple of places in the planner that assume every Var corresponds to a base rel. It doesn't look too terribly awful though. * It might also be a good idea to use a special node type instead of a general CASE expression for the expression control node. This is not for efficiency but just so that the planner can understand the semantics of the expression a bit better (for instance, such a node could still be considered "strict" whereas CASE in general is not). Thoughts, objections? regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > * When we have a non-nullable expression in a sub-select's targetlist, > and it's below an outer join, replace the expression by > CASE WHEN flag_var THEN original_expression ELSE NULL END > and then flatten as normal. I don't understand how this gets you any further ahead. Doesn't it just move the problems you have now with the original_expression to flag_var instead? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
Gregory Stark <stark@enterprisedb.com> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> * When we have a non-nullable expression in a sub-select's targetlist, >> and it's below an outer join, replace the expression by >> CASE WHEN flag_var THEN original_expression ELSE NULL END >> and then flatten as normal. > I don't understand how this gets you any further ahead. Doesn't it just move > the problems you have now with the original_expression to flag_var instead? No, because the point is that a var will go to null when passed through an outer join that is trying to set it to null. The original_expression might be something that doesn't go to null by itself; for instance a non-null constant (see bug report cited in original message). An alternative approach to solving the problem is to ensure that the expression gets evaluated below the outer join and then pass its value up as a variable. But that requires a lot more planner hacking than I have time to get done for 8.4. The current planner tries to postpone expression evaluation as long as possible, and changing that isn't trivial. (It's not necessarily desirable, either --- pushing an expensive expression down to a place where it'd get evaluated more times isn't a win.) This is definitely an area where more work remains, but I'm trying to solve the largest practical problem in a way that's implementable without too much work. If I end up throwing that work away in some future release, in favor of a more general answer, I won't cry too much. regards, tom lane
Gregory Stark <stark@enterprisedb.com> writes: > Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes: >> * When we have a non-nullable expression in a sub-select's targetlist, >> and it's below an outer join, replace the expression by >> CASE WHEN flag_var THEN original_expression ELSE NULL END >> and then flatten as normal. > I don't understand how this gets you any further ahead. After a lot of thought I'm beginning to come round to your point of view :-(. This just isn't working out. What I'm now considering is a more frontal attack on the problem: invent a type of Var that can be evaluated lower in the plan tree than where its value gets used. The idea would be to replace a non-nullable expression coming out of a flattened sub-select by one of these special Vars, which would be marked so that it gets evaluated (by executing the original expression) just below the lowest outer join that can null the original sub-select. The Var then bubbles up the plan tree normally (and possibly gets set to null by various outer joins). Above the evaluation level we have just ordinary references to the Var, not to the original expression. There might also be references to the sub-select result in join ON clauses below that lowest outer join. I'm inclined to make these just expand to the original expression, rather than trying to combine uses (at least for the first pass; maybe we could get smarter later). Note that the expression won't be volatile --- else we'd refuse to flatten the sub-select in the first place --- so this is just an efficiency not a correctness issue. This mechanism might have other uses later, since it gives us some flexibility about where in the tree an expression gets evaluated. We might be able to re-introduce something like Hellerstein's expensive-function optimization. But for 8.4 I'd be happy if we can fix the problem with not being able to flatten subselects with non-nullable outputs. Thoughts, objections? BTW, can anyone think of a good name for these animals? I've only come up with things like "PseudoVar" or "ExpressionVar", which seem rather content-free ... regards, tom lane