Обсуждение: Proposal: relaxing link between explicit JOINs and execution order
There's been some recent discussion about the fact that Postgres treats explicit JOIN syntax as constraining the actual join plan, cf http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/explicit-joins.html This behavior was originally in there simply because of lack of time to consider alternatives. I now realize that it wouldn't be hard to get the planner to do better --- basically, preprocess_jointree just has to be willing to fold JoinExpr-under-JoinExpr into a FromExpr when the joins are inner joins. But in the meantime, some folks have found the present behavior to be a feature rather than a bug, since it lets them control planning time on many-table queries. If we are going to change it, I think we need some way to accommodate both camps. What I've been toying with is inventing a GUC variable or two. I am thinking of defining a variable that sets the maximum size of a FromExpr that preprocess_jointree is allowed to create by folding JoinExprs. If this were set to 2, the behavior would be the same as before: no collapsing of JoinExprs can occur. If it were set to a large number, inner JOIN syntax would never affect the planner at all. In practice it'd be smart to leave it at some value less than GEQO_THRESHOLD, so that folding a large number of JOINs wouldn't leave you with a query that takes a long time to plan or produces unpredictable plans. There is already a need for a GUC variable to control the existing behavior of preprocess_jointree: right now, it arbitrarily uses GEQO_THRESHOLD/2 as the limit for the size of a FromExpr that can be made by collapsing FromExprs together. This ought to be a separately settable parameter, I think. Comments? In particular, can anyone think of pithy names for these variables? The best I'd been able to come up with is MAX_JOIN_COLLAPSE and MAX_FROM_COLLAPSE, but neither of these exactly sing... regards, tom lane
Re: [PERFORM] Proposal: relaxing link between explicit JOINs and execution order
От
Sean Chittenden
Дата:
> There's been some recent discussion about the fact that Postgres > treats explicit JOIN syntax as constraining the actual join plan, cf > http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/explicit-joins.html > > This behavior was originally in there simply because of lack of time > to consider alternatives. I now realize that it wouldn't be hard to > get the planner to do better --- basically, preprocess_jointree just > has to be willing to fold JoinExpr-under-JoinExpr into a FromExpr > when the joins are inner joins. > > But in the meantime, some folks have found the present behavior to be > a feature rather than a bug, since it lets them control planning time > on many-table queries. If we are going to change it, I think we need > some way to accommodate both camps. [snip] > Comments? In particular, can anyone think of pithy names for these > variables? The best I'd been able to come up with is > MAX_JOIN_COLLAPSE and MAX_FROM_COLLAPSE, but neither of these > exactly sing... How about something that's runtime tunable via a SET/SHOW config var? There are some queries that I have that I haven't spent any time tuning and would love to have the planner spend its CPU thinking about it instead of mine. Setting it to 2 by default, then on my tuned queries, setting to something obscenely high so the planner won't muck with what I know is fastest (or so I think at least). I know this is a can of worms, but what about piggy backing on an Oracle notation and having an inline way of setting this inside of a comment? SELECT /* +planner:collapse_tables=12 */ .... ? ^^^^^^^ ^^^^^^^^^^^^^^^ ^^^ system variable value ::shrug:: In brainstorm mode. Anyway, a few names: auto_order_join auto_order_join_max auto_reorder_table_limit auto_collapse_join auto_collapse_num_join auto_join_threshold When I'm thinking about what this variable will do for me as a DBA, I think it will make the plan more intelligent by reordering the joins. My $0.02. -sc -- Sean Chittenden
Sean Chittenden <sean@chittenden.org> writes: > How about something that's runtime tunable via a SET/SHOW config var? Er, that's what I was talking about. > I know this is a can of worms, but what about piggy backing on an > Oracle notation and having an inline way of setting this inside of a > comment? I don't want to go there ... regards, tom lane
Re: [PERFORM] Proposal: relaxing link between explicit JOINs and execution order
От
Josh Berkus
Дата:
Tom, I am very strongly in favor of this idea. I would personally prefer it if the Join collapsing parmeter could be set at query time through a SET statement, but will of course defer to the difficulty level in doing so. > Comments? In particular, can anyone think of pithy names for these > variables? The best I'd been able to come up with is MAX_JOIN_COLLAPSE > and MAX_FROM_COLLAPSE, but neither of these exactly sing... How about: EXPLICIT_JOIN_MINIMUM and FROM_COLLAPSE_LIMIT Just to make the two params not sound so identical? -- -Josh Berkus Aglio Database Solutions San Francisco
Josh Berkus <josh@agliodbs.com> writes: > I am very strongly in favor of this idea. I would personally prefer it if > the Join collapsing parmeter could be set at query time through a SET > statement, but will of course defer to the difficulty level in doing so. I guess I failed to make it clear that that's what I meant. GUC variables are those things that you can set via SET, or in the postgresql.conf file, etc. These values would be just as manipulable as, say, ENABLE_SEQSCAN. > How about: > EXPLICIT_JOIN_MINIMUM > and > FROM_COLLAPSE_LIMIT > Just to make the two params not sound so identical? Hmm. The two parameters would have closely related functions, so I'd sort of think that the names *should* be pretty similar. regards, tom lane
Josh Berkus <josh@agliodbs.com> writes: > How about: > EXPLICIT_JOIN_MINIMUM > and > FROM_COLLAPSE_LIMIT I've implemented this using FROM_COLLAPSE_LIMIT and JOIN_COLLAPSE_LIMIT as the variable names. It'd be easy enough to change if someone comes up with better names. You can read updated documentation at http://developer.postgresql.org/docs/postgres/explicit-joins.html regards, tom lane