Discussion: force_parallel_mode uniqueness

force_parallel_mode uniqueness

From
"David G. Johnston"
Date:
My take below is that of a user reading our documentation and our projected consistency via that document.

All of the other planner GUCs are basically {on, off, special}, with on or special the default as appropriate for the feature - since most/all features default to enabled.  While I get that the expected usage is to set this to off (which really leaves parallel mode in its default on behavior) and then reduce the parallel workers to zero to disable it, that runs contrary to all of the other switches listed alongside force_parallel_mode.  constraint_exclusion seems like something to be emulated here.

Also, all of the other GEQO options get placed here yet max parallel degree is in an entirely different section.  I'm a bit torn on this point though since it does fit nicely in asynchronous behavior.  I think the next thought finds the happy middle.

If nothing else this option should include a link to max_parallel_degree and vice-versa.  Noting how to disable the feature in this section, if the guc semantics are not changed, would be good too.  That note would likely suffice to establish the linking term to parallel degree.  Something can be devised, even if just a see also, to link back.

David J.

Re: force_parallel_mode uniqueness

From
Robert Haas
Date:
On Sat, May 7, 2016 at 11:42 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> All of the other planner GUCs are basically, {on, off, special} with on or
> special the default as appropriate for the feature - since most/all features
> default to enabled.  While I get that the expected usage is to set this to
> off (which really leaves parallel mode in its default on behavior) and then
> reduce the parallel workers to zero to disable that runs contrary to all of
> the other switches listed alongside force_parallel_mode.
> constraint_exclusion seems like something to be emulated here.

I am not really sure what you are suggesting here.  If you're saying
that you don't like the ordering regress > on > off, because there are
other GUCs where the intermediate values are all between "on" and
"off", then I think that's silly.  We should name and order the
options based on what makes sense, not based on what made sense for
other options.  Note that if you think there are no other GUCs which
have a value greater than "on", see also
synchronous_commit=remote_apply.

> Also, all of the other GEQO options get placed here yet max parallel degree
> is in an entirely different section.

max_parallel_degree has nothing to do with GEQO, so I don't know that
the location of "other" GEQO options has much to do with anything.  It
also has nothing to do with force_parallel_mode, which is what this
email was about until you abruptly switched topics.

> I'm a bit torn on this point though
> since it does fit nicely in asynchronous behavior.  I think the next thought
> finds the happy middle.

We could put max_parallel_degree under "other planner options" rather
than "asynchronous behavior".  However, I wonder what behavior people
will want for parallel operations that are not queries.  For example,
suppose we have parallel CREATE INDEX.  Should the number of workers
for that operation also be controlled by max_parallel_degree?  If yes,
then this shouldn't be a query planner option, because CREATE INDEX is
not a query.

> If nothing else this option should include a link to max_parallel_degree and
> vice-versa.  Noting how to disable the feature in this section, if the guc
> semantics are not changed, would be good too.  That note would likely
> suffice to establish the linking term to parallel degree.  Something can be
> devised, even if just a see also, to link back.

It's probably worth mentioning under force_parallel_mode that it will
have no effect if parallel query is disabled by the
max_parallel_degree setting.  But it is completely unnecessary IMHO
for max_parallel_degree to link to force_parallel_mode.  Most people
should not be using force_parallel_mode; it is there for testing
whether functions are correctly labeled as to parallel safety and
that's it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
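
The interaction described in the message above (force_parallel_mode has no effect when parallel query is disabled through max_parallel_degree) can be sketched as a toy model. This is an illustration only, not PostgreSQL source: the function name parallel_plan_forced and the enum class are hypothetical, and the sketch assumes, as the thread does, that max_parallel_degree = 0 is the value that disables parallel query.

```python
from enum import Enum


class ForceParallelMode(Enum):
    """The three values discussed in the thread (ordering: regress > on > off)."""
    OFF = "off"
    ON = "on"
    REGRESS = "regress"


def parallel_plan_forced(force_parallel_mode: ForceParallelMode,
                         max_parallel_degree: int) -> bool:
    """Toy model: would a parallel plan be forced for a parallel-safe query?

    Assumption: max_parallel_degree <= 0 means parallel query is disabled,
    in which case force_parallel_mode has no effect.
    """
    if max_parallel_degree <= 0:
        return False
    return force_parallel_mode is not ForceParallelMode.OFF


# Print the outcome for every mode, with parallel query disabled and enabled.
for mode in ForceParallelMode:
    for degree in (0, 2):
        print(mode.value, degree, parallel_plan_forced(mode, degree))
```

In this model the forcing switch only matters once the worker limit already permits parallel query, which is why a note under force_parallel_mode pointing at max_parallel_degree is useful while the reverse link is less so.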



Re: force_parallel_mode uniqueness

From
"David G. Johnston"
Date:
On Sun, May 8, 2016 at 10:56 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, May 7, 2016 at 11:42 PM, David G. Johnston
> <david.g.johnston@gmail.com> wrote:
>> All of the other planner GUCs are basically, {on, off, special} with on or
>> special the default as appropriate for the feature - since most/all features
>> default to enabled.  While I get that the expected usage is to set this to
>> off (which really leaves parallel mode in its default on behavior) and then
>> reduce the parallel workers to zero to disable that runs contrary to all of
>> the other switches listed alongside force_parallel_mode.
>> constraint_exclusion seems like something to be emulated here.
>
> I am not really sure what you are suggesting here.  If you're saying
> that you don't like the ordering regress > on > off, because there are
> other GUCs where the intermediate values are all between "on" and
> "off", then I think that's silly.  We should name and order the
> options based on what makes sense, not based on what made sense for
> other options.  Note that if you think there are no other GUCs which
> have a value greater than "on", see also
> synchronous_commit=remote_apply.

I was thinking more along the lines that it should be called:

parallel_mode (enum)

It would default to "on"

"off" would turn it off (instead of having to set parallel_degree to 0)

And it would have additional enum values for:

"always" - basically what on means in the current setup
"regress" - the same as the current regress.

>> Also, all of the other GEQO options get placed here yet max parallel degree
>> is in an entirely different section.
>
> max_parallel_degree has nothing to do with GEQO, so I don't know that
> the location of "other" GEQO options has much to do with anything.  It
> also has nothing to do with force_parallel_mode, which is what this
> email was about until you abruptly switched topics.

I was simply trying to indicate that the various settings that configure geqo are present on this page.  max_parallel_degree is likewise a setting that configures parallel_mode but it isn't on this page.


>> I'm a bit torn on this point though
>> since it does fit nicely in asynchronous behavior.  I think the next thought
>> finds the happy middle.
>
> We could put max_parallel_degree under "other planner options" rather
> than "asynchronous behavior".  However, I wonder what behavior people
> will want for parallel operations that are not queries.  For example,
> suppose we have parallel CREATE INDEX.  Should the number of workers
> for that operation also be controlled by max_parallel_degree?  If yes,
> then this shouldn't be a query planner option, because CREATE INDEX is
> not a query.

Like I said, it isn't clear-cut.  But at the moment it is just for queries - it could be moved if it gets dual purposed as you describe.


>> If nothing else this option should include a link to max_parallel_degree and
>> vice-versa.  Noting how to disable the feature in this section, if the guc
>> semantics are not changed, would be good too.  That note would likely
>> suffice to establish the linking term to parallel degree.  Something can be
>> devised, even if just a see also, to link back.
>
> It's probably worth mentioning under force_parallel_mode that it will
> have no effect if parallel query is disabled by the
> max_parallel_degree setting.  But it is completely unnecessary IMHO
> for max_parallel_degree to link to force_parallel_mode.  Most people
> should not be using force_parallel_mode; it is there for testing
> whether functions are correctly labeled as to parallel safety and
> that's it.

So this particular capability is unique and as such it warrants offering a "force" mode that none of the other planner configuration GUCs on this page have.  It's clear that this is how it was intended, but as a casual reader of the section its uniqueness stood out - and maybe that is for the better.

I guess part of the misunderstanding is simply that you have a lot more plans for this feature than are currently implemented but I am reading the documentation only knowing about those parts that are.

David J.

Re: force_parallel_mode uniqueness

From
Robert Haas
Date:
On Sun, May 8, 2016 at 2:23 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> I was thinking more along the lines that it should be called:
>
> parallel_mode (enum)
>
> It would default to "on"
>
> "off" would turn it off (instead of having to set parallel_degree to 0)
>
> And it would have additional enum values for:
>
> "always" - basically what on means in the current setup
> "regress" - the same as the current regress.

So, right now, most people can totally ignore force_parallel_mode.
Under your proposal, parallel query could be disabled either by
setting parallel_mode=off or by setting max_parallel_degree=0 (or 1,
after we do that renumbering).  That does not seem like a usability
improvement.

>> We could put max_parallel_degree under "other planner options" rather
>> than "asynchronous behavior".  However, I wonder what behavior people
>> will want for parallel operations that are not queries.  For example,
>> suppose we have parallel CREATE INDEX.  Should the number of workers
>> for that operation also be controlled by max_parallel_degree?  If yes,
>> then this shouldn't be a query planner option, because CREATE INDEX is
>> not a query.
>
> Like I said, it isn't clear-cut.  But at the moment it is just for queries -
> it could be moved if it gets dual purposed as you describe.

That's true.  But it could also be left where it is, and then we
wouldn't have to move it back.  I believe that at least some parallel
utility commands are going to arrive in 9.7 - for example, I think
Peter Geoghegan (whom my wife accurately dubbed the Sultan of Sort) is
interested in parallel CREATE INDEX and parallel CLUSTER.  Now I don't
know yet whether max_parallel_degree will affect those things or not,
and if it works out that we never use max_parallel_degree for anything
other than queries, then maybe I'll regret putting it where I did.
But I don't think it makes much sense to move it at this point.  It
isn't a clear improvement, and we've got plenty of things to tinker
with that are.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
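
The usability objection raised in this last message (that the proposed parallel_mode enum would give two separate ways to disable parallel query) can be made concrete with a small sketch. It is a toy model of the proposal as described in the thread, not anything that exists in PostgreSQL: the function name parallel_query_enabled is hypothetical, and it assumes max_parallel_degree = 0 remains the disabling value (the thread notes this might later be renumbered to 1).

```python
from itertools import product

# Enum values from the proposal in the thread; "on" would be the default.
PROPOSED_MODES = ("off", "on", "always", "regress")


def parallel_query_enabled(parallel_mode: str, max_parallel_degree: int) -> bool:
    """Toy model of the proposal: both settings must allow parallel query."""
    if parallel_mode not in PROPOSED_MODES:
        raise ValueError(f"unknown parallel_mode: {parallel_mode!r}")
    return parallel_mode != "off" and max_parallel_degree > 0


# Collect every combination that leaves parallel query disabled.
disabled = [
    (mode, degree)
    for mode, degree in product(PROPOSED_MODES, (0, 2))
    if not parallel_query_enabled(mode, degree)
]
print(disabled)
```

Under this model a user who finds parallel query unexpectedly off has two settings to check instead of one, which is the overlap the message argues against.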