Discussion: max_parallel_workers question
The current docs for max_parallel_workers start out:

"Sets the maximum number of workers that the system can support for
parallel operations..."

In my interpretation, "the system" means the entire cluster, but the
max_parallel_workers setting is PGC_USERSET. That's a bit confusing,
because two different backends can have different settings for "the
maximum number ... the system can support".

max_parallel_workers is compared against the total number of parallel
workers in the system, which appears to be why the docs are worded that
way. But it's still confusing to me.

If the purpose is to make sure parallel queries don't take up all of the
worker processes, perhaps we should rename the setting to
reserved_worker_processes, and make it PGC_SUSET.

If the purpose is to control execution within a backend, perhaps we
should just compare it to the count of parallel processes that the
backend is already using.

If the purpose is just to be a more flexible version of
max_worker_processes, maybe we should change it to PGC_SIGHUP?

If it has multiple purposes, perhaps we should have multiple GUCs?

Regards,
	Jeff Davis
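[Editor's note: a minimal sketch, in Python rather than PostgreSQL's C, of the confusion described above: a per-backend (PGC_USERSET) limit is compared against a single cluster-wide counter, so two backends can disagree about whether "the system" is at its limit. All names here are invented for illustration.]

```python
# One cluster-wide counter of running parallel workers, shared by all backends.
active_parallel_workers = 6

def can_launch_worker(backend_max_parallel_workers):
    # Each backend compares ITS OWN max_parallel_workers setting against
    # the SHARED count -- so with PGC_USERSET, "the maximum number the
    # system can support" differs from backend to backend.
    return active_parallel_workers < backend_max_parallel_workers

print(can_launch_worker(8))  # True  -- backend A set max_parallel_workers = 8
print(can_launch_worker(4))  # False -- backend B set max_parallel_workers = 4
```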
On Fri, Sep 27, 2019 at 8:07 PM Jeff Davis <pgsql@j-davis.com> wrote:
> The current docs for max_parallel_workers start out:
>
> "Sets the maximum number of workers that the system can support for
> parallel operations..."
>
> In my interpretation, "the system" means the entire cluster, but the
> max_parallel_workers setting is PGC_USERSET. That's a bit confusing,
> because two different backends can have different settings for "the
> maximum number ... the system can support".

Oops.

I intended it to mean "the entire cluster." Basically, how many workers
out of max_worker_processes are you willing to use for parallel query,
as opposed to other things. I agree that PGC_USERSET doesn't make any
sense.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, 2019-09-28 at 00:10 -0400, Robert Haas wrote:
> I intended it to mean "the entire cluster." Basically, how many
> workers out of max_worker_processes are you willing to use for
> parallel query, as opposed to other things. I agree that PGC_USERSET
> doesn't make any sense.

In that case, PGC_SIGHUP seems most appropriate.

It also might make more sense to rename it to reserved_worker_processes
and invert the meaning. To me, that would make it clearer that it's
designed to prevent parallel query from interfering with other uses of
worker processes.

Another option would be to make it two pools, one for parallel workers
and one for everything else, each controlled by a PGC_POSTMASTER
setting. But it seems like some thought went into trying to share the
pool of workers[1], so I assume there was a good reason you wanted to do
that.

Regards,
	Jeff Davis

[1] If I'm reading correctly, it uses both lock-free code and
intentional overflow.
On Sat, Sep 28, 2019 at 1:36 PM Jeff Davis <pgsql@j-davis.com> wrote:
> In that case, PGC_SIGHUP seems most appropriate.

Yeah.

> It also might make more sense to rename it to reserved_worker_processes
> and invert the meaning. To me, that would be more clear that it's
> designed to prevent parallel query from interfering with other uses of
> worker processes.

I don't think that would work as well. Some day we might have another
class of worker processes with its own independent limit, and then this
terminology would get confusing. It makes sense to say that you can have
up to 10 worker processes, of which at most 4 can be used for parallel
query and at most 3 can be used for logical replication; but it doesn't
make nearly as much sense to say that you can have up to 10 worker
processes, of which 6 can't be used for parallel query and 7 can't be
used for logical replication. That leaves, uh, how many?

> Another option would be to make it two pools, one for parallel workers
> and one for everything else, and each one would be controlled by a
> PGC_POSTMASTER setting. But it seems like some thought went into trying
> to share the pool of workers[1], so I assume there was a good reason
> you wanted to do that.

Here again, I imagine that in the future we might have various different
worker classes that need to share the total number of workers, but not
necessarily via a hard partition. For example, you could sensibly say
that there are 3 purposes for workers and 10 workers, and no single
purpose can consume more than 4 workers. Even though 4 * 3 > 10, it's a
completely reasonable configuration. The early bird gets the juiciest
worm, and the late bird doesn't starve to death. Even a more extreme
configuration where you limit each purpose to, say, 7 workers could be
reasonable. Here there is a risk of starvation, but you may know that in
your environment it's not likely to last for very long.
-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
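[Editor's note: the shared-pool scheme described above — per-class caps over one shared total, where the caps may sum to more than the pool, first come first served — can be sketched as follows. This is an illustrative Python model, not PostgreSQL source; all names are invented.]

```python
# A shared pool of worker slots (analogous to max_worker_processes) where
# each worker class has its own cap (analogous to max_parallel_workers),
# but the caps may add up to more than the pool size.  Slots go to
# whichever class asks first; nothing is hard-partitioned.

class WorkerPool:
    def __init__(self, total, class_caps):
        self.total = total            # shared pool size
        self.class_caps = class_caps  # per-class limits, may sum to > total
        self.in_use = {cls: 0 for cls in class_caps}

    def try_start(self, cls):
        # Refuse if either the per-class cap or the shared total is exhausted.
        if self.in_use[cls] >= self.class_caps[cls]:
            return False
        if sum(self.in_use.values()) >= self.total:
            return False
        self.in_use[cls] += 1
        return True

    def release(self, cls):
        self.in_use[cls] -= 1

# 3 purposes, 10 workers, no purpose may take more than 4: 4 * 3 > 10,
# yet the configuration is perfectly coherent.
pool = WorkerPool(10, {"parallel": 4, "logical_rep": 4, "other": 4})
started = [pool.try_start("parallel") for _ in range(5)]
print(started)  # [True, True, True, True, False] -- per-class cap of 4 hit
```

(The real implementation tracks the parallel-worker count with lock-free shared-memory counters rather than a Python dict, but the admission arithmetic is the same idea.)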
On Sat, Sep 28, 2019 at 12:10:53AM -0400, Robert Haas wrote:
> On Fri, Sep 27, 2019 at 8:07 PM Jeff Davis <pgsql@j-davis.com> wrote:
> > The current docs for max_parallel_workers start out:
> >
> > "Sets the maximum number of workers that the system can support for
> > parallel operations..."
> >
> > In my interpretation, "the system" means the entire cluster, but the
> > max_parallel_workers setting is PGC_USERSET. That's a bit confusing,
> > because two different backends can have different settings for "the
> > maximum number ... the system can support".
> 
> Oops.
> 
> I intended it to mean "the entire cluster." Basically, how many
> workers out of max_worker_processes are you willing to use for
> parallel query, as opposed to other things. I agree that PGC_USERSET
> doesn't make any sense.

I found two places where "cluster" was better than "system", so I
applied the attached patch to master.

-- 
Bruce Momjian  <bruce@momjian.us>        https://momjian.us
EDB                                      https://enterprisedb.com

Only you can decide what is important to you.