Обсуждение: background processes vs. hot standby
CheckRequiredParameterValues() has some code that, when hot standby is
in use, checks the values of max_connections,
max_prepared_transactions, and max_locks_per_transaction against the
master.   The comment says "we must have at least as many backend
slots as the primary" ... but the code no longer enforces that,
because we now compute MaxBackends like this:
       MaxBackends = MaxConnections + autovacuum_max_workers + 1 +               GetNumShmemAttachedBgworkers();
If GetNumShmemAttachedBgworkers() returns a lower value on the standby
than it did on the master, then we might well have fewer backend slots
on the standby.  I'm having trouble remembering why it's a problem to
have fewer backend slots on the standby than the master, but if we
need to prevent that then this code is no longer adequate to the task.
The comment doesn't explain why we check max_locks_per_transaction.  I
thought the reason for that check was that we needed to ensure that
there were at least as many lock table slots on the standby as there
were on the master, to prevent bad things from happening later.  That
was already not true, since the existing code didn't enforce any
limitation on autovacuum_max_workers on the standby side.  Maybe that
doesn't matter, since autovacuum workers can't run in hot standby
mode; not sure.  But the addition of background workers to MaxBackends
provides another way for that to be not true.  Here's how we compute
the size of the lock table:
#define NLOCKENTS() \       mul_size(max_locks_per_xact, add_size(MaxBackends, max_prepared_xacts))
Thoughts?
-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
			
		Hi, On 2013-05-24 09:48:03 -0400, Robert Haas wrote: > CheckRequiredParameterValues() has some code that, when hot standby is > in use, checks the values of max_connections, > max_prepared_transactions, and max_locks_per_transaction against the > master. The comment says "we must have at least as many backend > slots as the primary" ... but the code no longer enforces that, > because we now compute MaxBackends like this: > > MaxBackends = MaxConnections + autovacuum_max_workers + 1 + > GetNumShmemAttachedBgworkers(); > > If GetNumShmemAttachedBgworkers() returns a lower value on the standby > than it did on the master, then we might well have fewer backend slots > on the standby. I'm having trouble remembering why it's a problem to > have fewer backend slots on the standby than the master, but if we > need to prevent that then this code is no longer adequate to the task. It's afair important because we need to allocate shared memory which can keep track of the maximum number of xids (toplevel * max_non_suboverflowed_subxids) in progress. That's the KnownAssignedXids* stuff in procarray.c. > The comment doesn't explain why we check max_locks_per_transaction. I > thought the reason for that check was that we needed to ensure that > there were at least as many lock table slots on the standby as there > were on the master, to prevent bad things from happening later. That > was already not true, since the existing code didn't enforce any > limitation on autovacuum_max_workers on the standby side. Maybe that > doesn't matter, since autovacuum workers can't run in hot standby > mode; not sure. But the addition of background workers to MaxBackends > provides another way for that to be not true. Here's how we compute > the size of the lock table: Yea, we need it exactly for that reason. I think its unlikely to cause actual problems since we only ship access exclusive locks to the standby and its hard to see scenarios where we have that many AEL on the primary. But we probably should fix it anyway. I think fixing the autovacuum_max_workers case is entirely reasonable and relatively unlikely to cause problems. I don't think we can easily do it in a minor release though since I don't see a way to transport knowledge about it via the WAL without breaking either the WAL format entirely or change the meaning of MaxConnections in ControlFile which would cause problems with upgrading the primary first. I am less excited about doing something similar for the background worker case. Requiring just as many background workers on the standby sounds like a bad idea to me, there seem to be too many cases where that doesn't seem to make sense. I wonder if we shouldn't make background workers use connections slots from max_connections similar to how superuser_reserved_connections work. That would mean we don't need to care about it for HS. Greetings, Andres Freund
Andres Freund escribió: > I wonder if we shouldn't make background workers use connections slots > from max_connections similar to how superuser_reserved_connections > work. That would mean we don't need to care about it for HS. I remember considering this and concluding that it's messy. Suppose we decide that the registered bgworker number would be subtracted from max_connections: if the configuration registers as many bgworkers as max_connections, then no client connections can take place; if there are more bgworkers than max_connections, there's going to be errors at startup because the last few bgworkers cannot start at all (and no client connections will be allowed). So users would be forced to change max_connections according to bgworkers configuration. That doesn't sound friendly. If, instead of subtracting bgworkers from max_connections, we were to add the number of bgworkers to max_connections, then we're no better than currently, because the number of bgworkers from the standby would be different from those in the master, and we'd be back into the problem of how to ensure that the allowed number of locks meets the restriction. If you wanted to have more bgworkers in the master than the standby, you'd have to advise users to increase max_connections in the standby to fulfill the restriction. (I currently have no proposal on how to go about solving this problem.) -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Fri, May 24, 2013 at 11:25 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Andres Freund escribió: >> I wonder if we shouldn't make background workers use connections slots >> from max_connections similar to how superuser_reserved_connections >> work. That would mean we don't need to care about it for HS. > > I remember considering this and concluding that it's messy. Suppose we > decide that the registered bgworker number would be subtracted from > max_connections: if the configuration registers as many bgworkers as > max_connections, then no client connections can take place; if there are > more bgworkers than max_connections, there's going to be errors at > startup because the last few bgworkers cannot start at all (and no > client connections will be allowed). So users would be forced to change > max_connections according to bgworkers configuration. That doesn't > sound friendly. I agree. To put that more succinctly, if we take that approach, then max_connections is no longer the maximum number of connections, which is a POLA violation. > (I currently have no proposal on how to go about solving this problem.) If the problem were only with the size of the lock table, I'd be somewhat inclined to propose ripping out max_locks_per_transaction and putting in a GUC called max_locks instead. The current system seems more confusing than helpful; when the default proves insufficient, the recourse is usually to figure out how many objects we need to lock, and then divide by max_connections to figure out how to set max_locks_per_transaction, which is really backing into the problem from the wrong end. But I don't know what to do about the problem of needing to know how many backends there are. I agree with Andres that it's not very friendly to enforce a restriction that all the same worker processes must be present on the standby. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, May 24, 2013 at 12:35 PM, Robert Haas <robertmhaas@gmail.com> wrote: > But I don't know what to do about the problem of needing to know how > many backends there are. I agree with Andres that it's not very > friendly to enforce a restriction that all the same worker processes > must be present on the standby. OK, I have an idea for fixing this. This would be 9.4 material; I'm happy enough to leave this broken for 9.3, as the chances of anyone hitting it in practice seem remote, and if they do, the workaround is not hard. What I'd like to do is change the way that we allocate slots for bgworkers. Instead of computing the number of slots based on how many bgworkers are registered at postmaster startup time, I'd like to allocate a fixed number of slots, controlled by a GUC called max_worker_processes, which will obviously be PGC_POSTMASTER. I would propose to give this GUC a small but non-zero default value; I was thinking of 8. Under this proposal, it would no longer possible to register an unlimited number of background workers without any changes to postgresql.conf; if you want to register more of them than the default value of this parameter, you'll need to increase the parameter. That's a downside, but it's one that I think will affect very few users; how many people are going to be running installations with more than 8 background workers? (And, those that are probably pretty sophisticated users who won't suffer greatly from having to change two GUCs rather than just one.) So what's the upside? One advantage, obviously, is that we fix this bug. max_worker_processes would get propagated the standby, and checked, just as max_connections does today. The standby doesn't have to actually load the same extensions; it just has to allow for at least as many slots, which is cleaner. But that's pretty minor. I see a much larger advantage: we wouldn't be limited to worker procesess that are started up at fixed times in the postmaster lifecycle, as we currently are. Right now, background workers can only be started at three points: BgWorkerStart_PostmasterStart, BgWorkerStart_ConsistentState, and BgWorkerStart_RecoveryFinished. This change would allow us to move towards worker processes that can be started "on the fly", whenever needed, just as long as there are available slots from the total allocation carved out by max_worker_processes. I think you can see where I'm going here: aside from giving extension code more options in terms of how it uses worker processes, this would give us a facility that would be a useful building block toward parallel sort/copy/query/whatever. Even without parallelism in core, it's not too difficult to imagine a contrib module that does some kind of ad-hoc parallelism by firing up worker backends, which would be awkward with the 9.3 infrastructure. But the real big win is that the core facilities, when and as they arrive, can draw from that same pool of workers. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company