Обсуждение: BGWorkers, shared memory pointers, and postmaster restart
Hi all I've been using the dynamic BGWorker support for some recent work, and I think I've found an issue with how postmaster restarts are handled. TL;DR: I don't think there's a safe way to use a BGWorker (static or dynamic) with bgw_restart_time != BGW_NEVER_RESTART and a bgw_main_arg Datum that points into shared memory, and think we might need a API change to fix that. If the postmaster restarts due to a crash, it calls shared memory startup hooks and sets the found pointer to false, so they're re-inited. This makes sense as shm is presumed to be corrupt. Dynamic BGWorkers are killed as a part of restart. However, they're not unregistered, and are relaunched if there's a bgw_restart_time set. They're called with the same bgw_main_arg Datum as with the original launch. If this Datum is a pointer into the shared memory that just got zeroed and re-inited, exciting things happen. In my case the worker restart generally happens after shm is re-inited, and since it gets reinitialized with the same contents the dynamic bgworker registered during postmaster startup restarts normally. I only noticed the problem because my shared memory init hook launches bgworkers once shm is set up, and I was getting two copies of a bgworker after postmaster restart. This also affects static bgworkers that use a pointer into shared memory to pass data on EXEC_BACKEND platforms (Windows). For dynamic bgworkers, it seems like it'd be OK to just require extensions to re-register them after a postmaster restart, or add a BGW_UNREGISTER_ON_POSTMASTER_RESTART flag to let that be controlled so bgworkers that didn't receive pointers into shm didn't have to deal with it. So any pointer into shared memory that's being re-initialized would be discarded when the bgworker was unregistered during postmater restart. Dynamic bgworkers are new in 9.4, so we still have the freedom to change behaviour for them. But that won't fix static bgworkers that have a pointer into shm as an argument. In non-EXEC_BACKEND platforms we can just pass a pointer to palloc'd postmaster memory, but that won't work if we have to exec() after fork(). I don't think it's reasonable to say "well, don't do that" - if you can only send a single pass-by-value Datum to a bgworker's main function, their utility is greatly reduced. I could always set up an exit hook / SIGQUIT handler that tries to unregister dynamic bgworkers using their BGWorkerHandle s, from the worker that initially registered them. If the worker that registered them is the one that crashed and triggered a postmaster restart this won't do any good, so it's a half-solution at best. I can't stash the BGWorkerHandle s for the launched workers in shm and unregister them during postmaster restart either. For one thing, shm is presumed corrupt. For another, BGWorkerHandle is an incomplete struct with no exposed size, so I can't copy it into shm anyway. This seems like a bit of a pickle. Am I missing something obvious? The only way around it that I can currently see is to have a single static bgworker with no pointer argument launch and manage all the other workers required for the extension. It would launch them all with bgw_restart_time = BGW_NEVER_RESTART and set its self as the bgw_notify_pid . If they die, it unregisters them then registers a new one. Effectively, it's doing the work the bgworker code does already, except that it makes sure the bgworkers don't get relaunched on postmaster restart. Since it doesn't depend on shm being valid, this should work, but it's messy and seems like there should be a better way. Do we need to change how we define BGWorkers to require that a BGWorker with a Datum that points to shared memory be automatically unregistered by the postmaster on restart? This would have to apply to static BGWorkers too, so we'd want to think about adding a flag for it and maybe even backpatching the flag into 9.3; it'd only affect extensions that actually used the combination described above. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 04/16/2014 02:37 PM, Craig Ringer wrote: > Hi all > > I've been using the dynamic BGWorker support for some recent work, and I > think I've found an issue with how postmaster restarts are handled. > > TL;DR: I don't think there's a safe way to use a BGWorker (static or > dynamic) with bgw_restart_time != BGW_NEVER_RESTART and a bgw_main_arg > Datum that points into shared memory, and think we might need a API > change to fix that. Andres sensibly points out that part of this is easily solved by passing the bgworker an index into an array in a named shmem block. A simple pass-by-value Datum that can be turned into a pointer to a shmem struct. This still doesn't solve the other half of the issue, which is how to handle dynamic bgworkers after a postmaster restart. They're effectively lost/leaked: there's no way to retain a bgworker handle across restart, and no way to list bgworkers, nor is there any idempotent way to say "Start a worker to do <x> only if it doesn't already exist" (unique names, magic cookie hashes, whatever). With the current API the only solution to the second half that I see is to have bgworkers run in non-restart mode and manage them yourself. Not ideal. Instead we need one of: - A flag like BGW_UNREGISTER_ON_RESTART; - To always unregister dynamic bgws on postmaster shm clear + restart; - A way to list bgws, inspect their BackgroundWorker structs and obtain their handles; or - A way to idempotently register a bgw only if it doesn't already exist -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2014-04-16 19:11:37 +0800, Craig Ringer wrote: > On 04/16/2014 02:37 PM, Craig Ringer wrote: > > Hi all > > > > I've been using the dynamic BGWorker support for some recent work, and I > > think I've found an issue with how postmaster restarts are handled. > > > > TL;DR: I don't think there's a safe way to use a BGWorker (static or > > dynamic) with bgw_restart_time != BGW_NEVER_RESTART and a bgw_main_arg > > Datum that points into shared memory, and think we might need a API > > change to fix that. > > Andres sensibly points out that part of this is easily solved by passing > the bgworker an index into an array in a named shmem block. A simple > pass-by-value Datum that can be turned into a pointer to a shmem struct. > > This still doesn't solve the other half of the issue, which is how to > handle dynamic bgworkers after a postmaster restart. They're effectively > lost/leaked: there's no way to retain a bgworker handle across restart, > and no way to list bgworkers, nor is there any idempotent way to say > "Start a worker to do <x> only if it doesn't already exist" (unique > names, magic cookie hashes, whatever). > > With the current API the only solution to the second half that I see is > to have bgworkers run in non-restart mode and manage them yourself. Not > ideal. > > Instead we need one of: > > - A flag like BGW_UNREGISTER_ON_RESTART; > > - To always unregister dynamic bgws on postmaster shm clear + restart; > > - A way to list bgws, inspect their BackgroundWorker structs and obtain > their handles; or > > - A way to idempotently register a bgw only if it doesn't already exist I think we should go for always unregistering dynamic bgws. There's really little justification for keeping them around after a crash cycle. While not the nicest place architecturally, it seems easy enough to do in BackgroundWorkerShmemInit() which happens to be called conveniently in a crash/restart cycle... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 04/16/2014 07:21 PM, Andres Freund wrote: > On 2014-04-16 19:11:37 +0800, Craig Ringer wrote: >> On 04/16/2014 02:37 PM, Craig Ringer wrote: >>> Hi all >>> >>> I've been using the dynamic BGWorker support for some recent work, and I >>> think I've found an issue with how postmaster restarts are handled. >>> >>> TL;DR: I don't think there's a safe way to use a BGWorker (static or >>> dynamic) with bgw_restart_time != BGW_NEVER_RESTART and a bgw_main_arg >>> Datum that points into shared memory, and think we might need a API >>> change to fix that. >> >> Andres sensibly points out that part of this is easily solved by passing >> the bgworker an index into an array in a named shmem block. A simple >> pass-by-value Datum that can be turned into a pointer to a shmem struct. >> >> This still doesn't solve the other half of the issue, which is how to >> handle dynamic bgworkers after a postmaster restart. They're effectively >> lost/leaked: there's no way to retain a bgworker handle across restart, >> and no way to list bgworkers, nor is there any idempotent way to say >> "Start a worker to do <x> only if it doesn't already exist" (unique >> names, magic cookie hashes, whatever). >> >> With the current API the only solution to the second half that I see is >> to have bgworkers run in non-restart mode and manage them yourself. Not >> ideal. >> >> Instead we need one of: >> >> - A flag like BGW_UNREGISTER_ON_RESTART; >> >> - To always unregister dynamic bgws on postmaster shm clear + restart; >> >> - A way to list bgws, inspect their BackgroundWorker structs and obtain >> their handles; or >> >> - A way to idempotently register a bgw only if it doesn't already exist > > I think we should go for always unregistering dynamic bgws. There's > really little justification for keeping them around after a crash cycle. Seems simplest too, and extensions can reasonably expect to have to relaunch things after postmaster restart. We just have to document *where* they should do that, and that only dynamic bgworkers get cleared on postmaster restart, not static ones. I'd like to propose the following text as an addition to the comment in bgworker.h: The bgw_main_arg passed to a bgworker should be pass-by-value Datum(and not a pointer). A pointer to TopMemoryContext postmastermemoryis permitted for static bgworkers, but won't work onEXEC_BACKEND platforms. If any nontrivial data must bepassed to abgworker, a named shared memory segment should be allocated tocontain it. For multiple workers, the bgw_main_argmay be an indexinto an array of fixed-width structs in that shm segment. Note that shared memory is re-initialized if the postmaster restartsafter a backend crash. Backends should always use a sharedmemorystartup hook to initialize their shared memory so that it isre-initialized on postmaster restart. Don't think there's any good example code on how to do this in-tree, and it's too long for a sane comment. Wonder if I should add an example to contrib/worker_spi/ . There's no race anywhere btw. bgworkers are killed, then shm is cleared, then shm callbacks are rerun, then bgworkers are launched. So long as shm is reinited appropriately by an extension then static bgworkers that refer to that shm should be OK. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Apr 16, 2014 at 7:11 AM, Craig Ringer <craig@2ndquadrant.com> wrote: >> TL;DR: I don't think there's a safe way to use a BGWorker (static or >> dynamic) with bgw_restart_time != BGW_NEVER_RESTART and a bgw_main_arg >> Datum that points into shared memory, and think we might need a API >> change to fix that. > > Andres sensibly points out that part of this is easily solved by passing > the bgworker an index into an array in a named shmem block. A simple > pass-by-value Datum that can be turned into a pointer to a shmem struct. Yes. I think the answer to your original complaint is that we don't currently support that, and adding support for that is material for a future release. > This still doesn't solve the other half of the issue, which is how to > handle dynamic bgworkers after a postmaster restart. They're effectively > lost/leaked: there's no way to retain a bgworker handle across restart, > and no way to list bgworkers, nor is there any idempotent way to say > "Start a worker to do <x> only if it doesn't already exist" (unique > names, magic cookie hashes, whatever). Fair point. > With the current API the only solution to the second half that I see is > to have bgworkers run in non-restart mode and manage them yourself. Not > ideal. > > Instead we need one of: > > - A flag like BGW_UNREGISTER_ON_RESTART; I would be OK with this, maybe modulo the name. > - To always unregister dynamic bgws on postmaster shm clear + restart; I don't particularly care for this. Let's suppose the background worker is a long-running daemon, like a PG equivalent of cron. In static-background worker land, the admin has to restart the cluster to get this going. In dynamic-background worker land, he can load it on the fly. But once he gets it up and running, he wants it to stay up and running, surviving crashes and everything. That's a big part of the value of having a background worker interface in the first place. > - A way to list bgws, inspect their BackgroundWorker structs and obtain > their handles; or This is certainly a good idea. > - A way to idempotently register a bgw only if it doesn't already exist This is probably a good idea, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 04/17/2014 12:16 AM, Robert Haas wrote: > On Wed, Apr 16, 2014 at 7:11 AM, Craig Ringer <craig@2ndquadrant.com> wrote: >> - A flag like BGW_UNREGISTER_ON_RESTART; > > I would be OK with this, maybe modulo the name. > >> - To always unregister dynamic bgws on postmaster shm clear + restart; > > I don't particularly care for this. Let's suppose the background > worker is a long-running daemon, like a PG equivalent of cron. In > static-background worker land, the admin has to restart the cluster to > get this going. In dynamic-background worker land, he can load it on > the fly. But once he gets it up and running, he wants it to stay up > and running, surviving crashes and everything. That's a big part of > the value of having a background worker interface in the first place. It'd be the job of the extension that provides the bgworker to make sure that it gets relaunched on postmaster restart. I tend to agree though. The problem I'm describing only causes issues for extensions that launch dynamic bgworkers during extension startup. Extensions that launch bgworkers in response to SQL commands and don't rely on passing state to the bgworker via shared memory shouldn't have to deal with restarting them on postmaster restart. > >> - A way to list bgws, inspect their BackgroundWorker structs and obtain >> their handles; or > > This is certainly a good idea. > >> - A way to idempotently register a bgw only if it doesn't already exist > > This is probably a good idea, too. I think we need _one_ solution for 9.4, and need it soon. The simplest is probably to flush all dynamic bgworkers. But I think you're probably right - that solves the problem discussed here, but is likely to make life harder for other use cases. A last-minute API addition for bgworker listing/inspection might not be a great idea - it's too late for it to see much testing and analysis and it might introduce bigger API problems than it solves. Duplicate-free registration might be OK, but there are some questions around how we'd handle differing parameters, what should be the determinant for uniquenes, whether we should go for idempotency or return/raise an error to indicate it already exists, etc. So similar issue with doing it at the last minute. To me, that says "let's add a flag to allow a dynamic bgworker to be unregistered on postmaster restart". Seems simple and low risk. I'll follow up with a proposed patch, then we can spend some quality shed time on the flag name ;-) -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Apr 16, 2014 at 8:46 PM, Craig Ringer <craig@2ndquadrant.com> wrote: > On 04/17/2014 12:16 AM, Robert Haas wrote: >> On Wed, Apr 16, 2014 at 7:11 AM, Craig Ringer <craig@2ndquadrant.com> wrote: > >>> - A flag like BGW_UNREGISTER_ON_RESTART; >> >> I would be OK with this, maybe modulo the name. >> >>> - To always unregister dynamic bgws on postmaster shm clear + restart; >> >> I don't particularly care for this. Let's suppose the background >> worker is a long-running daemon, like a PG equivalent of cron. In >> static-background worker land, the admin has to restart the cluster to >> get this going. In dynamic-background worker land, he can load it on >> the fly. But once he gets it up and running, he wants it to stay up >> and running, surviving crashes and everything. That's a big part of >> the value of having a background worker interface in the first place. > > It'd be the job of the extension that provides the bgworker to make sure > that it gets relaunched on postmaster restart. > > I tend to agree though. The problem I'm describing only causes issues > for extensions that launch dynamic bgworkers during extension startup. > > Extensions that launch bgworkers in response to SQL commands and don't > rely on passing state to the bgworker via shared memory shouldn't have > to deal with restarting them on postmaster restart. > >> >>> - A way to list bgws, inspect their BackgroundWorker structs and obtain >>> their handles; or >> >> This is certainly a good idea. >> >>> - A way to idempotently register a bgw only if it doesn't already exist >> >> This is probably a good idea, too. > > I think we need _one_ solution for 9.4, and need it soon. > > The simplest is probably to flush all dynamic bgworkers. But I think > you're probably right - that solves the problem discussed here, but is > likely to make life harder for other use cases. > > A last-minute API addition for bgworker listing/inspection might not be > a great idea - it's too late for it to see much testing and analysis and > it might introduce bigger API problems than it solves. > > Duplicate-free registration might be OK, but there are some questions > around how we'd handle differing parameters, what should be the > determinant for uniquenes, whether we should go for idempotency or > return/raise an error to indicate it already exists, etc. So similar > issue with doing it at the last minute. > > To me, that says "let's add a flag to allow a dynamic bgworker to be > unregistered on postmaster restart". Seems simple and low risk. > > I'll follow up with a proposed patch, then we can spend some quality > shed time on the flag name ;-) I think I can live with that. However, I tend to think that the best solution here is really "don't put try to pass pointers via the BackgroundWorker structure, because it doesn't [ expletive ] work." We've had several instances of that already. When I added support for dynamic background workers, I had to add bgw_library_name and bgw_function_name members to that structure because bgw_main won't work for extension code unless the library is loaded from shared_preload_libraries AND either we're not running under EXEC_BACKEND (i.e. Windows) or the system is kind enough to load the shared library at the same address in both processes, in which case it will accidentally fail to fail. We would have had to give bgw_sighup and bgw_sigterm the same treatment, but since they weren't really necessary in the first place we just ripped them out instead. worker_spi also had a bad case of this disease. It went to elaborate lengths to pass a pointer via bgw_main_arg, but the pointer was to *backend-private memory*, so it was completely broken on EXEC_BACKEND builds. I'm not sure whether it actually failed there, or managed to work just because the background worker backend also ran the _PG_init hook and managed to accidentally place the same data structure at the same address. I fixed that as part of introducing the dynamic background worker facility; now it passes an index instead of a pointer. What you're complaining about here is basically another instance of the same problem. It's not as bad because the main shared memory segment never moves or has any memory freed except after a crash-and-restart cycle, so your suggested plug seems likely to be adequate. But it also requires that you can allocate enough space in shared memory to pass around whatever state you need to carry around, and space in the main shared memory segment is at a premium; storage there also can't be freed. Noah suggested to me a while back that we might do better to change bgw_main_arg to something like char[64] rather than Datum, which would allow passing a reasonable-size payload without having to futz with shared memory. Then, instead of passing a pointer, you can pass the name of a shared memory region to look up and an index into the data structure stored there, or something like that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company