Re: Background worker assistance & review

From: Craig Ringer
Subject: Re: Background worker assistance & review
Msg-id: CAMsr+YHNfU7f+F6Rcvtwjqmb0EefYTXOuzQ3WQnwCa_dXavS5Q@mail.gmail.com
In reply to: Background worker assistance & review  (Keith Fiske <keith@omniti.com>)
Responses: Re: Background worker assistance & review  (Keith Fiske <keith@omniti.com>)
Re: Background worker assistance & review  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List: pgsql-general


On 9 April 2015 at 05:35, Keith Fiske <keith@omniti.com> wrote:
> I'm working on a background worker (BGW) for my pg_partman extension. I've gotten the basics of it working for my first round, but there's two features I'm missing that I'd like to add before release:
>
> 1) Only allow one instance of this BGW to run

Load your extension in shared_preload_libraries, so that _PG_init runs in the postmaster. Register a static background worker then.

If you need one worker per database (because it needs to access the DB) this won't work for you, though. What we do in BDR is have a single static background worker that's launched by the postmaster, which then launches and terminates per-database workers that do the "real work".

Because of a limitation in the bgworker API in releases 9.4 and older, the static worker has to connect to a database if it wants to access shared catalogs like pg_database. This limitation has been lifted in 9.5 though, along with the need to use the database name instead of its oid to connect (which left bgworkers unable to handle RENAME DATABASE).
 
(We still really need a hook on CREATE DATABASE too)

> 2) Create a bgw_terminate_partman() function to stop it more intuitively than doing a pg_cancel_backend() on the PID

If you want it to be able to be started/stopped dynamically, you should probably use RequestAddinShmemSpace to allocate a small shared memory block. Use that to register the PGPROC for the current worker when the worker starts, and add a boolean field you can use to ask it to terminate itself. You'll also need an LWLock to protect access to the segment, so you don't have races between a worker starting and the user asking to cancel it, etc.
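
To make the shape of that shared block concrete, here is a minimal standalone sketch. It is a model, not extension code: the names PartmanCtl and ctl_* are made up for illustration, a plain int pid stands in for the PGPROC pointer, and a C11 atomic_flag spinlock stands in for the LWLock. In a real extension the struct would live in the space reserved with RequestAddinShmemSpace and the lock would be an LWLock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout of the small shared block; all names are illustrative. */
typedef struct PartmanCtl
{
    atomic_flag lock;        /* stand-in for the LWLock guarding the block */
    int         worker_pid;  /* 0 when no worker is attached */
    bool        terminate;   /* set by the user-facing "stop" function */
} PartmanCtl;

static void
ctl_lock(PartmanCtl *ctl)
{
    /* Spin until acquired; a real LWLock would sleep instead. */
    while (atomic_flag_test_and_set_explicit(&ctl->lock, memory_order_acquire))
        ;
}

static void
ctl_unlock(PartmanCtl *ctl)
{
    atomic_flag_clear_explicit(&ctl->lock, memory_order_release);
}

/* One-time setup of the block, before any worker attaches. */
void
ctl_init(PartmanCtl *ctl)
{
    atomic_flag_clear(&ctl->lock);
    ctl->worker_pid = 0;
    ctl->terminate = false;
}

/* Worker startup: claim the slot. Returns false if another worker holds it. */
bool
ctl_attach(PartmanCtl *ctl, int pid)
{
    bool ok;

    assert(pid > 0);
    ctl_lock(ctl);
    ok = (ctl->worker_pid == 0);
    if (ok)
    {
        ctl->worker_pid = pid;
        ctl->terminate = false;   /* discard any stale stop request */
    }
    ctl_unlock(ctl);
    return ok;
}

/* Worker exit: release the slot, but only if this worker owns it. */
void
ctl_detach(PartmanCtl *ctl, int pid)
{
    ctl_lock(ctl);
    if (ctl->worker_pid == pid)
        ctl->worker_pid = 0;
    ctl_unlock(ctl);
}

/* Stop request: returns the pid of the worker to wake, or 0 if none runs. */
int
ctl_request_terminate(PartmanCtl *ctl)
{
    int pid;

    ctl_lock(ctl);
    pid = ctl->worker_pid;
    if (pid != 0)
        ctl->terminate = true;
    ctl_unlock(ctl);
    return pid;
}

/* Checked by the worker each time around its main loop. */
bool
ctl_should_exit(PartmanCtl *ctl)
{
    bool stop;

    ctl_lock(ctl);
    stop = ctl->terminate;
    ctl_unlock(ctl);
    return stop;
}
```

Because attach and the stop request both take the same lock, a stop request either sees the new worker and flags it, or sees an empty slot and reports that nothing is running. The same attach check also gives you "only one instance" for free: a second worker that fails to attach can simply exit.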

Unfortunately the BackgroundWorkerHandle struct is opaque, so you cannot store it in shared memory when it's returned by RegisterDynamicBackgroundWorker() and use it to later check the worker's status or ask it to exit. You have to use regular backend manipulation functions and PGPROC instead.

Personally, I suggest that you leave the worker as a static worker, and leave it always running when the extension is active. If it isn't doing anything, have it sleep on its latch, then set its latch from other processes when something interesting happens. (You can put the process latch from PGPROC into your shmem segment so you can set it from elsewhere, or allocate a new latch).
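
The wait/set/reset cycle can be modelled outside the server. The sketch below is not PostgreSQL's Latch; it is a small pipe-based stand-in (ModelLatch and the latch_* names are invented here) that only shows the semantics the worker loop relies on: sleep until someone sets the latch or a timeout passes, then clear it before sleeping again.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/* Illustrative stand-in for a latch; not the server's implementation. */
typedef struct ModelLatch
{
    int rfd;   /* read end, polled by the sleeping worker */
    int wfd;   /* write end, written by whoever wakes the worker */
} ModelLatch;

/* Create the latch. Returns false if the pipe cannot be created. */
bool
latch_init(ModelLatch *l)
{
    int fds[2];

    if (pipe(fds) != 0)
        return false;
    /* Non-blocking, so draining and repeated sets never stall. */
    fcntl(fds[0], F_SETFL, O_NONBLOCK);
    fcntl(fds[1], F_SETFL, O_NONBLOCK);
    l->rfd = fds[0];
    l->wfd = fds[1];
    return true;
}

/* Called from elsewhere when something interesting happens. */
void
latch_set(ModelLatch *l)
{
    char    c = 0;
    ssize_t n = write(l->wfd, &c, 1);

    (void) n;   /* a full pipe already means "set", so failure is harmless */
}

/* Sleep until the latch is set or timeout_ms elapses; true if it was set. */
bool
latch_wait(ModelLatch *l, int timeout_ms)
{
    struct pollfd pfd = { .fd = l->rfd, .events = POLLIN };
    int           rc;

    assert(timeout_ms >= 0);
    rc = poll(&pfd, 1, timeout_ms);
    return rc > 0 && (pfd.revents & POLLIN) != 0;
}

/* Clear pending wakeups before going back to sleep. */
void
latch_reset(ModelLatch *l)
{
    char buf[64];

    while (read(l->rfd, buf, sizeof buf) > 0)
        ;
}
```

In the worker itself the equivalent loop is WaitLatch() on the process latch with a timeout equal to the maintenance interval, followed by ResetLatch() and a check of whatever flags live in the shmem segment, much as worker_spi does.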

> This is my first venture into writing C code for postgres, so I'm not familiar with a lot of the internals yet. I read http://www.postgresql.org/docs/9.4/static/bgworker.html and I see it mentioning how you can check the status of a BGW launched dynamically and the function to terminate one, but I'm not clear how you can get the information on a currently running BGW to do these things.

You can't. It's a pretty significant limitation in the current API. There's no way to enumerate bgworkers via the bgworker API, only via PGPROC.
 
> I used the worker_spi example for a lot of this, so if there's any additional guidance for a better way to do what I've done, I'd appreciate it. All I really have it doing now is calling the run_maintenance() function at a defined interval and don't need it doing more than that yet.

The BDR project has an extension with much more in-depth use of background workers, but it's probably *too* complicated. We have a static bgworker that launches and terminates dynamic bgworkers (per-database) that in turn launch and terminate more dynamic background workers (per-connection to peer databases).

If you're interested, all the code is mirrored on github:


and the relevant parts are:

https://github.com/2ndQuadrant/bdr/blob/bdr-plugin/next/bdr_apply.c#L2401
https://github.com/2ndQuadrant/bdr/blob/bdr-plugin/next/bdr.h

... but there's a *lot* of code there.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
