Re: Changes to error handling for background worker initialization?

From Jeremy Finzel
Subject Re: Changes to error handling for background worker initialization?
Msg-id CAMa1XUhFap+AibpAHSkjRwN4cd9o8KYghWtG99JNofrEDzsAGw@mail.gmail.com
In reply to Changes to error handling for background worker initialization?  (Jeremy Finzel <finzelj@gmail.com>)
List pgsql-hackers
On Mon, Oct 22, 2018 at 9:36 AM Jeremy Finzel <finzelj@gmail.com> wrote:
> Hello -
>
> I have an extension that uses background workers.  I pass a database oid as an argument in order to launch the worker using function BackgroundWorkerInitializeConnectionByOid.  In one of my regression tests that was written, I intentionally launch the worker with an invalid oid.  In earlier PG versions the worker would successfully launch but then terminate asynchronously, with a message in the server log.  Now, it does not even successfully launch but immediately errors (hence failing my regression tests).
>
> I have recently installed all later point releases of all versions 9.5-11, so I assume this is due to some code change.  The behavior seems reasonable but I don't find any obvious release notes indicating a patch that would have changed this behavior.  Any thoughts?
>
> Thanks,
> Jeremy

I still haven't determined the source of this error, but it cannot be a difference in background worker error handling between point releases: I am seeing different behavior from the identical Postgres version on my machine than others see on theirs.  I would appreciate any ideas on how this could happen, because I am no longer sure of the right way to build this regression test.

The test launches the background worker with an invalid database oid.

Here is what I see running PG 11.1 on my system (I get the same behavior on 9.5 through 10 as well):

 SELECT _launch(9999999::OID) AS pid;
! ERROR:  could not start background process
! HINT:  More details may be available in the server log.

This is what others are seeing (the worker fails asynchronously and you see it in the server log):

 SELECT _launch(9999999::OID) AS pid;
!   pid
! -------
!  18022
! (1 row)

I could share the C code but it's not that interesting.  It just calls BackgroundWorkerInitializeConnectionByOid.  It is essentially a duplicate of worker_spi.  Here is the relevant section:

    sprintf(worker.bgw_function_name, "worker_spi_main");
    snprintf(worker.bgw_name, BGW_MAXLEN, "worker_spi worker %d", i);
    snprintf(worker.bgw_type, BGW_MAXLEN, "worker_spi");
    worker.bgw_main_arg = Int32GetDatum(i);
    /* set bgw_notify_pid so that we can use WaitForBackgroundWorkerStartup */
    worker.bgw_notify_pid = MyProcPid;

    if (!RegisterDynamicBackgroundWorker(&worker, &handle))
        PG_RETURN_NULL();

    status = WaitForBackgroundWorkerStartup(handle, &pid);

    if (status == BGWH_STOPPED)
        ereport(ERROR,
                (errcode(ERRCODE_INSUFFICIENT_RESOURCES),
                 errmsg("could not start background process"),
                 errhint("More details may be available in the server log.")));

So on my machine I am getting status == BGWH_STOPPED, whereas others are not.
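
For reference, here is a minimal sketch of how a launcher could separate the possible results of WaitForBackgroundWorkerStartup, so that a worker which has already exited is handled differently from a real launch failure. The PostgreSQL documentation says the call returns BGWH_STARTED when the worker is running at the time of the check, and BGWH_STOPPED or BGWH_POSTMASTER_DIED otherwise. Whether that explains the difference between machines is not established here. The enum below is a local stand-in that mirrors the names in postmaster/bgworker.h so the sketch compiles outside the server tree. LaunchOutcome and classify_launch are hypothetical names and not part of any PostgreSQL API.

```c
#include <stdbool.h>

/*
 * Local stand-in for PostgreSQL's BgwHandleStatus (postmaster/bgworker.h).
 * In a real extension, include the server header instead of redefining it.
 */
typedef enum BgwHandleStatus
{
    BGWH_STARTED,           /* worker is running */
    BGWH_NOT_YET_STARTED,   /* worker hasn't been started yet */
    BGWH_STOPPED,           /* worker has exited */
    BGWH_POSTMASTER_DIED    /* postmaster died; worker status unclear */
} BgwHandleStatus;

/* Hypothetical classification for the launcher; not a PostgreSQL type. */
typedef enum LaunchOutcome
{
    LAUNCH_HAS_PID,         /* worker observed running; pid is valid */
    LAUNCH_WORKER_GONE,     /* worker had already exited when checked */
    LAUNCH_FATAL            /* postmaster died or unexpected status */
} LaunchOutcome;

/* Map the status returned by the wait call to what the launcher should do. */
LaunchOutcome
classify_launch(BgwHandleStatus status)
{
    switch (status)
    {
        case BGWH_STARTED:
            return LAUNCH_HAS_PID;
        case BGWH_STOPPED:
            return LAUNCH_WORKER_GONE;
        case BGWH_NOT_YET_STARTED:
        case BGWH_POSTMASTER_DIED:
            return LAUNCH_FATAL;
    }
    return LAUNCH_FATAL;    /* defensive default for out-of-range values */
}
```

In the launcher above, the LAUNCH_WORKER_GONE branch is where the ereport(ERROR, ...) currently fires. A regression test that must give the same result on every machine would need a single expected output covering both LAUNCH_HAS_PID and LAUNCH_WORKER_GONE, for example by not printing the pid.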

Thanks,
Jeremy
