Re: BUG #15346: Replica fails to start after the crash

From: Alexander Kukushkin
Subject: Re: BUG #15346: Replica fails to start after the crash
Date:
Msg-id: CAFh8B=mHh7iLGLLCjicUsJgTkz_2_=WsOpR4KvFOfo2bTX4v2g@mail.gmail.com
In reply to: Re: BUG #15346: Replica fails to start after the crash  (Michael Paquier <michael@paquier.xyz>)
Responses: Re: BUG #15346: Replica fails to start after the crash  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List: pgsql-bugs

Hi,

I've figured out what is going on.
On this server we have a background worker, which starts from
shared_preload_libraries.

In order to debug and reproduce it, I removed everything from the
background worker code except _PG_init, worker_main and a couple of
signal handler functions.

Here is the code:

void
worker_main(Datum main_arg)
{
        pqsignal(SIGHUP, bg_mon_sighup);
        pqsignal(SIGTERM, bg_mon_sigterm);
        if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
                proc_exit(1);
        BackgroundWorkerUnblockSignals();
        BackgroundWorkerInitializeConnection("postgres", NULL);
        while (!got_sigterm)
        {
                int rc = WaitLatch(MyLatch,
                                   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
                                   naptime * 1000L);

                ResetLatch(MyLatch);
                if (rc & WL_POSTMASTER_DEATH)
                        proc_exit(1);
        }

        proc_exit(1);
}

void
_PG_init(void)
{
        BackgroundWorker worker;
        if (!process_shared_preload_libraries_in_progress)
                return;
        worker.bgw_flags = BGWORKER_SHMEM_ACCESS |
                BGWORKER_BACKEND_DATABASE_CONNECTION;
        worker.bgw_start_time = BgWorkerStart_ConsistentState;
        worker.bgw_restart_time = 1;
        worker.bgw_main = worker_main;
        worker.bgw_notify_pid = 0;
        snprintf(worker.bgw_name, BGW_MAXLEN, "my_worker");
        RegisterBackgroundWorker(&worker);
}

Most of this code is taken from "worker_spi.c".
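
The two signal handlers are not shown above. As a rough standalone sketch of the flag-setting pattern such handlers usually follow (an assumption based on worker_spi.c, not the actual bg_mon code): each handler only records that the signal arrived, and the main loop polls the flag. The demo_* names are hypothetical, and plain POSIX signal() stands in for pqsignal(); a real worker would typically also call SetLatch(MyLatch) in the handler so that WaitLatch returns promptly.

```c
#include <signal.h>

/* Flags written by the handlers and polled by the worker's main loop. */
volatile sig_atomic_t got_sighup = 0;
volatile sig_atomic_t got_sigterm = 0;

/* Hypothetical stand-in for bg_mon_sighup: only record the event. */
void
demo_sighup(int signo)
{
        (void) signo;
        got_sighup = 1;
}

/* Hypothetical stand-in for bg_mon_sigterm: ask the main loop to exit. */
void
demo_sigterm(int signo)
{
        (void) signo;
        got_sigterm = 1;
}

/* Install both handlers; returns 0 on success, -1 on failure. */
int
demo_install_handlers(void)
{
        if (signal(SIGHUP, demo_sighup) == SIG_ERR)
                return -1;
        if (signal(SIGTERM, demo_sigterm) == SIG_ERR)
                return -1;
        return 0;
}
```

Keeping the handlers this small means no work is done in signal context; the loop in worker_main notices got_sigterm on its next iteration and falls through to proc_exit.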

Basically, it just initializes a connection to the postgres database and
then sleeps all the time.

If I comment out the 'BackgroundWorkerInitializeConnection("postgres",
NULL);' call, postgres starts without any problem.
Which is very strange, because the background worker itself is not doing anything...

And one more thing: if I add sleep(15) before calling
BackgroundWorkerInitializeConnection, postgres manages to start
successfully.
Is there some very strange race condition here?

Regards,
--
Alexander Kukushkin

