Re: walsender bug: stuck during shutdown

From Fujii Masao
Subject Re: walsender bug: stuck during shutdown
Date
Msg-id abd3220d-bf25-6118-7060-5e9cf7cdfc74@oss.nttdata.com
In reply to Re: walsender bug: stuck during shutdown  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: walsender bug: stuck during shutdown
List pgsql-hackers

On 2020/11/26 11:45, Alvaro Herrera wrote:
> On 2020-Nov-26, Fujii Masao wrote:
> 
>> On the second thought, walsender doesn't wait forever unless
>> wal_sender_timeout is disabled, even in the case in discussion?
>> Or if there is the case where wal_sender_timeout doesn't work expectedly,
>> we might need to fix that at first.
> 
> Hmm, no, it doesn't wait forever in that sense; tracing with the
> debugger shows that the process is looping continuously.

Yes, so the problem here is that walsender goes into a busy loop
in that case. This seems to happen only in the logical replication
walsender. In the physical replication walsender, WaitLatchOrSocket()
in WalSndLoop() seems to work as expected and prevents it from entering
a busy loop even in that case.

        /*
         * If postmaster asked us to stop, don't wait anymore.
         *
         * It's important to do this check after the recomputation of
         * RecentFlushPtr, so we can send all remaining data before shutting
         * down.
         */
        if (got_STOPPING)
            break;

The above code in WalSndWaitForWal() seems to cause this issue, but I've
not yet come up with an idea for how to fix it.
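
To illustrate the suspected mechanism, here is a minimal toy model in C.
It is only a sketch, not the actual walsender code: ToySender,
toy_wait_for_wal() and toy_decode_loop() are hypothetical names, and the
assumption that the caller simply retries when it gets too little WAL is
a simplification. The point is that once the stop flag is set, the wait
function returns without ever sleeping, so a caller that keeps asking
for WAL that never arrives spins.

```c
#include <stdbool.h>

/* Hypothetical stand-in for walsender state; not a PostgreSQL struct. */
typedef struct
{
    bool got_stopping;   /* models got_STOPPING */
    long flushed_upto;   /* models RecentFlushPtr */
    int  sleeps;         /* number of times the model actually blocked */
} ToySender;

/*
 * Models the shape of WalSndWaitForWal(): wait until WAL up to 'target'
 * has been flushed, but leave early once a stop has been requested.
 */
static long
toy_wait_for_wal(ToySender *s, long target)
{
    for (;;)
    {
        if (s->flushed_upto >= target)
            break;

        /* If asked to stop, don't wait anymore (the check quoted above). */
        if (s->got_stopping)
            break;

        /* Stand-in for blocking on a latch; pretend WAL arrives meanwhile. */
        s->sleeps++;
        s->flushed_upto += 10;
    }
    return s->flushed_upto;
}

/*
 * Assumed caller behaviour: retry until enough WAL is available.
 * 'max_iterations' only bounds the model so that it terminates.
 * Returns the number of calls made to toy_wait_for_wal().
 */
static int
toy_decode_loop(ToySender *s, long target, int max_iterations)
{
    int calls = 0;

    while (calls < max_iterations)
    {
        calls++;
        if (toy_wait_for_wal(s, target) >= target)
            break;
        /* Not enough WAL yet: go around again without having slept. */
    }
    return calls;
}
```

With got_stopping unset, starting at 100 and asking for 130, the model
blocks three times and toy_decode_loop() returns after a single call.
With got_stopping set and no new WAL, every call returns immediately and
the loop runs until max_iterations without blocking at all, which is the
busy-loop shape described above.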

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


