Re: walsender bug: stuck during shutdown

From: Alvaro Herrera
Subject: Re: walsender bug: stuck during shutdown
Msg-id: 20201204182707.GA8461@alvherre.pgsql
In reply to: Re: walsender bug: stuck during shutdown  (Fujii Masao <masao.fujii@oss.nttdata.com>)
List: pgsql-hackers

On 2020-Nov-26, Fujii Masao wrote:

> Yes, so the problem here is that walsender goes into the busy loop
> in that case. Seems this happens only in logical replication walsender.
> In physical replication walsender, WaitLatchOrSocket() in WalSndLoop()
> seems to work as expected and prevent it from entering into busy loop
> even in that case.
> 
>         /*
>          * If postmaster asked us to stop, don't wait anymore.
>          *
>          * It's important to do this check after the recomputation of
>          * RecentFlushPtr, so we can send all remaining data before shutting
>          * down.
>          */
>         if (got_STOPPING)
>             break;
> 
> The above code in WalSndWaitForWal() seems to cause this issue. But I've
> not come up with idea about how to fix yet.

With DEBUG1 I observe that walsender is getting a lot of 'r' messages
(standby reply) with all zeroes:

2020-12-01 21:01:24.100 -03 [15307] DEBUG:  write 0/0 flush 0/0 apply 0/0

However, while doing that I also observed that if I send some
activity through the logical replication stream with the provided
program, the replies *still* have the 'write' pointer set to 0/0, while
the 'flush' pointer has moved forward to what was sent.  I'm not clear
on what causes the write pointer to move forward in logical replication.

Still, the previously proposed patch does resolve the problem in either
case.
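
For anyone following along, here is a rough sketch of how the 'r'
(standby status update) message mentioned above can be decoded.  The
layout is the one I understand from the streaming replication protocol
documentation: one byte 'r', then three big-endian 64-bit LSNs (write,
flush, apply), a 64-bit client timestamp, and one "reply requested"
byte.  The StandbyReply struct and the function names are hypothetical
and made up for illustration; this is not the walsender code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the decoded fields of an 'r' message. */
typedef struct StandbyReply
{
    uint64_t write_lsn;       /* last WAL byte + 1 written by the client */
    uint64_t flush_lsn;       /* last WAL byte + 1 flushed by the client */
    uint64_t apply_lsn;       /* last WAL byte + 1 applied by the client */
    int64_t  send_time;       /* client clock, microseconds since 2000-01-01 */
    int      reply_requested; /* nonzero if the client wants an immediate reply */
} StandbyReply;

/* Read a big-endian (network order) 64-bit integer. */
static uint64_t
read_be64(const unsigned char *p)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/*
 * Decode a standby status update.  Assumed size: 1 + 4 * 8 + 1 = 34 bytes.
 * Returns 0 on success, -1 if the buffer does not look like an 'r' message.
 */
static int
decode_standby_reply(const unsigned char *buf, size_t len, StandbyReply *out)
{
    if (buf == NULL || out == NULL || len != 34 || buf[0] != 'r')
        return -1;

    out->write_lsn = read_be64(buf + 1);
    out->flush_lsn = read_be64(buf + 9);
    out->apply_lsn = read_be64(buf + 17);
    out->send_time = (int64_t) read_be64(buf + 25);
    out->reply_requested = buf[33] != 0;
    return 0;
}

/* Format an LSN the way the DEBUG line shows it: high/low halves in hex. */
static void
format_lsn(uint64_t lsn, char *dst, size_t dstlen)
{
    snprintf(dst, dstlen, "%X/%X",
             (unsigned int) (lsn >> 32),
             (unsigned int) (lsn & 0xFFFFFFFFu));
}
```

Feeding this a message whose three LSN fields are all zero gives "0/0"
for each of write, flush and apply, which is what the DEBUG1 line above
shows.  A client that advances only the flush field would produce a
nonzero flush with write still at 0/0, matching the second observation.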
