Re: Synchronous commit behavior during network outage

From Andrey Borodin
Subject Re: Synchronous commit behavior during network outage
Date
Msg-id 8848B234-F534-44BE-9EE8-43BC6D28B297@yandex-team.ru
In reply to Re: Synchronous commit behavior during network outage  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Synchronous commit behavior during network outage  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers

> On 29 June 2021, at 23:35, Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Tue, 2021-06-29 at 11:48 +0500, Andrey Borodin wrote:
>>> On 29 June 2021, at 03:56, Jeff Davis <pgsql@j-davis.com> wrote:
>>>
>>> The patch may be somewhat controversial, so I'll wait for feedback
>>> before documenting it properly.
>>
>> The patch seems similar to [0]. But I like your wording :)
>> I'd be happy if we go with any version of this idea.
>
> Thank you, somehow I missed that one, we should combine the CF entries.
>
> My patch also covers the backend termination case. Is there a reason
> you left that case out?
Yes, backend termination is used by the HA tool before rewinding the node. Initially I treated termination as a PANIC
and got a ton of core dumps during failover drills.

There is one more caveat we need to fix: we should prevent instant recovery from happening. The HA tool must know that
our process was restarted.
Consider the following scenario:
1. Node A is the primary with sync rep.
2. A goes through a network partition, and somewhere node B is promoted.
3. All backends of A are stuck in sync rep until the HA tool discovers that A is a failed node.
4. One backend crashes with a segfault in some buggy extension, or OOM, or whatever.
5. The Postgres server performs restartless crash recovery, making local-but-not-replicated data visible.

We should prevent step 5 as well, just as we prevent cancels. The HA tool will discover the postmaster failure and will
recheck in the coordination system whether it can bring Postgres up locally.
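
To make the scenario concrete, here is a toy model of it in Python. This is only a sketch and not PostgreSQL code: the
class ToyPrimary, the flag recover_in_place and the method names are all hypothetical, and the model assumes that
in-place crash recovery exposes every locally committed transaction, which is the effect described in step 5.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPrimary:
    """Toy model of a primary with sync rep (hypothetical, not PostgreSQL code)."""

    # Hypothetical switch: does the node recover in place after a backend crash?
    recover_in_place: bool = True
    running: bool = True
    # Transactions that other sessions can see.
    visible: set = field(default_factory=set)
    # Transactions committed locally but not yet acknowledged by a standby.
    waiting_for_sync: set = field(default_factory=set)

    def commit(self, xid, standby_reachable):
        # The commit is always durable locally first.
        if standby_reachable:
            self.visible.add(xid)
        else:
            # The backend is stuck in sync rep; nothing is acknowledged to the client.
            self.waiting_for_sync.add(xid)

    def backend_crash(self):
        if not self.recover_in_place:
            # The whole server goes down; the HA tool decides what happens next.
            self.running = False
            return
        # In-place recovery replays local WAL, so every locally committed
        # transaction becomes visible, replicated or not (step 5).
        self.visible |= self.waiting_for_sync
        self.waiting_for_sync.clear()


# Steps 1-5: A is partitioned, xid 2 is stuck in sync rep, then a backend crashes.
a = ToyPrimary(recover_in_place=True)
a.commit(1, standby_reachable=True)
a.commit(2, standby_reachable=False)
a.backend_crash()
print(sorted(a.visible))  # xid 2 is now visible although it was never replicated
```

With recover_in_place=False the same crash leaves the node stopped and xid 2 invisible, which gives the HA tool a
chance to consult the coordination system before anything is started locally.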

Thanks!

Best regards, Andrey Borodin.

