How to Qualifying or quantify risk of loss in asynchronous replication

Поиск

Список

Период

Сортировка

От	otheus uibk
Тема	How to Qualifying or quantify risk of loss in asynchronous replication
Дата	15 марта 2016 г. 20:28:56
Msg-id	CALbQNd1G_np279G60HBqZkdJo=3hE+yDRYSYoZSc16gFnCujbg@mail.gmail.com обсуждение исходный текст
Ответы	Re: How to Qualifying or quantify risk of loss in asynchronous replication (Thomas Munro <thomas.munro@enterprisedb.com>)
Список	pgsql-general

Дерево обсуждения

I've been working with PG 9.1.8 for two years now, mainly asynchronous replication. Recently, an IT admin of another group contested that the PG's asynchronous replication can result in loss of data in a 1-node failure. After re-readinG the documentation, I cannot determine to what extent this is true.

Back in 2008, Robert Haas made this post http://postgresql.nabble.com/Sync-Rep-First-Thoughts-on-Code-tp1998339p1998433.html in which he delineates between different levels of replication. 1-safe is guaranteed with PG WALs. Other possibilities include group-safe, both group safe and 1-safe, 2-safe.

How do we qualify PG when WALs are written (and archived) on the master, and streaming replication to a hot standby, albeit asynchronous, is used? Is it Group-safe?

My understanding is "Strictly speaking, No".

But what precisely is the algorithm and timing involved with streaming WALs?

Is it:

* client issues COMMIT

* master receives commit

* master processes transaction internally

* master creates WAL record

| master appends WAL to local WAL file, flushes disk

| master sends WAL record to all streaming clients

* master sends OK to client

* master applies WAL

So is this correct? Is it correct to say: PG async guarantees that the WAL is *sent* to the receivers, but not that they are received, before the client receives acknowledgement?

Can we make a case stronger than that? Assuming the T0 is round trip time between master and client, and T1 is round trip time between master and slave, as long as T1 <= T0, and provided both Slave and Master do not fail, the system is Group-safe?

Otheus

otheus.uibk@gmail.com

otheus.shelling@uibk.ac.at

В списке pgsql-general по дате отправления:

Предыдущее

От: Michael Paquier
Дата: 15 марта 2016 г., 19:21:58
Сообщение: Re: how to switch old replication Master to new Standby after promoting old Standby - pg_rewind log file missing

Следующее

От: John Lumby
Дата: 15 марта 2016 г., 20:43:54
Сообщение: Re: how to switch old replication Master to new Standby after promoting old Standby - pg_rewind log file missing

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

How to Qualifying or quantify risk of loss in asynchronous replication

Предыдущее

Следующее