Thread: Postgres Synchronous replication

Postgres Synchronous replication

From
Ravi Krishna
Date:
I want to understand how PG sync replication works. This is what I know (assuming two-node sync replication):

1 - Application issues commit.
2 - PG commits the transaction locally on the primary server.
3 - At this stage the application has not got the commit indication back.
4 - PG transmits the transaction from the local to the remote server.
5 - Remote server sends back an acknowledgement.
6 - The app gets commit ack back.

So this means that, between step 2 and step 6, the app is not aware that the transaction has already been committed.
This is the reason why, if the server crashes between step 2 and step 6 and the remote takes over as the
new primary, the crashed server cannot restart as a standby, and the only option is to recreate the DB from the remote
server (which is now acting as the primary).
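
To make the scenario concrete, here is a toy Python model of the six steps exactly as listed in this post. It is only a sketch under that assumption, not PostgreSQL's actual code path; `Node`, `sync_commit`, and the `crash_after_local_flush` flag are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy server: `wal` holds the commit records flushed to its local log."""
    name: str
    wal: list = field(default_factory=list)


def sync_commit(primary, standby, txn, crash_after_local_flush=False):
    """Walk through the six steps as listed; return True once the app is acked."""
    primary.wal.append(txn)        # step 2: commit record flushed on the primary
    if crash_after_local_flush:    # crash somewhere between step 2 and step 6
        return False               # step 3: the app never hears back
    standby.wal.append(txn)        # step 4: record shipped to the standby
    # step 5: the standby's acknowledgement is implied by reaching this point
    return True                    # step 6: the app receives the commit ack


primary, standby = Node("primary"), Node("standby")
acked = [
    sync_commit(primary, standby, "t1"),
    sync_commit(primary, standby, "t2", crash_after_local_flush=True),
]

# After the crash, the old primary holds a record the promoted standby never saw.
extra_on_old_primary = [t for t in primary.wal if t not in standby.wal]
print(acked, extra_on_old_primary)
```

In this model the second transaction exists only on the crashed server, which is the kind of divergence that would prevent it from simply rejoining as a standby.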

Is my understanding correct?

One more question: in step 5, does the remote harden the transaction on disk, or does it merely receive the transaction in the log buffer before sending the ACK back to the local server?

Thanks

Re: Postgres Synchronous replication

From
Payal Singh
Date:
AFAIK the commit on master happens only after it receives ack from the slave. This is how synchronous replication ensures that the slave is 'in sync'.

Payal Singh,
Database Administrator,
OmniTI Computer Consulting Inc.
Phone: 240.646.0770 x 253

Re: Postgres Synchronous replication

From
Ravi Krishna
Date:

> AFAIK the commit on master happens only after it receives ack from the slave. This is how synchronous replication ensures that the slave is 'in sync'.


If that is the case, then why does PG find it impossible to sync back with the primary after a crash?
Other products offering similar technology do not have this issue.

In my opinion this is quite a serious limitation of PG replication. Every time the primary crashes and the business continues by promoting the standby to be the new primary, the crashed server has to be reinitialized to set up replication again.

 


Re: Postgres Synchronous replication

From
John Scalia
Date:
Payal is correct. The correct sequence would be 1, 4, 5, 2, 3, 6.

That's how I see it anyway.


Re: Postgres Synchronous replication

From
Keith
Date:


On Thu, May 21, 2015 at 4:10 PM, Ravi Krishna <sravikrishna3@gmail.com> wrote:
> If that is the case, then why does PG find it impossible to sync back with the primary after a crash?
> Other products offering similar technology do not have this issue.
>
> In my opinion this is quite a serious limitation with PG replication. Every time the primary crashes and the business continues with the promotion of standby as the new primary, the crashed server has to be reinitialized for the set up of the replication.

 


This issue has been addressed with pg_rewind in the upcoming 9.5 major version release:

http://hlinnaka.iki.fi/2015/03/23/pg_rewind-in-postgresql-9-5/
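
As a rough picture of what rewinding means, the following Python sketch finds the point where two histories diverge and brings the old primary back in line with the new one. It is a conceptual illustration only: the real pg_rewind works on a data directory and its WAL, not on Python lists, and `divergence_point` and `rewind` are hypothetical names.

```python
def divergence_point(old_primary_wal, new_primary_wal):
    """Return the number of leading records the two histories share."""
    shared = 0
    for a, b in zip(old_primary_wal, new_primary_wal):
        if a != b:
            break
        shared += 1
    return shared


def rewind(old_primary_wal, new_primary_wal):
    """Drop the old primary's records past the fork, then adopt the new history."""
    fork = divergence_point(old_primary_wal, new_primary_wal)
    return old_primary_wal[:fork] + new_primary_wal[fork:]


old = ["t1", "t2", "t3-unreplicated"]   # history on the crashed primary
new = ["t1", "t2", "t4", "t5"]          # history on the promoted standby

print(divergence_point(old, new))
print(rewind(old, new))
```

The point of the sketch is that only the part after the fork has to be replaced, instead of recreating the whole database from the new primary.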

Re: Postgres Synchronous replication

From
"Gilberto Castillo"
Date:

> This issue has been addressed with pg_rewind in the upcoming 9.5 major
> version release
>
> http://hlinnaka.iki.fi/2015/03/23/pg_rewind-in-postgresql-9-5/

pg_rewind can be used from 9.3 onwards.

Regards,
Gilberto Castillo
ETECSA, La Habana, Cuba

Re: Postgres Synchronous replication

From
Scott Ribe
Date:
On May 21, 2015, at 2:10 PM, Ravi Krishna <sravikrishna3@gmail.com> wrote:
>
> Every time the primary crashes…

While I agree it’s a limitation, in 14 years I’ve not seen PG crash once.


--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
https://www.linkedin.com/in/scottribe/
(303) 722-0567 voice







Re: Postgres Synchronous replication

From
Keith
Date:


On Thu, May 21, 2015 at 5:19 PM, Gilberto Castillo <gilberto.castillo@etecsa.cu> wrote:



> pg_rewind can be used from 9.3 onwards.



True, but it's included in the official release as of 9.5 :)

Re: Postgres Synchronous replication

From
Keith
Date:


On Thu, May 21, 2015 at 4:29 PM, Scott Ribe <scott_ribe@elevated-dev.com> wrote:
> While I agree it’s a limitation, in 14 years I’ve not seen PG crash once.


--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

It's not just an issue of crashing; it's an issue of failover in general, which happens quite often in high-availability scenarios. Unless a second slave is available, it can be a bit nerve-racking after you fail over due to some network or system issue: you've only got a single Postgres instance up for your application until you rebuild the old one.

Re: Postgres Synchronous replication

From
Ravi Krishna
Date:
>> Every time the primary crashes…
>
> While I agree it’s a limitation, in 14 years I’ve not seen PG crash once

What about machine crashing? With Intel/Linux based commodity hardware, we go with the assumption that machines will fail and indeed they do. For our DB2 and Oracle servers, it is a breeze to re-integrate the failed server.

Anyhow, pg_rewind seems to solve that issue.

 

Re: Postgres Synchronous replication

From
Scott Ribe
Date:
On May 21, 2015, at 3:00 PM, Ravi Krishna <sravikrishna3@gmail.com> wrote:
>
> What about machine crashing?

Hasn’t been a problem. But of course it’s a numbers game, and if you have enough machines in play, you will certainly
have failures.


Re: Postgres Synchronous replication

From
Felipe Santos
Date:
"does the remote harden the transaction on the disk"

Yes. It is certainly written to the LOG FILES; there is no guarantee that it has already been written to the DATA FILES, but that is not a problem, since the LOG FILES are designed to be replayed forward after a server crash, applying the committed transactions to the data files.
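
The replay idea can be sketched in a few lines of Python. This is a simplified illustration under made-up assumptions (a log of key/value changes and a dict standing in for the data files), not how PostgreSQL stores or replays its log; `replay` and `applied_upto` are hypothetical names.

```python
def replay(data_files, log_files, applied_upto):
    """Apply every logged change after position `applied_upto` to the data files."""
    recovered = dict(data_files)
    for key, value in log_files[applied_upto:]:
        recovered[key] = value
    return recovered


# All three changes were flushed to the log before the acknowledgement was sent.
log_files = [("a", 1), ("b", 2), ("a", 3)]
# Only the first change had reached the data files when the server crashed.
data_files = {"a": 1}

recovered = replay(data_files, log_files, applied_upto=1)
print(recovered)
```

Because the log is on disk, the data files can lag behind without losing committed work: recovery simply applies whatever they are missing.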

Re: Postgres Synchronous replication

From
Ravi Krishna
Date:
> "does the remote harden the transaction on the disk"
>
> Yes, but it is surely written to the LOG FILES, there's no guarantee that it
> has already been written to the DATA FILES, although that's not a problem,
> since LOG FILES are designed to be played forward in the case of a server
> crash, replaying the committed transactions to the data files.

By hardening I meant only the WAL logs and not the data files.