Discussion: Replication lag in Postgres

Replication lag in Postgres

From
Mukesh Tanuku
Date:
Hello everyone. 
First, thanks to the community members who answer the queries posted here. The answers give us more insight into issues and doubts about Postgres.

I have a question about a Postgres HA setup.
We are setting up a two-node Postgres cluster with asynchronous streaming replication, and we want to define an RPO (recovery point objective) in case of primary failure.

How can we best define the RPO in this setup? Since replication is asynchronous, there is a chance of data loss that is proportional to the replication delay.

Is there any way to configure the delay duration, for example to make sure the standby syncs with the primary at least every 10 minutes?

Thank you
Regards 
Mukesh T

Re: Replication lag in Postgres

From
Laurenz Albe
Date:
On Fri, 2024-07-12 at 20:41 +0530, Mukesh Tanuku wrote:
> I have a question with postgres HA setup.
> We are setting up a 2 node postgres cluster with async streaming replication, we want to
> define a RPO (Recovery point objective) in case of primary failure. 
>
> How can we best define the RPO in this setup? since it's an async streaming replication
> setup there might be a chance of data loss which is proportional to the replication delay. 
>
> Is there any way we can configure the delay duration, like for example to make sure every
> 10 mins the standby sync has to happen with primary? 

When there is a delay, it is usually because replay at the standby is delayed.
The WAL information is still replicated.  You won't lose that information on
failover; it will just make the failover take longer.

Unless you have a network problem, you should never lose more than a fraction
of a second.
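
The distinction above between WAL that has been received and WAL that has been replayed can be made concrete by comparing WAL positions (LSNs). The following is a minimal sketch, not something from this thread: it assumes the LSN strings were already fetched, for example from pg_current_wal_lsn() on the primary and from the flush_lsn and replay_lsn columns of pg_stat_replication. The helper names lsn_to_int and lag_bytes are hypothetical, and the sample values are invented for illustration.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '0/3000060' to a 64-bit byte position.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of the WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(primary_lsn: str, standby_lsn: str) -> int:
    """Return how many bytes of WAL the standby position is behind the primary."""
    return max(0, lsn_to_int(primary_lsn) - lsn_to_int(standby_lsn))


if __name__ == "__main__":
    # Invented sample values; in practice these would come from the primary.
    current = "0/3000100"   # assumed result of pg_current_wal_lsn()
    flushed = "0/30000F0"   # assumed flush_lsn from pg_stat_replication
    replayed = "0/3000060"  # assumed replay_lsn from pg_stat_replication

    # WAL not yet flushed on the standby: what could be lost on failover.
    print("not yet received:", lag_bytes(current, flushed), "bytes")
    # WAL received but not yet applied: this only lengthens the failover.
    print("not yet replayed:", lag_bytes(current, replayed), "bytes")
```

Under these assumptions, only the first figure bounds the possible data loss; the second mainly indicates how much replay work remains before the standby can be promoted.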

Yours,
Laurenz Albe



Re: Replication lag in Postgres

From
Mukesh Tanuku
Date:
Thank you for the information, Laurenz Albe.


Re: Replication lag in Postgres

From
Muhammad Imtiaz
Date:
Hi,

I recommend the following configurations/options in this case:

• wal_sender_timeout: This setting determines how long the primary server waits for the standby server to acknowledge receipt of WAL data. Adjusting this can help ensure timely data transfer.

• wal_keep_size: Ensures that enough WAL files are retained for the standby to catch up if it falls behind.

• checkpoint_timeout: Adjust the checkpoint frequency so that WAL files are flushed and sent to the standby server regularly.

• pg_receivewal: Use this tool to continuously archive WAL files to a safe location. This helps if streaming replication is delayed, because you then have a backup of the WAL files.
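
As an illustration of the pg_receivewal suggestion, here is a minimal sketch that only assembles a command line; it does not start the tool. The host, user, directory and slot names are placeholders invented for this example, and the helper name pg_receivewal_command is hypothetical. The options used (-h, -U, -D, -S) are the connection host, user name, target directory and replication slot options of pg_receivewal.

```python
import shlex
from typing import List, Optional


def pg_receivewal_command(host: str, user: str, directory: str,
                          slot: Optional[str] = None) -> List[str]:
    """Build an argument list for pg_receivewal (the command is not executed)."""
    argv = ["pg_receivewal", "-h", host, "-U", user, "-D", directory]
    if slot is not None:
        # With a replication slot, the primary retains WAL that this
        # receiver has not fetched yet.
        argv += ["-S", slot]
    return argv


if __name__ == "__main__":
    # All values below are placeholders, not taken from the thread.
    argv = pg_receivewal_command(
        host="primary.example.com",
        user="replicator",
        directory="/var/lib/wal_archive",
        slot="wal_archive_slot",
    )
    # Print a shell-quoted form that could be reviewed before running it.
    print(shlex.join(argv))
```

The resulting list could be passed to a process supervisor or to subprocess.run once the placeholders are replaced with real values.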

Regards,
Muhammad Imtiaz
