Thread: Setting up streaming replication on large database (20+ TB) for the first time

Setting up streaming replication on large database (20+ TB) for the first time

From:
"Ivan N. Ivanov"
Date:
I have a large database (~25 TB) and I want to set up streaming replication for the first time.

My problem is that after the pg_basebackup completed (it took 2 days with --wal-method=none), PG is now replaying the WAL files from the WAL archive directory but cannot keep up. WAL replay advances at the same rate as wall-clock time, for example:

2022-08-17 22:42:57 EEST [13507-6] [] DETAIL:  Last completed transaction was at log time 2022-08-15 18:24:02.155289+03.
2022-08-17 22:48:35 EEST [13507-12] [] DETAIL:  Last completed transaction was at log time 2022-08-15 18:29:54.962822+03.
2022-08-17 22:54:35 EEST [13507-16] [] DETAIL:  Last completed transaction was at log time 2022-08-15 18:34:20.099468+03.

From ~22:43 to ~22:48 is 5 minutes of wall-clock time, and the last completed transactions advance from ~18:24 to ~18:29, also 5 minutes, so the lag never shrinks.

I have even put all the WAL files from the archive directly into the pg_wal directory of the replica, so PostgreSQL skips the cp command from restore_command (i.e. I have removed restore_command entirely). Recovering the WAL files is now the only operation, but it is still slow:

postgres: startup   recovering 000000010003FC7900000013
postgres: startup   recovering 000000010003FC7900000014
postgres: startup   recovering 000000010003FC7900000015
...
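
For reference, the restore_command I removed was of this form (the archive path here is just an example):

    restore_command = 'cp /wal_archive/%f %p'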

And it cannot keep up, so streaming replication cannot start, since the replica is 2 days behind the master... The replica has the same SSD disks as the master.

Is there a better way to do this? How can I speed up the recovery of WAL files? I have increased shared_buffers as much as I can... Is there something I am missing about the recovery process?

I have no problem setting up replication for the first time on small databases (10 GB - 100 GB), but at 25 TB I cannot get replication going because of this lag.
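
For context, the standby configuration I am aiming for is the standard streaming setup on PG 12+; a sketch, with the connection values as placeholders:

    # on the replica, in postgresql.auto.conf (or postgresql.conf)
    primary_conninfo = 'host=primary.example.com port=5432 user=replicator'

    # plus an empty standby.signal file in the data directory:
    touch $PGDATA/standby.signal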

Re: Setting up streaming replication on large database (20+ TB) for the first time

From:
Christophe Pettus
Date:

> On Aug 17, 2022, at 13:06, Ivan N. Ivanov <ivan.ni@telebid-pro.com> wrote:
> 
> How can I speed up the recovery of WAL files?

Since you are running on your own hardware, you might take a look at:

    https://github.com/TritonDataCenter/pg_prefaulter



Re: [(catch-ext)] Re: Setting up streaming replication on large database (20+ TB) for the first time

From:
"Ivan N. Ivanov"
Date:
Thank you for your answer! I have found this tool and I will try it tomorrow to see if this "read-ahead" feature will speed up the process.

On Wed, Aug 17, 2022 at 11:09 PM Christophe Pettus <xof@thebuild.com> wrote:
> Since you are running on your own hardware, you might take a look at:
>
>         https://github.com/TritonDataCenter/pg_prefaulter

Re: Setting up streaming replication on large database (20+ TB) for the first time

From:
Ron
Date:
pgBackRest will certainly back up your data faster. It might be able to be
used as a seed instead of pg_basebackup.
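
A minimal sketch of that approach (stanza name, paths, and version are placeholders; this assumes pgBackRest is installed and WAL archiving is already configured):

    # /etc/pgbackrest/pgbackrest.conf on the primary
    [main]
    pg1-path=/var/lib/postgresql/14/main

    [global]
    repo1-path=/var/lib/pgbackrest

    # initialize the stanza, then take a full backup
    pgbackrest --stanza=main stanza-create
    pgbackrest --stanza=main --type=full backup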

On 8/17/22 15:06, Ivan N. Ivanov wrote:
> I have a large database (~25 TB) and I want to set up streaming replication for the first time.
> [...]

-- 
Angular momentum makes the world go 'round.



Re: Setting up streaming replication on large database (20+ TB) for the first time

From:
Vijaykumar Jain
Date:
I just did a backup and restore of a replica using pgbackrest.
DB size: 28 TB

NVMe/SSD storage
96 CPUs, 380 GB RAM

zstd compression, 24 workers for the backup, 12 workers for the restore

2.5 hours to back up,
2 hours to restore.
WAL replay is something I forgot to tune, but I could now use https://pgbackrest.org/configuration.html#section-archive/option-archive-get-queue-max to speed up archive pulls too.
Everything is on-prem (no cloud, FYI), running Gentoo.
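
A sketch of the pgbackrest.conf settings behind a run like that (repo, spool path, and queue size are placeholders; async archive-get is what makes the queue option effective):

    [global]
    repo1-path=/var/lib/pgbackrest
    compress-type=zst

    # prefetch WAL from the repository ahead of replay
    archive-async=y
    spool-path=/var/spool/pgbackrest
    archive-get-queue-max=4GiB

    [global:backup]
    process-max=24

    [global:restore]
    process-max=12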



On Thu, Aug 18, 2022, 11:23 AM Ron <ronljohnsonjr@gmail.com> wrote:
> pgBackRest will certainly back up your data faster. It might be able to be
> used as a seed instead of pg_basebackup.
> [...]


Re: [(catch-ext)] Re: Setting up streaming replication on large database (20+ TB) for the first time

From:
"Ivan N. Ivanov"
Date:
Thank you, people. The big problem in my case, which I have not mentioned, is that the network is the bottleneck: I am running pg_basebackup over the internet, from my local country to an Amazon instance in Germany, and the copy speed tops out around 50 MB/sec, which is why the copy takes 2 days.

I will try using high compression for the basebackup to reduce the time.
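
Something like this is what I have in mind; a sketch, with host, user, and target directory as placeholders (on my version, client-side compression requires the tar format):

    pg_basebackup -h primary.example.com -U replicator \
        -D /var/lib/postgresql/standby \
        -Ft -Z 9 --wal-method=none --progress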

pgbackrest is an alternative, too.

Thank you again!



On Wed, Aug 17, 2022 at 11:13 PM Ivan N. Ivanov <ivan.ni@telebid-pro.com> wrote:
> Thank you for your answer! I have found this tool and I will try it tomorrow to see if this "read-ahead" feature will speed up the process.
> [...]

Re: [(catch-ext)] Re: Setting up streaming replication on large database (20+ TB) for the first time

From:
Vijaykumar Jain
Date:
Sorry for top-posting; I'm on my phone.

pgbackrest helped with exactly that. With compression and parallel processes, both the backup and the restore were quick. I took the backup and immediately did the restore, so there were fewer WALs to replay; otherwise, WAL replay is indeed slow.
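
A sketch of that back-to-back sequence (the stanza name is a placeholder):

    # take a fresh full backup
    pgbackrest --stanza=main --type=full backup

    # on the new replica, immediately afterwards, restore it
    # as a standby (this writes standby.signal for you)
    pgbackrest --stanza=main --type=standby restore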

On Thu, Aug 18, 2022, 1:03 PM Ivan N. Ivanov <ivan.ni@telebid-pro.com> wrote:
> Thank you, people. The big problem in my case, which I have not mentioned, is that the network is the bottleneck [...]

Re: [(catch-ext)] Re: Setting up streaming replication on large database (20+ TB) for the first time

From:
"Ivan N. Ivanov"
Date:
Yes, pgbackrest seems the best solution for faster backup and restore.

We can close the discussion here for now; replaying millions of WAL files is just slow, and this is normal.

On Thu, Aug 18, 2022 at 12:10 PM Vijaykumar Jain <vijaykumarjain.github@gmail.com> wrote:
> pgbackrest helped with exactly that. With compression and parallel processes, both the backup and the restore were quick.
> [...]