Re: [GENERAL] Rsync to a recovering streaming replica?

From: Scott Mead
Subject: Re: [GENERAL] Rsync to a recovering streaming replica?
Msg-id: CAKq0gvK+FT2vMxyG370gYoFmSXf_J4i7k7KJyaKZJfgM8QwQ-w@mail.gmail.com
In reply to: Re: [GENERAL] Rsync to a recovering streaming replica?  (Igor Polishchuk <ora4dba@gmail.com>)
Responses: Re: [GENERAL] Rsync to a recovering streaming replica?  (Igor Polishchuk <ora4dba@gmail.com>)
List: pgsql-general


On Wed, Sep 27, 2017 at 1:59 PM, Igor Polishchuk <ora4dba@gmail.com> wrote:
Sorry, here are the missing details, if it helps:
Postgres 9.6.5 on CentOS 7.2.1511

> On Sep 27, 2017, at 10:56, Igor Polishchuk <ora4dba@gmail.com> wrote:
>
> Hello,
> I have a multi-terabyte streaming replica on a busy database. When I set it up, repeated rsyncs take at least 6 hours each.
> So, when I start the replica, it begins streaming, but it is many hours behind right from the start. It works for hours and cannot reach a consistent state,
> so the database does not open for queries. I have plenty of WAL files available in the master’s pg_xlog, so the replica never uses archived logs.
> A question:
> Should I be able to run one more rsync from the master to my replica while it is streaming?
> The idea is to overcome the throughput limit imposed by a single recovery process on the replica and allow it to catch up more quickly.
> I remember doing it many years ago on Pg 8.4, and have also heard of other people doing it. In all cases, it seemed to work.
> I’m just not sure whether there is a high risk of introducing some hidden data corruption, which I may not notice for a while on such a huge database.
> Any educated opinions on the subject here?

It really comes down to the amount of I/O (network and disk) your system can handle while under load.  I've used two methods to do this in the past:


- parsync (parallel rsync)
  parsync is nice: it does all the hard work of parallelizing rsync for you.  It's just a pain to get all the prerequisites installed.


- rsync --itemize-changes
  Essentially, use this to get a list of files, split the list manually, and fire up a number of rsyncs.  parsync does this for you, but if you can't get it going for any reason, this works.


The real trick: after you do your parallel rsync, make sure that you run one final rsync to sync up any missed items.
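
As a rough illustration of the manual approach, here is a minimal Python sketch. It assumes you have already captured the output of `rsync -a --dry-run --itemize-changes` as text. The function names, the `rsync-chunk-N.txt` list-file names, and the round-robin split are assumptions made for the example, not something parsync or rsync prescribes. It only builds the command lines, one per worker using `--files-from`, plus the single final pass; it does not run anything.

```python
from typing import List, Tuple


def parse_itemized(output: str) -> List[str]:
    """Extract relative paths of regular files from --itemize-changes output."""
    paths = []
    for line in output.splitlines():
        # An itemized line is a change code, one space, then the path.
        code, _, path = line.partition(" ")
        # '<' or '>' marks a transfer; 'f' in second position marks a regular file.
        # Directories, symlinks and "*deleting" lines are skipped.
        if len(code) >= 2 and code[0] in "<>" and code[1] == "f" and path:
            paths.append(path)
    return paths


def build_commands(
    paths: List[str], src: str, dest: str, workers: int,
    list_prefix: str = "rsync-chunk",  # hypothetical list-file naming
) -> List[Tuple[str, List[str], List[str]]]:
    """Split paths round-robin and return (list_file, chunk, rsync argv) per worker."""
    jobs = []
    for i in range(workers):
        chunk = paths[i::workers]
        if not chunk:
            continue
        list_file = f"{list_prefix}-{i}.txt"
        # --files-from reads paths relative to the source directory.
        cmd = ["rsync", "-a", f"--files-from={list_file}", src, dest]
        jobs.append((list_file, chunk, cmd))
    return jobs


def final_pass(src: str, dest: str) -> List[str]:
    """One plain rsync over the whole tree to pick up anything the workers missed."""
    return ["rsync", "-a", src, dest]
```

To use it, write each chunk to its list file (one path per line), start each command with `subprocess.Popen`, wait for all of them to finish, and then run the `final_pass` command once. Round-robin splitting balances by file count only, so the chunks may be uneven in bytes.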

Remember, it's all about I/O.  The more parallel threads you use, the harder you'll beat up the disks / network on the master, which could impact production.

Good luck

--Scott





 
>
> Thank you
> Igor Polishchuk






--
Scott Mead
Sr. Architect
OpenSCG
