Re: Dealing with latency to replication slave; what to do?

From: Rory Falloon
Subject: Re: Dealing with latency to replication slave; what to do?
Date:
Msg-id: CANP_6+NRfinHdxixJz3YoxAgc6oTk8=OdCGce_m-EjF8=emH+Q@mail.gmail.com
In reply to: Re: Dealing with latency to replication slave; what to do?  (Andres Freund <andres@anarazel.de>)
Responses: Re: Dealing with latency to replication slave; what to do?
List: pgsql-general
Hi Andres,

Regarding your first reply: I inferred that from the fact that those messages appeared at the same time the replication stream fell behind. What other logs would be more pertinent to this situation?
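One way to see whether the standby is actually falling behind, independent of the server logs, is to query pg_stat_replication on the primary (a sketch using the pre-10 function and column names, e.g. 9.4-9.6; on PostgreSQL 10+ these are pg_current_wal_lsn(), replay_lsn, and pg_wal_lsn_diff()):

```sql
-- Run on the primary: compare the current WAL write position with
-- what each connected standby has replayed.
SELECT application_name,
       pg_current_xlog_location() AS primary_lsn,
       replay_location            AS standby_replayed,
       pg_xlog_location_diff(pg_current_xlog_location(),
                             replay_location) AS byte_lag
FROM pg_stat_replication;
```

A steadily growing byte_lag during periods of heavy link usage would confirm that the standby cannot keep up with the WAL stream.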



On Tue, Jul 24, 2018 at 4:02 PM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2018-07-24 15:39:32 -0400, Rory Falloon wrote:
> Looking for any tips here on how to best maintain a replication slave which
> is operating under some latency between networks - around 230ms. On a good
> day/week, replication will keep up for a number of days, but when the link
> is under higher than average usage, replication can stay active for merely
> minutes before falling behind again.
>
> 2018-07-24 18:46:14 GMTLOG:  database system is ready to accept read only
> connections
> 2018-07-24 18:46:15 GMTLOG:  started streaming WAL from primary at
> 2B/93000000 on timeline 1
> 2018-07-24 18:59:28 GMTLOG:  incomplete startup packet
> 2018-07-24 19:15:36 GMTLOG:  incomplete startup packet
> 2018-07-24 19:15:36 GMTLOG:  incomplete startup packet
> 2018-07-24 19:15:37 GMTLOG:  incomplete startup packet
>
> As you can see above, it lasted about half an hour before falling out of
> sync.

How can we see that from the above? The "incomplete startup packet"
messages are independent of streaming rep. I think you need to show us more logs.


> On the master, I have wal_keep_segments=128. What is happening when I see
> "incomplete startup packet" - is it simply that the slave has fallen behind
> and cannot 'catch up' using the WAL segments quickly enough? I assume the
> slave is using the WAL segments to replay changes, and, assuming there are
> enough WAL segments to cover the period when it cannot stream properly, it
> will eventually recover?
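As a rough sanity check on the retention window that setting buys (assuming the default 16 MB WAL segment size):

```python
# wal_keep_segments retains this many completed WAL segments on the primary.
wal_keep_segments = 128
segment_size_mb = 16  # PostgreSQL's default WAL segment size

retained_mb = wal_keep_segments * segment_size_mb
print(retained_mb)  # 2048 MB, i.e. ~2 GB of WAL history
```

Once the standby falls more than that far behind, the primary will have recycled the segments it needs and streaming cannot resume without a new base backup.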

You might want to look into replication slots to ensure the primary
keeps the necessary segments, but not more, around.  You might also want
to look at wal_compression, to reduce the bandwidth usage.
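For reference, a minimal sketch of the slot-based setup described above (the slot name standby1_slot is illustrative; wal_compression requires 9.5+, replication slots 9.4+):

```sql
-- On the primary: create a physical replication slot, so WAL is
-- retained exactly as long as this standby still needs it.
SELECT pg_create_physical_replication_slot('standby1_slot');

-- Primary postgresql.conf: compress full-page writes in WAL
-- to reduce bandwidth over the slow link.
wal_compression = on

-- Standby recovery.conf: have the walreceiver consume from the slot.
primary_slot_name = 'standby1_slot'
```

Note that with a slot, an unreachable standby will cause WAL to accumulate on the primary indefinitely, so disk usage there should be monitored.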

Greetings,

Andres Freund

In the pgsql-general list, by date sent:

Previous
From: Raphaël Berbain
Date:
Message: width_bucket issue
Next
From: Christophe Pettus
Date:
Message: Re: width_bucket issue