Re: Replication failure, slave requesting old segments

From: Stephen Frost
Subject: Re: Replication failure, slave requesting old segments
Msg-id: 20180813141154.GM3326@tamriel.snowman.net
In response to: Re: Replication failure, slave requesting old segments  (Adrian Klaver <adrian.klaver@aklaver.com>)
List: pgsql-general

Greetings,

* Adrian Klaver (adrian.klaver@aklaver.com) wrote:
> On 08/13/2018 05:08 AM, Phil Endecott wrote:
> >Adrian Klaver wrote:
> >Really?  I thought the intention was that the system should be
> >able to recover reliably when the slave reconnects after a
> >period of downtime, subject only to there being sufficient
> >network/CPU/disk bandwidth etc. for it to eventually catch up.

That's correct.

> See also my reply to Stephen earlier. Basically you are trying to coordinate
> two different operations. They start from the same source pg_xlog(pg_wal
> 10+) but arrive on a different time scale and from different locations.
> Without sufficient sanity checks it is possible they diverge enough on one
> or both paths to render the process unstable.

This isn't what's happening.  We're not talking about a timeline change
here, or about a replica being promoted to be a primary in general.
Nothing is diverging: it's the same consistent WAL stream, just coming
from two different sources, which PG is specifically designed to handle
and should be handling seamlessly.
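
To illustrate the "two different sources" point, here is a minimal sketch of a standby configured to use both the archive and streaming. The host name, replication user and archive path are made up for the example and are not taken from this thread; the file is recovery.conf, as used by the releases current at the time of writing.

```
# recovery.conf on the standby (illustrative values only)
standby_mode = 'on'

# Source 1: streaming replication from the primary.
# Host and user below are hypothetical.
primary_conninfo = 'host=primary.example.com port=5432 user=replicator'

# Source 2: the WAL archive. The path is hypothetical;
# %f is the segment file name, %p the path to copy it to.
restore_command = 'cp /mnt/wal_archive/%f "%p"'
```

With both settings present, the standby can restore segments from the archive and stream from the primary, and both deliver the same WAL.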

> I would say that:
>
> "If you set up a WAL archive that's accessible from the standby, these
> solutions are not required, since the standby can always use the archive to
> catch up provided it retains enough segments."
>
> should be more like:
>
> "If you set up a WAL archive that's accessible from the standby, these
> solutions are not required, since the standby can always use the archive to
> catch up provided it retains enough segments. *This is dependent on
> verification that the archiving is working properly. A belt and suspenders
> approach would be to set wal_keep_segments to a value > 0 in the event
> archiving is not properly functioning*"
> "

I don't think I could disagree more with this additional wording, and I
*really* don't think we should be encouraging people to set a high
wal_keep_segments.  The specific case here looks like it just needs to be
set to exactly '1', to ensure that the primary hasn't removed the last
WAL file that it archived.
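
As a sketch of what that would look like (assuming nothing else about the primary's configuration in this thread):

```
# postgresql.conf on the primary
# Keep one already-archived segment around rather than a large backlog.
wal_keep_segments = 1
```

Note that wal_keep_segments only exists up to PostgreSQL 12; from 13 on it was replaced by wal_keep_size, which is given as a size rather than a segment count.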

Thanks!

Stephen
