Thread: Streaming rep - why log shipping is necessary?

Streaming rep - why log shipping is necessary?

From
marcin mank
Date:
Hello,
I was reading the SR docs, and have the following question:
Is there a fundamental reason why archive_command etc. is required in
streaming replication mode?

Can't setting up the standby be more like:
pg_start_streaming_backup() on the master (this will be queuing up
files in pg_xlog)
copy the data dir
set up the slave to connect to the master via streaming protocol
set up the master to allow connections from the slave
start slave (slave pulls the necessary WAL records from the master via
streaming, and signals the master that it's done backing up)
When standby starts accepting connections, we know that the standby is OK.

archive_command, restore_command, etc. would be configured empty in this mode.

The failure mode for this is the pg_xlog directory filling up on the
master before the backup is done. But then, we can tell people to use
the more cumbersome, current setup.


Greetings
Marcin Mańk


Re: Streaming rep - why log shipping is necessary?

From
Heikki Linnakangas
Date:
marcin mank wrote:
> I was reading the SR docs, and have the following question:
> Is there a fundamental reason why archive_command etc. is required in
> streaming replication mode?
> 
> Can't setting up the standby be more like:
> pg_start_streaming_backup() on the master (this will be queuing up
> files in pg_xlog)
> copy the data dir
> set up the slave to connect to the master via streaming protocol
> set up the master to allow connections from the slave
> start slave (slave pulls the necessary WAL records from the master via
> streaming, and signals the master that it's done backing up)
> When standby starts accepting connections, we know that the standby is OK.
> 
> archive_command, restore_command, etc. would be configured empty in this mode.
> 
> The failure mode for this is the pg_xlog directory filling up on the
> master before the backup is done. But then, we can tell people to use
> the more cumbersome, current setup.

The first problem is that there is no pg_start_streaming_backup()
command. But it's not only an issue during backup: the standby needs to
fall back to the archive if it falls so far behind that the WAL files it
needs have already been recycled on the master.

There was the idea of adding a 'replication_lag_segments' setting, so
that the master always keeps n megabytes of WAL available for the
standby servers. See
http://archives.postgresql.org/pgsql-hackers/2010-01/msg02073.php. Not
sure what happened to it, Fujii was working on it but I guess he got
busy with other things.

If you're adventurous enough, it's actually possible to set an
archive_command that checks the status of the standby and returns
failure as long as the standby still needs the given WAL segment. That
way the primary doesn't recycle segments that are still needed by the
standby, and you can get away without restore_command in the standby.
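
A minimal sketch of the decision such an archive_command would have to
make, under several assumptions: WAL segments are the default 16 MB, the
standby's position is a location string like "0/3000020" (for example as
reported by pg_last_xlog_replay_location() on the standby), and timelines
are ignored. How the location is fetched from the standby is left out.
The function names are hypothetical, not part of PostgreSQL.

```python
import re

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB segments


def lsn_to_segment(lsn):
    """Map a location such as '0/3000020' to (log_id, segment_number)."""
    log_hex, offset_hex = lsn.split("/")
    return int(log_hex, 16), int(offset_hex, 16) // WAL_SEGMENT_SIZE


def wal_name_to_segment(name):
    """Map a 24-hex-digit WAL file name to (log_id, segment_number).

    Returns None for anything else (e.g. .history or .backup files).
    The leading 8 digits (the timeline) are ignored in this sketch.
    """
    if not re.fullmatch(r"[0-9A-Fa-f]{24}", name):
        return None
    return int(name[8:16], 16), int(name[16:24], 16)


def standby_still_needs(wal_file_name, standby_lsn):
    """True if the standby has not yet moved past the given segment."""
    segment = wal_name_to_segment(wal_file_name)
    if segment is None:
        return False
    return segment >= lsn_to_segment(standby_lsn)


def archive_exit_code(wal_file_name, standby_lsn):
    """Exit status for the archive_command: non-zero means 'keep the file'."""
    return 1 if standby_still_needs(wal_file_name, standby_lsn) else 0
```

A wrapper script would call archive_exit_code() with the %f value passed
by archive_command and exit with the result; a non-zero status makes the
primary retry later instead of recycling the segment.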

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: Streaming rep - why log shipping is necessary?

From
marcin mank
Date:
On Thu, Feb 25, 2010 at 10:08 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> the standby needs to
> fall back to the archive if it falls behind so that the WAL files it
> needs have already been recycled in the master.

Oh, so the master does not have to keep track of the state of the
standbys. That's a nice design.

> If you're adventurous enough, it's actually possible to set an
> archive_command that checks the status of the standby and returns
> failure as long as the standby still needs the given WAL segment. That
> way the primary doesn't recycle segments that are still needed by the
> standby, and you can get away without restore_command in the standby.

That would be a nice addition to pg_standby, like
pg_standby --check-streaming-standby postgres:qwerty@10.0.0.1
--check-streaming-standby postgres:qwerty@10.0.0.2:5433

Greetings
Marcin Mańk


Re: Streaming rep - why log shipping is necessary?

From
Josh Berkus
Date:
>> If you're adventurous enough, it's actually possible to set an
>> archive_command that checks the status of the standby and returns
>> failure as long as the standby still needs the given WAL segment. That
>> way the primary doesn't recycle segments that are still needed by the
>> standby, and you can get away without restore_command in the standby.

I'd prefer something a little different ... is there any way to tell
which log segments a standby still needs, *from* the standby?

Given performance considerations, I'd prefer to set up HS/SR with log
shipping because I don't want any slaves asking the master for a really
old log and interfering with its write performance.  However, that
leaves the issue of "How do I decide when I can delete archived log
segments off the slave because the slave is past them?"

Currently, I'm recommending some interval of time, but that's very brute
force and error-prone.  I'd prefer some elegant way to determine "log
segment contains no unapplied transactions."  Is there one?

--Josh Berkus


Re: Streaming rep - why log shipping is necessary?

From
Fujii Masao
Date:
On Fri, Feb 26, 2010 at 2:34 AM, Josh Berkus <josh@agliodbs.com> wrote:
>
>>> If you're adventurous enough, it's actually possible to set an
>>> archive_command that checks the status of the standby and returns
>>> failure as long as the standby still needs the given WAL segment. That
>>> way the primary doesn't recycle segments that are still needed by the
>>> standby, and you can get away without restore_command in the standby.
>
> I'd prefer something a little different ... is there any way to tell
> which log segments a standby still needs, *from* the standby?

pg_controldata can tell you that. The log segment containing the "Latest
checkpoint's REDO location" that pg_controldata reports is the oldest
one still required by the standby, so log segments older than that one
can be removed from the archive.
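
As a rough sketch of that rule, assuming the default 16 MB segment size
and ignoring timelines: parse the REDO location out of the text that
pg_controldata prints for the standby's data directory, then pick the
archived segments that sort before it. The helper names are hypothetical;
only the "Latest checkpoint's REDO location" label comes from
pg_controldata itself.

```python
import re

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB segments

REDO_LINE = re.compile(
    r"^Latest checkpoint's REDO location:\s*([0-9A-Fa-f]+)/([0-9A-Fa-f]+)\s*$",
    re.MULTILINE,
)
WAL_NAME = re.compile(r"[0-9A-Fa-f]{24}")


def oldest_required_segment(controldata_output):
    """Return (log_id, segment_number) holding the REDO location."""
    match = REDO_LINE.search(controldata_output)
    if match is None:
        raise ValueError("REDO location not found in pg_controldata output")
    log_id = int(match.group(1), 16)
    offset = int(match.group(2), 16)
    return log_id, offset // WAL_SEGMENT_SIZE


def removable_segments(archived_names, controldata_output):
    """List archived WAL segments older than the oldest required one.

    Non-segment files (.history, .backup) are never reported as removable.
    The timeline part of each file name is ignored in this sketch.
    """
    oldest = oldest_required_segment(controldata_output)
    removable = []
    for name in archived_names:
        if WAL_NAME.fullmatch(name) is None:
            continue
        segment = (int(name[8:16], 16), int(name[16:24], 16))
        if segment < oldest:
            removable.append(name)
    return sorted(removable)
```

With several standbys reading from one archive, the check would have to
be run against each of them and only the segments removable for all of
them deleted.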

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center