Re: Replication server timeout patch

From Robert Haas
Subject Re: Replication server timeout patch
Date
Msg-id AANLkTi=6WrTTM3yDfJneT7nKr8RDuS4wJvu0+kO7JSrk@mail.gmail.com
In reply to Re: Replication server timeout patch  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Replication server timeout patch  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, Feb 11, 2011 at 4:30 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> On 11.02.2011 22:11, Robert Haas wrote:
>>
>> On Fri, Feb 11, 2011 at 2:02 PM, Daniel Farina<drfarina@acm.org>  wrote:
>>>
>>> I split this out of the synchronous replication patch for independent
>>> review. I'm dashing out the door, so I haven't put it on the CF yet or
>>> anything, but I just wanted to get it out there...I'll be around in
>>> Not Too Long to finish any other details.
>>
>> This looks like a useful and separately committable change.
>
> Hmm, so this patch implements a watchdog, where the master disconnects the
> standby if the heartbeat from the standby stops for more than
> 'replication_[server]_timeout' seconds. The standby sends the heartbeat
> every wal_receiver_status_interval seconds.
>
> It would be nice if the master and standby could negotiate those settings.
> As the patch stands, it's easy to have a pathological configuration where
> replication_server_timeout < wal_receiver_status_interval, so that the
> master repeatedly disconnects the standby because it doesn't reply in time.
> Maybe the standby should report how often it's going to send a heartbeat,
> and master should wait for that long + some safety margin. Or maybe the
> master should tell the standby how often it should send the heartbeat?

I guess the biggest use case for that behavior would be in a case
where you have two standbys, one of which doesn't send a heartbeat and
the other of which does.  Then you really can't rely on a single
timeout.
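
To make the failure mode concrete, here is a minimal sketch (not the patch's code) of when a single master-side timeout would cut off a perfectly healthy standby. The function name is hypothetical, and treating a value of 0 as "disabled" for both settings is an assumption made for illustration.

```python
def would_disconnect_healthy_standby(server_timeout_s, status_interval_s):
    """Return True if the master's watchdog would fire on a healthy standby.

    server_timeout_s  -- master-side timeout (replication_server_timeout in the patch)
    status_interval_s -- standby-side heartbeat period (wal_receiver_status_interval)
    Assumption: 0 means "disabled" for either setting.
    """
    if server_timeout_s <= 0:
        return False  # watchdog off: nobody is ever disconnected
    if status_interval_s <= 0:
        return True   # standby never sends a heartbeat, so the timeout always expires
    # The pathological configuration from the quoted message:
    # the timeout is shorter than the gap between heartbeats.
    return server_timeout_s < status_interval_s


# Two standbys behind one master-wide timeout of 30 seconds:
print(would_disconnect_healthy_standby(30, 10))  # heartbeat every 10s
print(would_disconnect_healthy_standby(30, 0))   # no heartbeat at all
```

With one global timeout, the second standby can only stay connected if the watchdog is turned off for everyone.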

Maybe we could change the server parameter to indicate what multiple
of wal_receiver_status_interval causes a hangup, and then change the
client to notify the server what value it's using.  But that gets
complicated, because the value could be changed while the standby is
running.
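
A rough sketch of that idea, under stated assumptions: the class and method names are hypothetical, the standby is assumed to report its current interval with every heartbeat, and times are plain seconds supplied by the caller. It is only meant to show where the complication comes from, not how the server would implement it.

```python
class StandbyWatchdog:
    """Per-standby watchdog: hang up after `multiplier` missed heartbeat intervals."""

    def __init__(self, multiplier):
        self.multiplier = multiplier   # server-side setting (hypothetical)
        self.interval_s = None         # last interval reported by the standby
        self.last_heartbeat = None     # time of the last heartbeat, in seconds

    def on_heartbeat(self, now, reported_interval_s):
        # The master only learns about a changed interval when a heartbeat arrives.
        self.interval_s = reported_interval_s
        self.last_heartbeat = now

    def should_hang_up(self, now):
        if self.interval_s is None or self.interval_s <= 0:
            return False  # standby sends no heartbeats: never time it out
        return now - self.last_heartbeat > self.multiplier * self.interval_s


wd = StandbyWatchdog(multiplier=3)
wd.on_heartbeat(now=0, reported_interval_s=10)
print(wd.should_hang_up(now=25))  # within 3 * 10 seconds
print(wd.should_hang_up(now=59))  # past the 30-second window
```

The second check is the awkward case: if the standby's interval is raised from 10 to 60 seconds right after the heartbeat at time 0, its next heartbeat is not due until time 60, but the master is still working from the old 30-second window and would hang up on a healthy standby.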

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

