Re: Sync Rep for 2011CF1

From	Robert Haas
Subject	Re: Sync Rep for 2011CF1
Date	16 February 2011, 16:29:52
Msg-id	AANLkTikyW6GX3Mh2qTN=SfoQ=N10oWS3FHcKPp9OKCNa@mail.gmail.com
In reply to	Re: Sync Rep for 2011CF1 (Simon Riggs <simon@2ndQuadrant.com>)
Responses	Re: Sync Rep for 2011CF1 (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List	pgsql-hackers

On Wed, Feb 16, 2011 at 11:32 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Wed, 2011-02-16 at 17:40 +0200, Heikki Linnakangas wrote:
>> On 16.02.2011 17:36, Simon Riggs wrote:
>> > On Tue, 2011-02-15 at 12:08 -0500, Robert Haas wrote:
>> >> On Mon, Feb 14, 2011 at 12:25 AM, Fujii Masao<masao.fujii@gmail.com>  wrote:
>> >>> On Fri, Feb 11, 2011 at 4:06 AM, Heikki Linnakangas
>> >>> <heikki.linnakangas@enterprisedb.com>  wrote:
>> >>>> I added a XLogWalRcvSendReply() call into XLogWalRcvFlush() so that it also
>> >>>> sends a status update every time the WAL is flushed. If the walreceiver is
>> >>>> busy receiving and flushing, that would happen once per WAL segment, which
>> >>>> seems sensible.
>> >>>
>> >>> This change can make the callback function "WalRcvDie()" call ereport(ERROR)
>> >>> via XLogWalRcvFlush(). This looks unsafe.
>> >>
>> >> Good catch.  Is the cleanest solution to pass a boolean parameter to
>> >> XLogWalRcvFlush() indicating whether we're in the midst of dying?
>> >
>> > Surely if you do this then sync rep will fail to respond correctly if
>> > WalReceiver dies.
>> >
>> > Why is it OK to write to disk, but not OK to reply?
>>
>> Because the connection might be dead. A broken connection is a likely
>> cause of walreceiver death.
>
> Would it not be possible to check that?

I'm not actually sure that it matters that much whether we do or not.
ISTM that the WAL receiver is normally going to exit the main loop (in
WalReceiverMain) right here:
       /* Process any requests or signals received recently */
       ProcessWalRcvInterrupts();

But to get to that point, we either have to be making our first pass
through the loop (in which case nothing interesting has happened yet)
or we have to have just completed an iteration through the loop (in
which case we just sent a reply).  I think that the only thing that
can have changed since the last reply is the replay position, which
this version of the sync rep patch doesn't care about anyway.  Even if
it did, I'm not sure it'd be worth complicating the die path to
squeeze in one final reply.
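
To make the argument above concrete, here is a toy model of the control
flow, not the actual walreceiver code. Every name in it (ToyReceiver,
toy_flush, toy_run, the `dying` flag) is a hypothetical stand-in, and it
assumes the loop really is shaped the way described: the interrupt check
sits at the top, and each completed iteration ends with a flush followed
by a reply.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state: all names here are hypothetical, not PostgreSQL's. */
typedef struct ToyReceiver {
    long received;      /* bytes received from the primary */
    long flushed;       /* bytes flushed to disk */
    long last_replied;  /* flush position carried by the last reply */
    int  replies_sent;  /* number of replies sent so far */
} ToyReceiver;

/* Flush whatever has been received; send a reply unless we are dying. */
static void toy_flush(ToyReceiver *r, bool dying)
{
    if (r->flushed < r->received) {
        r->flushed = r->received;
        if (!dying) {
            r->last_replied = r->flushed;
            r->replies_sent++;
        }
    }
}

/*
 * Run the loop. The shutdown request is noticed at the top of iteration
 * `stop_at` (0 means on the very first pass). Returns the final state.
 */
static ToyReceiver toy_run(int stop_at)
{
    ToyReceiver r = {0, 0, 0, 0};

    for (int i = 0;; i++) {
        /* Stand-in for the interrupt check at the top of the loop. */
        if (i == stop_at)
            break;

        r.received += 100;      /* receive some WAL */
        toy_flush(&r, false);   /* flush it, then send a reply */
    }

    /* Die path: flush without replying. Under the assumptions above
     * there is nothing left to flush, so no reply has been skipped. */
    toy_flush(&r, true);
    assert(r.last_replied == r.flushed);
    return r;
}
```

In this model the die-path flush never finds unflushed data, whether the
loop exits on its first pass or after several iterations, so skipping the
reply there loses nothing. Whether the real loop upholds the same
invariant is exactly the question being asked.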

Actually, on further reflection, I'm not even sure why we bother with
the fsync.  It seems like a useful safeguard but I'm not seeing how we
can get to that point without having fsync'd everything anyway.  Am I
missing something?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
