Re: Fast promotion failure

Поиск
Список
Период
Сортировка
От Simon Riggs
Тема Re: Fast promotion failure
Дата
Msg-id CA+U5nM+zJjyA7aKXYT4eMD5npdEfbiqx3nBaAFvmJKLEeytGrQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Fast promotion failure  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Ответы Re: Fast promotion failure
Список pgsql-hackers
On 21 May 2013 07:46, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> On 21.05.2013 00:00, Simon Riggs wrote:
>>
>> When we set the new timeline we should be updating all values that
>> might be used elsewhere. If we do that, then no matter when or how we
>> run GetXLogReplayRecPtr, it can't ever get it wrong in any backend.
>>
>> --- a/src/backend/access/transam/xlog.c
>> +++ b/src/backend/access/transam/xlog.c
>> @@ -5838,8 +5838,16 @@ StartupXLOG(void)
>>         }
>>
>>         /* Save the selected TimeLineID in shared memory, too */
>> -       XLogCtl->ThisTimeLineID = ThisTimeLineID;
>> -       XLogCtl->PrevTimeLineID = PrevTimeLineID;
>> +       {
>> +               /* use volatile pointer to prevent code rearrangement */
>> +               volatile XLogCtlData *xlogctl = XLogCtl;
>> +
>> +               SpinLockAcquire(&xlogctl->info_lck);
>> +               XLogCtl->ThisTimeLineID = ThisTimeLineID;
>> +               XLogCtl->lastReplayedTLI = ThisTimeLineID;
>> +               XLogCtl->PrevTimeLineID = PrevTimeLineID;
>> +               SpinLockRelease(&xlogctl->info_lck);
>> +       }
>
>
> Hmm. lastReplayedTLI is supposed to be the timeline of the last record that
> was replayed, ie. the timeline corresponding lastReplayedEndRecPtr. I think
> it would work, but it feels like it could be an easy source of bugs in the
> future.

I'm OK with that principle, as long as we don't touch ThisTimeLineID,
which has been the source of multiple bugs.

So we should set the TLI explicitly before installing, like attached patch.

Otherwise we'd need multiple permanent TLIs which would be overkill.

I feel there are problems because we set the newly selected TLI from
startup process into shared memory, then some time later we set
SharedRecoveryInProgress = false. That timing window isn't good, but I
don't see a different way.

> It might be a good idea to change walsender to not store that in
> ThisTimeLineID, though. It used to make sense for ThisTimeLineID to track
> the current recovery timeline in 9.2 and below, but now that walsender can
> be sending any older timeline, using ThisTimeLineID to store the latest one
> seems confusing.

Agreed, but looks like too much code to touch that lightly.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Move unused buffers to freelist
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: Move unused buffers to freelist