Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
Дата
Msg-id CAB7nPqT_AUQS=41n6yy_=w2Txt=LBvq+hZr+2jD4=oevHe7e6Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby  (Michael Paquier <michael.paquier@gmail.com>)
Ответы Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
On Thu, Feb 4, 2016 at 6:38 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Feb 4, 2016 at 4:10 PM, Andres Freund <andres@anarazel.de> wrote:
>> On 2016-02-04 18:21:41 +0530, Amit Kapila wrote:
>>> I think generally it is good idea, but one thing worth a thought is that
>>> by doing so, we need to acquire all WAL Insertion locks every
>>> LOG_SNAPSHOT_INTERVAL_MS to check the last_insert_pos for
>>> every slot, do you think it is matter of concern in any way for write
>>> workloads or it won't effect as we need to do this periodically?
>>
>> Michael and I just had an in-person discussion, and one of the topics
>> was that. The plan was basically to adapt the patch to:
>> 1) Store the progress lsn inside the wal insert lock
>> 2) Change the HasActivity API to return an the last LSN at which there
>>    was activity, instead of a boolean.
>> 3) Individually acquire each insert locks's lwlock to get it's progress
>>    LSN, but not the exclusive insert lock. We need the lwllock to avoid
>>    a torn 8byte read on some platforms.
>
> 4) Switch the condition to decide if a checkpoint should be skipped
> using the last activity position compared with ProcLastRecPtr in
> CreateCheckpoint to see if any activity has occurred since the
> checkpoint record was inserted, and do not care anymore if the
> previous record and current record are on different segments. This
> would basically work.

OK, attached is a patch that I believe addresses those issues. The
patch still has a couple of LOG entries that we had better remove in
the version that gets pushed, but they have proved to be useful for me
when testing the patch with a low checkpoint_timeout value to see if
checkpoints are properly skipped on an idle system. I found myself
adding another routine called GetLastCheckpointRecPtr() for the
bgwriter because ControlFile is not declared externally even if it is
in shared memory. I think that's better this way.

The original bug report referred to a low archive_timeout causing
standby snapshots to be logged when segments are switched. So we may
want at the end to not update the progress LSN for segment switch
records as well, but I have let that out of the patch for the time
being to address the primary concern of unnecessary checkpoints for
wal_level >= hs. We could address it with a later patch (planning to
do so), let's keep a step-by-step approach.

The patch attached will apply on master, on 9.5 there is one minor
conflict. For older versions we will need another reworked patch. I am
fine to produce those once we are fine with the shape of what gets
into master and 9.5. 9.4 uses WAL insert locks so things are rather
similar. For ~9.3, I think that we are going to need a single variable
in XLogCtl or similar to track the progress, but I have not looked
into that in details yet.
--
Michael

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: silent data loss with ext4 / all current versions
Следующее
От: Brendan Jurd
Дата:
Сообщение: Re: proposal: make NOTIFY list de-duplication optional