Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
Дата
Msg-id CAB7nPqQKEK-dynCwenXi9MxwLYF6s7aHv8FGu52aWLZC=mjfPA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers


On Wed, Feb 10, 2016 at 4:11 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Feb 10, 2016 at 12:16 PM, Michael Paquier <michael.paquier@gmail.com> wrote:


On Wed, Feb 10, 2016 at 12:41 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Feb 10, 2016 at 7:17 AM, Michael Paquier <michael.paquier@gmail.com>
> wrote:
>>
>
> Consider below code:
>
> /*
> + * Get progress before acquiring insert locks to shorten the locked
> + * section waiting ahead.
> + */
> + progress_lsn = GetProgressRecPtr();
> +
> + /*
>   * We must block concurrent insertions while examining insert state to
>   * determine the checkpoint REDO pointer.
>   */
>   WALInsertLockAcquireExclusive();
>
> In GetProgressRecPtr(), first it take WALInsertLocks one-by-one to
> retrieve the latest value of progressAt and then again it will take
> all the WALInsertLocks in WALInsertLockAcquireExclusive() and then
> do some work.  Now I think it is okay to retrieve the latest of progressAt
> after WALInsertLockAcquireExclusive() as you don't need to take the
> locks again.

Sure, that would be OK. Now I am not on board if this means to have the checkpoint take an exclusive locks for a bit longer duration if that's not actually necessary.


Taking and releasing locks again and again (which is done in patch)
matters much more than adding few instructions under it and I think
if you would have written the code in-a-way as in patch in some of
the critical path, it would have been regressed badly, but because
checkpoint doesn't happen often, reproducing regression is difficult.
Anyway, there is no-point in too much argument, I think you have
understood what I wanted to say and if you think the way code is
currently written is better, then lets leave as it is for somebody else
to give suggestion on it.

Okay.
 

 

>> > - last_snapshot_lsn != GetXLogInsertRecPtr())
>> > +
>> > GetLastCheckpointRecPtr() < GetProgressRecPtr())
>> >
>> > How the above check in patch suppose to work?
>> > I think it could so happen once bgwriter logs snapshot, both checkpoint
>> > and progressAt won't get updated any further and I think this check will
>> > allow to log snapshots in such a case as well.
>>
>> The only purpose of this check is to do the following thing: if no WAL
>> activity has happened since last checkpoint, there is no need to log a
>> standby snapshot from the perspective of the bgwriter. In case the
>> system is idle, we want to skip logging that and avoid unnecessary
>> checkpoints because those records would have been generated. If the
>> system is not idle, or to put it in other words there has been at
>> least one record since the last checkpoint, we would log a standby
>> snapshot, enforcing as well a checkpoint to happen the next time
>> CreateCheckpoint() is gone through, and a standby snapshot is logged
>> as well after the checkpoint contents are flushed. I am not sure I
>> understand what you are getting at...
>
> Let me try to say again, suppose ControlFile->checkPoint is at
> 100 and latest value of progressAt returned by GetProgressRecPtr
> is 105, so the first time the above check happens, it will allow
> to log standby snapshot which is okay, now assume there is no
> activity, neither there is any checkpoint and again after
> LOG_SNAPSHOT_INTERVAL_MS interval, when the above check
> gets executed, it will pass and log the standby snapshot which is
> *not* okay, because we don't want to log standby snapshot when
> there is no activity.  Am I missing something?

Ah, right. Sorry for not getting your point. OK now I got it. So basically what the code does not is that if there is just one record after a checkpoint we would log every 15s a standby snapshot until the next checkpoint even if nothing happens, but we can save a bunch of standby records. So your point is that we need to save the result of GetProgressRecPtr() at each loop in the bgwriter when a standby snapshot is taken, say in a static XLogRecPtr called last_progress_lsn. And then at the next loop we compare it on the current result of GetProgressRecPtr(), so the condition to check if a standby snapshot should be generated or not in the bgwriter becomes that:
(now >= timeout &&
 GetLastCheckpointRecPtr() < current_progress_ptr &&
 last_progress_ptr < current_progress_ptr)

Why do you think including checkpoint related check is
required, how about something like below:

(now >= timeout &&
 last_progress_ptr < current_progress_ptr)

We need as well to be sure that no standby snapshots are logged just after a checkpoint in this code path when last_progress_ptr is referring to an LSN position *before* the last checkpoint LSN. There is already one snapshot in CreateCheckpoint() that is logged, there is no need of an extra one, explaining why the check on progress since last checkpoint is necessary to me.
--
Michael

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
Следующее
От: Kyotaro HORIGUCHI
Дата:
Сообщение: Re: Freeze avoidance of very large table.