Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

From	Amit Kapila
Subject	Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
Date	February 10, 2016, 06:41:44
Msg-id	CAA4eK1J+heuBJ-QpaXzhzW7OcZoxvdePHAjP_EA_KA7cg7CmLA@mail.gmail.com
In reply to	Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby (Michael Paquier <michael.paquier@gmail.com>)
Responses	Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
List	pgsql-hackers

On Wed, Feb 10, 2016 at 7:17 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
>
> On Tue, Feb 9, 2016 at 10:42 PM, Amit Kapila wrote:
> > On Tue, Feb 9, 2016 at 6:08 PM, Michael Paquier wrote:
> >> Well, the idea is to improve the system responsiveness. Imagine that
> >> the call to GetProgressRecPtr() is done within the exclusive lock
> >> portion, but that while scanning the WAL insert lock entries another
> >> backend is waiting to take a lock on them, and this backend is trying
> >> to insert the first record that updates the progress LSN since the
> >> last checkpoint. In this case the checkpoint would be skipped.
> >> However, imagine that such a record is able to get through it while
> >> scanning the progress values in the WAL insert locks, in which case a
> >> checkpoint would be generated.
> >
> > Such case was not covered without your patch and I don't see the
> > need of same especially at the cost of taking locks.
>
> In this code path that's quite cheap, and it clearly improves the
> decision-making when wal_level >= hs which is now rather broken (say
> not optimized much).
>
> >> In any case, I think that we should try
> >> to get exclusive lock areas as small as possible if we have ways to do
> >> so.
> >>
> >
> > Sure, but not when you are already going to take lock for longer
> > duration.
>
> Why would an exclusive lock be taken for a longer duration in the
> checkpoint portion?

Consider the code below:

+ * Get progress before acquiring insert locks to shorten the locked
+ * section waiting ahead.
+ */
+ progress_lsn = GetProgressRecPtr();
+ /*
  * We must block concurrent insertions while examining insert state to
  * determine the checkpoint REDO pointer.
  WALInsertLockAcquireExclusive();

GetProgressRecPtr() first takes the WALInsertLocks one by one to
retrieve the latest value of progressAt, and then
WALInsertLockAcquireExclusive() takes all the WALInsertLocks again
before doing some work. I think it is okay to retrieve the latest
progressAt after WALInsertLockAcquireExclusive(), since then you don't
need to take the locks a second time.
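
To make the locking cost concrete, here is a minimal standalone sketch.
It is not the PostgreSQL source: the lock array, the lock count of 8,
the acquisition counter, and every helper name are assumptions made for
illustration. Only the idea of reading progressAt either before or
after taking all insert locks comes from the patch under discussion.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_INSERT_LOCKS 8          /* assumed value, for illustration only */

typedef uint64_t LsnSketch;         /* stand-in for XLogRecPtr */

typedef struct
{
    int       held;                 /* 1 while the lock is held */
    LsnSketch progress_at;          /* progress LSN guarded by this lock */
} InsertLockSketch;

static InsertLockSketch locks[NUM_INSERT_LOCKS];
static int acquisitions;            /* lock acquisitions counted so far */

static void lock_one(int i)
{
    assert(!locks[i].held);
    locks[i].held = 1;
    acquisitions++;
}

static void unlock_one(int i)
{
    assert(locks[i].held);
    locks[i].held = 0;
}

static void acquire_all(void)
{
    for (int i = 0; i < NUM_INSERT_LOCKS; i++)
        lock_one(i);
}

static void release_all(void)
{
    for (int i = 0; i < NUM_INSERT_LOCKS; i++)
        unlock_one(i);
}

/* Hypothetical helper: pretend an insertion recorded progress at lsn. */
void set_progress(int i, LsnSketch lsn)
{
    locks[i].progress_at = lsn;
}

/* Patch ordering: take each lock in turn just to read progress_at. */
static LsnSketch progress_taking_locks(void)
{
    LsnSketch latest = 0;

    for (int i = 0; i < NUM_INSERT_LOCKS; i++)
    {
        lock_one(i);
        if (locks[i].progress_at > latest)
            latest = locks[i].progress_at;
        unlock_one(i);
    }
    return latest;
}

/* Suggested ordering: the caller already holds every lock, so only read. */
static LsnSketch progress_locks_held(void)
{
    LsnSketch latest = 0;

    for (int i = 0; i < NUM_INSERT_LOCKS; i++)
    {
        assert(locks[i].held);
        if (locks[i].progress_at > latest)
            latest = locks[i].progress_at;
    }
    return latest;
}

/* Read progress first, then take all locks: every lock is taken twice. */
int acquisitions_patch_order(LsnSketch *progress)
{
    acquisitions = 0;
    *progress = progress_taking_locks();
    acquire_all();
    /* ... examine insert state here ... */
    release_all();
    return acquisitions;
}

/* Take all locks once and read progress while holding them. */
int acquisitions_suggested_order(LsnSketch *progress)
{
    acquisitions = 0;
    acquire_all();
    *progress = progress_locks_held();
    release_all();
    return acquisitions;
}
```

Both functions return the same progress value in this single-threaded
sketch; the difference is that the first one acquires each lock twice
and the second only once. The sketch does not model the concurrency
argument made earlier in the thread about keeping the exclusive section
short.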

>
> > - last_snapshot_lsn != GetXLogInsertRecPtr())
> > +
> > GetLastCheckpointRecPtr() < GetProgressRecPtr())
> >
> > How the above check in patch suppose to work?
> > I think it could so happen once bgwriter logs snapshot, both checkpoint
> > and progressAt won't get updated any further and I think this check will
> > allow to log snapshots in such a case as well.
>
> The only purpose of this check is to do the following thing: if no WAL
> activity has happened since last checkpoint, there is no need to log a
> standby snapshot from the perspective of the bgwriter. In case the
> system is idle, we want to skip logging that and avoid unnecessary
> checkpoints because those records would have been generated. If the
> system is not idle, or to put it in other words there has been at
> least one record since the last checkpoint, we would log a standby
> snapshot, enforcing as well a checkpoint to happen the next time
> CreateCheckpoint() is gone through, and a standby snapshot is logged
> as well after the checkpoint contents are flushed. I am not sure I
> understand what you are getting at...
>

Let me try again. Suppose ControlFile->checkPoint is at 100 and the
latest value of progressAt returned by GetProgressRecPtr() is 105. The
first time the above check runs, it allows a standby snapshot to be
logged, which is okay. Now assume there is no activity and no
checkpoint either. When the check runs again after
LOG_SNAPSHOT_INTERVAL_MS, it will pass and log another standby
snapshot, which is *not* okay, because we don't want to log a standby
snapshot when there is no activity. Am I missing something?
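
The scenario can be replayed with a small standalone sketch. It is not
PostgreSQL code: the struct, the function names, and the tick loop are
assumptions for illustration. It also assumes, as in the scenario, that
logging a standby snapshot moves neither the checkpoint pointer nor the
progress LSN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t LsnSketch;         /* stand-in for XLogRecPtr */

typedef struct
{
    LsnSketch last_checkpoint;      /* plays the role of ControlFile->checkPoint */
    LsnSketch progress;             /* plays the role of the progressAt value */
    int       snapshots_logged;
} IdleStateSketch;

/* The check from the patch: last checkpoint LSN < progress LSN. */
static bool patch_check(const IdleStateSketch *s)
{
    return s->last_checkpoint < s->progress;
}

/*
 * Simulate `ticks` bgwriter wakeups (one per LOG_SNAPSHOT_INTERVAL_MS)
 * with no other WAL activity and no checkpoint in between. Neither LSN
 * changes, by the assumption stated above.
 */
int snapshots_while_idle(LsnSketch checkpoint, LsnSketch progress, int ticks)
{
    IdleStateSketch s = { checkpoint, progress, 0 };

    assert(ticks >= 0);
    for (int i = 0; i < ticks; i++)
    {
        if (patch_check(&s))
            s.snapshots_logged++;   /* a standby snapshot would be logged */
    }
    return s.snapshots_logged;
}
```

With checkpoint 100 and progress 105, snapshots_while_idle(100, 105, 3)
counts a snapshot on every tick, because nothing in the loop changes
the outcome of the comparison. Only when the two values are equal, as
in snapshots_while_idle(100, 100, 3), is the snapshot skipped.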

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
