Re: SyncRepLock acquired exclusively in default configuration

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: SyncRepLock acquired exclusively in default configuration
Дата
Msg-id CA+fd4k7UorG-rxxq=RKU=2TzCJF0QtacFLz4EAQ=6UbqeGWBpA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: SyncRepLock acquired exclusively in default configuration  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Список pgsql-hackers
On Wed, 19 Aug 2020 at 21:41, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
>
>
> On 2020/08/12 15:32, Masahiko Sawada wrote:
> > On Wed, 12 Aug 2020 at 14:06, Asim Praveen <pasim@vmware.com> wrote:
> >>
> >>
> >>
> >>> On 11-Aug-2020, at 8:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> >>>
> >>> I think this gets to the root of the issue. If we check the flag
> >>> without a lock, we might see a slightly stale value. But, considering
> >>> that there's no particular amount of time within which configuration
> >>> changes are guaranteed to take effect, maybe that's OK. However, there
> >>> is one potential gotcha here: if the walsender declares the standby to
> >>> be synchronous, a user can see that, right? So maybe there's this
> >>> problem: a user sees that the standby is synchronous and expects a
> >>> transaction committing afterward to provoke a wait, but really it
> >>> doesn't. Now the user is unhappy, feeling that the system didn't
> >>> perform according to expectations.
> >>
> >> Yes, pg_stat_replication reports a standby in sync as soon as walsender updates priority of the standby to
somethingother than 0. 
> >>
> >> The potential gotcha referred above doesn’t seem too severe.  What is the likelihood of someone setting
synchronous_standby_namesGUC with either “*” or a standby name and then immediately promoting that standby?  If the
standbyis promoted before the checkpointer on master gets a chance to update sync_standbys_defined in shared memory,
commitsmade during this interval on master may not make it to standby.  Upon promotion, those commits may be lost. 
> >
> > I think that if the standby is quite behind the primary and in case of
> > the primary crashes, the likelihood of losing commits might get
> > higher. The user can see the standby became synchronous standby via
> > pg_stat_replication but commit completes without a wait because the
> > checkpointer doesn't update sync_standbys_defined yet. If the primary
> > crashes before standby catching up and the user does failover, the
> > committed transaction will be lost, even though the user expects that
> > transaction commit has been replicated to the standby synchronously.
> > And this can happen even without the patch, right?
>
> I think you're right. This issue can happen even without the patch.
>
> Maybe we should not mark the standby as "sync" whenever sync_standbys_defined
> is false even if synchronous_standby_names is actually set and walsenders have
> already detect that?

It seems good. I guess that we can set 'async' to sync_status and 0 to
sync_priority when sync_standbys_defined is not true regardless of
walsender's actual priority value. We print the message "standby
\"%s\" now has synchronous standby priority %u" in SyncRepInitConfig()
regardless of sync_standbys_defined but maybe it's fine as the message
isn't incorrect and it's DEBUG1 message.

> Or we need more aggressive approach;
> make the checkpointer update sync_standby_priority values of
> all the walsenders? ISTM that the latter looks overkill...

I think so too.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Problems with the FSM, heap fillfactor, and temporal locality
Следующее
От: Masahiko Sawada
Дата:
Сообщение: Fix typo in procarrary.c