Re: Minimal logical decoding on standbys

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: Minimal logical decoding on standbys
Дата
Msg-id 20190523174821.s3kgumgmo35opavg@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: Minimal logical decoding on standbys  (Amit Khandekar <amitdkhan.pg@gmail.com>)
Ответы Re: Minimal logical decoding on standbys  (Amit Khandekar <amitdkhan.pg@gmail.com>)
Список pgsql-hackers
Hi,

On 2019-05-23 23:08:55 +0530, Amit Khandekar wrote:
> On Thu, 23 May 2019 at 21:29, Andres Freund <andres@anarazel.de> wrote:
> > On 2019-05-23 17:39:21 +0530, Amit Khandekar wrote:
> > > On Tue, 21 May 2019 at 21:49, Andres Freund <andres@anarazel.de> wrote:
> > > Yeah, I agree we should add such checks to minimize the possibility of
> > > reading logical records from a master that has insufficient wal_level.
> > > So to summarize :
> > > a. CheckLogicalDecodingRequirements() : Add Controlfile wal_level checks
> > > b. Call this function call in CreateInitDecodingContext() as well.
> > > c. While decoding XLOG_PARAMETER_CHANGE record, emit recovery conflict
> > > error if there is an existing logical slot.
> > >
> > > This made me think more of the race conditions. For instance, in
> > > pg_create_logical_replication_slot(), just after
> > > CheckLogicalDecodingRequirements and before actually creating the
> > > slot, suppose concurrently Controlfile->wal_level is changed from
> > > logical to replica.  So suppose a new slot does get created. Later the
> > > slot is read, so in pg_logical_slot_get_changes_guts(),
> > > CheckLogicalDecodingRequirements() is called where it checks
> > > ControlFile->wal_level value. But just before it does that,
> > > ControlFile->wal_level concurrently changes back to logical, because
> > > of replay of another param-change record. So this logical reader will
> > > think that the wal_level is sufficient, and will proceed to read the
> > > records, but those records are *before* the wal_level change, so these
> > > records don't have logical data.
> >
> > I don't think that's an actual problem, because there's no decoding
> > before the slot exists and CreateInitDecodingContext() has determined
> > the start LSN. And by that point the slot exists, slo
> > XLOG_PARAMETER_CHANGE replay can error out.
> 
> So between the start lsn and the lsn for
> parameter-change(logical=>replica) record, there can be some records ,
> and these don't have logical data. So the slot created will read from
> the start lsn, and proceed to read these records, before reading the
> parameter-change record.

I don't think that's possible. By the time CreateInitDecodingContext()
is called, the slot *already* exists (but in a state that'll cause it to
be throw away on error). But the restart point has not yet been
determined. Thus, if there is a XLOG_PARAMETER_CHANGE with a wal_level
change it can error out. And to handle the race of wal_level changing
between CheckLogicalDecodingRequirements() and the slot creation, we
recheck in CreateInitDecodingContext().

Think we might nee dto change ReplicationSlotReserveWal() to use the
replay, rather than the redo pointer for logical slots though.


> Can you re-write the below phrase please ? I suspect there is some
> letters missing there :
> "And by that point the slot exists, slo XLOG_PARAMETER_CHANGE replay
> can error out"

I think it's just one additional letter, namely s/slo/so/


> Are you saying we want to error out when the postgres replays the
> param change record and there is existing logical slot ? I thought you
> were suggesting earlier that it's the decoder.c code which should
> error out when reading the param-change record.

Yes, that's what I'm saying. See this portion of my previous email on
the topic:

On 2019-05-21 09:19:37 -0700, Andres Freund wrote:
> On 2019-04-16 12:27:46 +0530, Amit Khandekar wrote:
> > What I am thinking is :
> > In CheckLogicalDecodingRequirements(), besides checking wal_level,
> > also check ControlFile->wal_level when InHotStandby. I mean, when we
> > are InHotStandby, both wal_level and ControlFile->wal_level should be
> > >= WAL_LEVEL_LOGICAL. This will allow us to error out when using logical
> > slot when master has incompatible wal_level.
> 
> That still allows the primary to change wal_level after logical decoding
> has started, so we need the additional checks.
> 
> I'm not yet sure how to best deal with the fact that wal_level might be
> changed by the primary at basically all times. We would eventually get
> an error when logical decoding reaches the XLOG_PARAMETER_CHANGE. But
> that's not necessarily sufficient - if a primary changes its wal_level
> to lower, it could remove information logical decoding needs *before*
> logical decoding reaches the XLOG_PARAMETER_CHANGE record.
> 
> So I suspect we need conflict handling in xlog_redo's
> XLOG_PARAMETER_CHANGE case. If we there check against existing logical
> slots, we ought to be safe.
> 
> Therefore I think the check in CheckLogicalDecodingRequirements() needs
> to be something like:
> 
> if (RecoveryInProgress())
> {
>     if (!InHotStandby)
>         ereport(ERROR, "logical decoding on a standby required hot_standby to be enabled");
>     /*
>      * This check is racy, but whenever XLOG_PARAMETER_CHANGE indicates that
>      * wal_level has changed, we verify that there are no existin glogical
>      * replication slots. And to avoid races around creating a new slot,
>      * CheckLogicalDecodingRequirements() is called once before creating the slot,
>      * andd once when logical decoding is initially starting up.
>      */
>     if (ControlFile->wal_level != LOGICAL)
>         ereport(ERROR, "...");
> }
> 
> And then add a second CheckLogicalDecodingRequirements() call into
> CreateInitDecodingContext().
> 
> What do you think?


Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Mark Dilger
Дата:
Сообщение: fsync failure in durable_unlink ignored in xlog.c?
Следующее
От: Andres Freund
Дата:
Сообщение: Re: fsync failure in durable_unlink ignored in xlog.c?