Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

From Kyotaro Horiguchi
Subject Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date
Msg-id 20220720.095808.772209644652657995.horikyota.ntt@gmail.com
Responses Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
At Tue, 19 Jul 2022 17:31:07 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in 
> On Tue, Jul 19, 2022 at 4:35 PM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> > At Tue, 19 Jul 2022 10:17:15 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
> > > Good work. I wonder without comments this may create a problem in the
> > > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before
> > > freeing the memory any less robust. Also, for consistency, we can use
> > > a similar check based on xcnt in the SnapBuildRestore to free the
> > > memory in the below code:
> > > + /* set catalog modifying transactions */
> > > + if (builder->catchange.xip)
> > > + pfree(builder->catchange.xip);
> >
> > But xip must be positive there.  We can add a comment that explains that.
> >
> 
> Yes, if we add a comment for it, we probably need to explain gcc's
> optimization, but that seems to be too much to me.

Ah, sorry. I confused it with another place, in SnapBuildPurgeCommittedTxn.
I agree with you that we don't need an additional comment *there*.

> > +       catchange_xip = ReorderBufferGetCatalogChangesXacts(builder->reorder);
> >
> > catchange_xip is allocated in the current context, but ondisk is
> > allocated in builder->context.  I find that kind of inconsistent (even if
> > the current context is the same as builder->context).
> 
> Right. I thought that since the lifetime of catchange_xip is short,
> until the end of SnapBuildSerialize() function we didn't need to
> allocate it in builder->context. But given ondisk, we need to do that
> for catchange_xip as well. Will fix it.

Thanks.

> > +       if (builder->committed.xcnt > 0)
> > +       {
> >
> > It seems to me committed.xip is always non-null, so we don't need this.
> > I don't strongly object to doing that, though.
> 
> But committed.xcnt could be 0, right? We don't need to copy anything
> by calling memcpy with size = 0 in this case. Also, it looks more
> consistent with what we do for catchange_xcnt.

Mmm. The patch changed that behavior. AllocateSnapshotBuilder always
allocates the array with a fixed size. SnapBuildAddCommittedTxn still
assumes builder->committed.xip is non-NULL.  SnapBuildRestore *kept*
ondisk.builder.committed.xip populated with a valid array pointer. But
the patch allows committed.xip to be NULL, and in that case
SnapBuildAddCommittedTxn calls repalloc(NULL), which triggers an
assertion failure.
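
To make the failure mode concrete, here is a minimal stand-alone sketch in
plain C. It is not the snapbuild.c code: XidArray, strict_realloc,
xid_array_init and xid_array_add are hypothetical names, and strict_realloc
only imitates the property that repalloc() rejects a NULL pointer (unlike
realloc()). It shows why an append path that grows the array in place relies
on the array having been allocated up front.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;   /* stand-in for the PostgreSQL typedef */

/* Hypothetical miniature of the builder->committed bookkeeping. */
typedef struct XidArray
{
    size_t         xcnt;        /* number of valid entries in xip */
    size_t         xcnt_space;  /* allocated capacity of xip */
    TransactionId *xip;         /* expected to be non-NULL after init */
} XidArray;

/* Imitates repalloc(): unlike realloc(), a NULL pointer is rejected. */
static void *
strict_realloc(void *ptr, size_t size)
{
    assert(ptr != NULL);
    return realloc(ptr, size);
}

/* Always allocate a fixed initial size, so xip is non-NULL even if xcnt == 0. */
static void
xid_array_init(XidArray *a, size_t initial_space)
{
    assert(initial_space > 0);
    a->xcnt = 0;
    a->xcnt_space = initial_space;
    a->xip = malloc(initial_space * sizeof(TransactionId));
    assert(a->xip != NULL);
}

/* Append one xid, doubling the capacity when the array is full. */
static void
xid_array_add(XidArray *a, TransactionId xid)
{
    if (a->xcnt == a->xcnt_space)
    {
        a->xcnt_space *= 2;
        a->xip = strict_realloc(a->xip, a->xcnt_space * sizeof(TransactionId));
        assert(a->xip != NULL);
    }
    a->xip[a->xcnt++] = xid;
}
```

If xip were left NULL with xcnt_space == 0, the very first xid_array_add()
call would reach strict_realloc(NULL, ...) and trip the assertion, which is
the situation described above.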

> > +       Assert((xcnt > 0) && (xcnt == rb->catchange_ntxns));
> >
> > (xcnt > 0) is obvious here (otherwise means dlist_foreach is broken..).
> > (xcnt == rb->catchange_ntxns) is not what should be checked here. The
> > assert just requires that catchange_txns and catchange_ntxns are
> > consistent so it should be checked just after dlist_empty.. I think.
> >
> 
> If we want to check if catchange_txns and catchange_ntxns are
> consistent, should we check (xcnt == rb->catchange_ntxns) as well, no?
> This function requires the caller to use rb->catchange_ntxns as the
> length of the returned array. I think this assertion ensures that the
> actual length of the array is consistent with the length we
> pre-calculated.

Sorry again. Please forget the comment about xcnt == rb->catchange_ntxns.
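
For reference, the pattern under discussion (walk a list, fill an array sized
by a separately maintained counter, then assert that both agree) can be
sketched in stand-alone C as below. This is only an illustration under
assumed names: TxnNode, TxnList and txn_list_get_xids are hypothetical and
are not the ReorderBuffer structures or the patch's function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;   /* stand-in for the PostgreSQL typedef */

/* Hypothetical singly linked list of transactions. */
typedef struct TxnNode
{
    TransactionId   xid;
    struct TxnNode *next;
} TxnNode;

/* List head plus a counter maintained separately from the list itself. */
typedef struct TxnList
{
    TxnNode *head;
    size_t   ntxns;
} TxnList;

/*
 * Return a malloc'ed array of the xids in the list, or NULL if the list is
 * empty.  The caller is expected to use list->ntxns as the array length.
 */
static TransactionId *
txn_list_get_xids(const TxnList *list)
{
    TransactionId *xids;
    size_t         xcnt = 0;

    if (list->head == NULL)
    {
        assert(list->ntxns == 0);
        return NULL;
    }

    xids = malloc(list->ntxns * sizeof(TransactionId));
    assert(xids != NULL);

    for (const TxnNode *n = list->head; n != NULL; n = n->next)
    {
        assert(xcnt < list->ntxns);   /* never write past the pre-sized array */
        xids[xcnt++] = n->xid;
    }

    /* The number of entries written must match the pre-computed length. */
    assert(xcnt == list->ntxns);
    return xids;
}
```

The final assertion is what guarantees that the length the caller uses
matches the number of entries actually written.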

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


