Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

From Masahiko Sawada
Subject Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Msg-id CAD21AoCZ0xpS-PSLa1ot=TijntYsnHKZ=6u6QAocPcO3E08Q5w@mail.gmail.com
In response to RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns  ("shiy.fnst@fujitsu.com" <shiy.fnst@fujitsu.com>)
Responses Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Thu, Jul 14, 2022 at 11:16 AM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
>
> On Tue, Jul 12, 2022 5:23 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Jul 12, 2022 at 5:58 PM shiy.fnst@fujitsu.com
> > <shiy.fnst@fujitsu.com> wrote:
> > >
> > > It happened when executing the following code because it tried to free a
> > > NULL pointer (catchange_xip).
> > >
> > >         /* be tidy */
> > >         if (ondisk)
> > >                 pfree(ondisk);
> > > +       if (catchange_xip)
> > > +               pfree(catchange_xip);
> > >  }
> > >
> > > It seems to be related to configure option. I could reproduce it when using
> > > `./configure --enable-debug`.
> > > But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og -ggdb"`.
> >
> > Hmm, I could not reproduce this problem even if I use ./configure
> > --enable-debug. And it's weird that we checked if catchange_xip is not
> > null but we did pfree for it:
> >
> > #1  pfree (pointer=0x0) at mcxt.c:1177
> > #2  0x000000000078186b in SnapBuildSerialize (builder=0x1fd5e78, lsn=25719712) at snapbuild.c:1792
> >
> > Is it reproducible in your environment?
>
> Thanks for your reply! Yes, it is reproducible. And I also reproduced it on the
> v4 patch you posted [1].

Thank you for testing!

>
> [1] https://www.postgresql.org/message-id/CAD21AoAyNPrOFg%2BQGh%2B%3D4205TU0%3DyrE%2BQyMgzStkH85uBZXptQ%40mail.gmail.com
>
> > If so, could you test it again
> > with the following changes?
> >
> > diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
> > index d015c06ced..a6e76e3781 100644
> > --- a/src/backend/replication/logical/snapbuild.c
> > +++ b/src/backend/replication/logical/snapbuild.c
> > @@ -1788,7 +1788,7 @@ out:
> >     /* be tidy */
> >     if (ondisk)
> >         pfree(ondisk);
> > -   if (catchange_xip)
> > +   if (catchange_xip != NULL)
> >         pfree(catchange_xip);
> >  }
> >
>
> I tried this and could still reproduce the problem.

Does the backtrace still show that we attempt to pfree a NULL pointer?

>
> Besides, I tried the suggestion from Amit [2],  it could be fixed by checking
> the value of catchange_xcnt instead of catchange_xip before pfree.

Could you check whether this problem occurred when we reached that point
via the goto path, i.e., whether we had called
ReorderBufferGetCatalogChangesXacts() or not?
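
To make the question concrete, here is a minimal sketch of a cleanup label that can be reached by a goto before the array is ever allocated. It is not the actual snapbuild.c code. It uses malloc/free instead of palloc/pfree so that it compiles standalone, and serialize_sketch, early_exit, xip and xcnt are hypothetical names. It assumes the pointer and the count are initialized at their declaration; whether the patched function does so on every path is exactly what I'd like to confirm.

```c
#include <stdlib.h>

/*
 * Hypothetical stand-in for a serialize routine (not PostgreSQL code).
 * early_exit simulates jumping to the cleanup label before the xid array
 * is allocated. Returns 1 if the cleanup freed the array, 0 otherwise.
 */
static int
serialize_sketch(int early_exit)
{
    /*
     * Initialize both variables at declaration. Without this, the check
     * at "out:" would read an indeterminate value on the early-exit path.
     */
    unsigned int *xip = NULL;
    size_t      xcnt = 0;
    int         freed = 0;

    if (early_exit)
        goto out;               /* xip is still NULL, xcnt is still 0 */

    xcnt = 4;
    xip = malloc(xcnt * sizeof(unsigned int));
    if (xip == NULL)
    {
        xcnt = 0;
        goto out;
    }

out:
    /* be tidy: the pointer check and the count check agree here */
    if (xip != NULL && xcnt > 0)
    {
        free(xip);
        freed = 1;
    }
    return freed;
}
```

With this shape, serialize_sketch(1) takes the early path and frees nothing, while serialize_sketch(0) allocates and then frees the array. If the pointer were left uninitialized instead, the two checks could disagree, which might explain why testing the count helps where testing the pointer does not.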

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


