RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

Поиск
Список
Период
Сортировка
От shiy.fnst@fujitsu.com
Тема RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Дата
Msg-id OSZPR01MB63102143FE2431519DEF7AFDFD889@OSZPR01MB6310.jpnprd01.prod.outlook.com
обсуждение исходный текст
Ответ на Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns  (Masahiko Sawada <sawada.mshk@gmail.com>)
Ответы Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns  (Masahiko Sawada <sawada.mshk@gmail.com>)
Список pgsql-hackers
On Tue, Jul 12, 2022 5:23 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> 
> On Tue, Jul 12, 2022 at 5:58 PM shiy.fnst@fujitsu.com
> <shiy.fnst@fujitsu.com> wrote:
> >
> > It happened when executing the following code because it tried to free a
> NULL
> > pointer (catchange_xip).
> >
> >         /* be tidy */
> >         if (ondisk)
> >                 pfree(ondisk);
> > +       if (catchange_xip)
> > +               pfree(catchange_xip);
> >  }
> >
> > It seems to be related to configure option. I could reproduce it when using
> > `./configure --enable-debug`.
> > But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og -
> ggdb"`.
> 
> Hmm, I could not reproduce this problem even if I use ./configure
> --enable-debug. And it's weird that we checked if catchange_xip is not
> null but we did pfree for it:
> 
> #1  pfree (pointer=0x0) at mcxt.c:1177
> #2  0x000000000078186b in SnapBuildSerialize (builder=0x1fd5e78,
> lsn=25719712) at snapbuild.c:1792
> 
> Is it reproducible in your environment?

Thanks for your reply! Yes, it is reproducible. And I also reproduced it on the
v4 patch you posted [1].

[1]
https://www.postgresql.org/message-id/CAD21AoAyNPrOFg%2BQGh%2B%3D4205TU0%3DyrE%2BQyMgzStkH85uBZXptQ%40mail.gmail.com

> If so, could you test it again
> with the following changes?
> 
> diff --git a/src/backend/replication/logical/snapbuild.c
> b/src/backend/replication/logical/snapbuild.c
> index d015c06ced..a6e76e3781 100644
> --- a/src/backend/replication/logical/snapbuild.c
> +++ b/src/backend/replication/logical/snapbuild.c
> @@ -1788,7 +1788,7 @@ out:
>     /* be tidy */
>     if (ondisk)
>         pfree(ondisk);
> -   if (catchange_xip)
> +   if (catchange_xip != NULL)
>         pfree(catchange_xip);
>  }
> 

I tried this and could still reproduce the problem.

Besides, I tried the suggestion from Amit [2],  it could be fixed by checking
the value of catchange_xcnt instead of catchange_xip before pfree.

[2] https://www.postgresql.org/message-id/CAA4eK1%2BXPdm8G%3DEhUJA12Pi1YvQAfcz2%3DkTd9a4BjVx4%3Dgk-MA%40mail.gmail.com

diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index c482e906b0..68b9c4ef7d 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1573,7 +1573,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
        Size            needed_length;
        SnapBuildOnDisk *ondisk = NULL;
        TransactionId   *catchange_xip = NULL;
-       size_t          catchange_xcnt;
+       size_t          catchange_xcnt = 0;
        char       *ondisk_c;
        int                     fd;
        char            tmppath[MAXPGPATH];
@@ -1788,7 +1788,7 @@ out:
        /* be tidy */
        if (ondisk)
                pfree(ondisk);
-       if (catchange_xip)
+       if (catchange_xcnt != 0)
                pfree(catchange_xip);
 }


Regards,
Shi yu

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Masahiko Sawada
Дата:
Сообщение: Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Следующее
От: John Naylor
Дата:
Сообщение: Re: i.e. and e.g.