Обсуждение: pg_dump crash due to incomplete ordering of DO_SUBSCRIPTION_REL objects

Поиск
Список
Период
Сортировка

pg_dump crash due to incomplete ordering of DO_SUBSCRIPTION_REL objects

От
vignesh C
Дата:
Hi,

While verifying upgrade of subscriber instance, I noticed pg_dump
crash caused by incomplete sorting logic for DO_SUBSCRIPTION_REL
objects in DOTypeNameCompare(). When multiple subscription–relation
entries belong to the same subscription, the comparison does not
establish a complete ordering. In this case, the comparison falls
through to the generic assertion path. The attached patch fixes this
by extending the comparison for DO_SUBSCRIPTION_REL objects to include
deterministic ordering keys. After the subscription name comparison,
entries are ordered by the referenced table's schema name and then by
table name.

This issue has started failing after commit:
commit 0decd5e89db9f5edb9b27351082f0d74aae7a9b6
Sort dump objects independent of OIDs, for the 7 holdout object types.

This can be reproduced by having logical replication setup with
subscription subscribing to few tables.

Thanks,
Vignesh

Вложения

Re: pg_dump crash due to incomplete ordering of DO_SUBSCRIPTION_REL objects

От
Noah Misch
Дата:
On Mon, Dec 15, 2025 at 11:35:35PM +0530, vignesh C wrote:
> While verifying upgrade of subscriber instance, I noticed pg_dump
> crash caused by incomplete sorting logic for DO_SUBSCRIPTION_REL
> objects in DOTypeNameCompare(). When multiple subscription–relation
> entries belong to the same subscription, the comparison does not
> establish a complete ordering. In this case, the comparison falls
> through to the generic assertion path. The attached patch fixes this
> by extending the comparison for DO_SUBSCRIPTION_REL objects to include
> deterministic ordering keys. After the subscription name comparison,
> entries are ordered by the referenced table's schema name and then by
> table name.
> 
> This issue has started failing after commit:
> commit 0decd5e89db9f5edb9b27351082f0d74aae7a9b6
> Sort dump objects independent of OIDs, for the 7 holdout object types.
> 
> This can be reproduced by having logical replication setup with
> subscription subscribing to few tables.

That makes sense.  Thanks.  Do you have commands we could add to
src/test/regress/sql/subscription.sql to cover this code?



Re: pg_dump crash due to incomplete ordering of DO_SUBSCRIPTION_REL objects

От
vignesh C
Дата:
On Tue, 16 Dec 2025 at 00:00, Noah Misch <noah@leadboat.com> wrote:
>
> On Mon, Dec 15, 2025 at 11:35:35PM +0530, vignesh C wrote:
> > While verifying upgrade of subscriber instance, I noticed pg_dump
> > crash caused by incomplete sorting logic for DO_SUBSCRIPTION_REL
> > objects in DOTypeNameCompare(). When multiple subscription–relation
> > entries belong to the same subscription, the comparison does not
> > establish a complete ordering. In this case, the comparison falls
> > through to the generic assertion path. The attached patch fixes this
> > by extending the comparison for DO_SUBSCRIPTION_REL objects to include
> > deterministic ordering keys. After the subscription name comparison,
> > entries are ordered by the referenced table's schema name and then by
> > table name.
> >
> > This issue has started failing after commit:
> > commit 0decd5e89db9f5edb9b27351082f0d74aae7a9b6
> > Sort dump objects independent of OIDs, for the 7 holdout object types.
> >
> > This can be reproduced by having logical replication setup with
> > subscription subscribing to few tables.
>
> That makes sense.  Thanks.  Do you have commands we could add to
> src/test/regress/sql/subscription.sql to cover this code?

This dumping of subscription relation is specific to upgrading to
preserve the subscription relation. So I felt we will not be able to
add tests to subscription.sql, instead how about adding one more table
to 004_subscription.pl where subscription upgrade tests are verified
like the attached patch.

Regards,
Vignesh

Вложения

Re: pg_dump crash due to incomplete ordering of DO_SUBSCRIPTION_REL objects

От
Noah Misch
Дата:
On Tue, Dec 16, 2025 at 05:22:36PM +0530, vignesh C wrote:
> On Tue, 16 Dec 2025 at 00:00, Noah Misch <noah@leadboat.com> wrote:
> > On Mon, Dec 15, 2025 at 11:35:35PM +0530, vignesh C wrote:
> > > This issue has started failing after commit:
> > > commit 0decd5e89db9f5edb9b27351082f0d74aae7a9b6
> > > Sort dump objects independent of OIDs, for the 7 holdout object types.

> > That makes sense.  Thanks.  Do you have commands we could add to
> > src/test/regress/sql/subscription.sql to cover this code?
> 
> This dumping of subscription relation is specific to upgrading to
> preserve the subscription relation. So I felt we will not be able to
> add tests to subscription.sql, instead how about adding one more table
> to 004_subscription.pl where subscription upgrade tests are verified
> like the attached patch.

That's a good way to test it.

> --- a/src/bin/pg_dump/pg_dump_sort.c
> +++ b/src/bin/pg_dump/pg_dump_sort.c
> @@ -454,6 +454,20 @@ DOTypeNameCompare(const void *p1, const void *p2)
>          if (cmpval != 0)
>              return cmpval;
>      }
> +    else if (obj1->objType == DO_SUBSCRIPTION_REL)
> +    {
> +        SubRelInfo *srobj1 = *(SubRelInfo *const *) p1;
> +        SubRelInfo *srobj2 = *(SubRelInfo *const *) p2;
> +
> +        /* Sort by schema name (subscription name was already considered) */

Let's change the values getSubscriptionRelations() stores in
SubrRelInfo.obj.namespace and SubrRelInfo.obj.name to be more like
DO_PUBLICATION_REL:

        pubrinfo[j].dobj.namespace = tbinfo->dobj.namespace;
        pubrinfo[j].dobj.name = tbinfo->dobj.name;
        pubrinfo[j].publication = pubinfo;

DO_SUBSCRIPTION_REL (new in v17) is gratuitously different:

        subrinfo[i].dobj.name = pg_strdup(subinfo->dobj.name);
        subrinfo[i].tblinfo = tblinfo;
        subrinfo[i].subinfo = subinfo;

What do you think?  This would change sort order from (subname, rel) to (rel,
subname).  Historically, we've avoided churning dump order, in case folks diff
dump output over time.  I bet that practice is less common for binary dumps.
Since DO_SUBSCRIPTION_REL is only for binary dumps, let's just change it.

> +        cmpval = strcmp(srobj1->tblinfo->dobj.namespace->dobj.name,
> +                        srobj2->tblinfo->dobj.namespace->dobj.name);

The subrinfo change will make this comparison go away.  If it had stayed, it
should be ready for namespace==NULL like pgTypeNameCompare() is.  (I think
pgTypeNameCompare() could drop that defense, because the getTypes() call to
findNamespace() will pg_fatal() on absence.  Until it does drop that defense,
the rest of pg_dump_sort.c should handle namespace==NULL.)

> --- a/src/bin/pg_upgrade/t/004_subscription.pl
> +++ b/src/bin/pg_upgrade/t/004_subscription.pl

> -# Wait till the table tab_upgraded1 reaches 'ready' state
> +# Wait till the tables tab_upgraded and tab_upgraded1 reaches 'ready' state

s/reaches/reach/ there.

To find other text needing edits, I searched this file for tab_upgraded.  The
following comment still implies "all tables" encompasses just two:

# ------------------------------------------------------
# Check that pg_upgrade is successful when all tables are in ready or in
# init state (tab_upgraded1 table is in ready state and tab_upgraded2 table is
# in init state) along with retaining the replication origin's remote lsn,
# subscription's running status, failover option, and retain_dead_tuples
# option.
# ------------------------------------------------------



Re: pg_dump crash due to incomplete ordering of DO_SUBSCRIPTION_REL objects

От
vignesh C
Дата:
On Wed, 17 Dec 2025 at 00:54, Noah Misch <noah@leadboat.com> wrote:
>
> On Tue, Dec 16, 2025 at 05:22:36PM +0530, vignesh C wrote:
> > On Tue, 16 Dec 2025 at 00:00, Noah Misch <noah@leadboat.com> wrote:
> > > On Mon, Dec 15, 2025 at 11:35:35PM +0530, vignesh C wrote:
> > > > This issue has started failing after commit:
> > > > commit 0decd5e89db9f5edb9b27351082f0d74aae7a9b6
> > > > Sort dump objects independent of OIDs, for the 7 holdout object types.
>
> > > That makes sense.  Thanks.  Do you have commands we could add to
> > > src/test/regress/sql/subscription.sql to cover this code?
> >
> > This dumping of subscription relation is specific to upgrading to
> > preserve the subscription relation. So I felt we will not be able to
> > add tests to subscription.sql, instead how about adding one more table
> > to 004_subscription.pl where subscription upgrade tests are verified
> > like the attached patch.
>
> That's a good way to test it.
>
> > --- a/src/bin/pg_dump/pg_dump_sort.c
> > +++ b/src/bin/pg_dump/pg_dump_sort.c
> > @@ -454,6 +454,20 @@ DOTypeNameCompare(const void *p1, const void *p2)
> >               if (cmpval != 0)
> >                       return cmpval;
> >       }
> > +     else if (obj1->objType == DO_SUBSCRIPTION_REL)
> > +     {
> > +             SubRelInfo *srobj1 = *(SubRelInfo *const *) p1;
> > +             SubRelInfo *srobj2 = *(SubRelInfo *const *) p2;
> > +
> > +             /* Sort by schema name (subscription name was already considered) */
>
> Let's change the values getSubscriptionRelations() stores in
> SubrRelInfo.obj.namespace and SubrRelInfo.obj.name to be more like
> DO_PUBLICATION_REL:
>
>                 pubrinfo[j].dobj.namespace = tbinfo->dobj.namespace;
>                 pubrinfo[j].dobj.name = tbinfo->dobj.name;
>                 pubrinfo[j].publication = pubinfo;
>
> DO_SUBSCRIPTION_REL (new in v17) is gratuitously different:
>
>                 subrinfo[i].dobj.name = pg_strdup(subinfo->dobj.name);
>                 subrinfo[i].tblinfo = tblinfo;
>                 subrinfo[i].subinfo = subinfo;

Modified

> What do you think?  This would change sort order from (subname, rel) to (rel,
> subname).  Historically, we've avoided churning dump order, in case folks diff
> dump output over time.

Although grouping by publication and subscription would be cleaner,
publication relations have been ordered first by relation and then by
publication. Retaining the same ordering for subscription relations
preserves consistency.

> I bet that practice is less common for binary dumps.
> Since DO_SUBSCRIPTION_REL is only for binary dumps, let's just change it.
>
> > +             cmpval = strcmp(srobj1->tblinfo->dobj.namespace->dobj.name,
> > +                                             srobj2->tblinfo->dobj.namespace->dobj.name);
>
> The subrinfo change will make this comparison go away.  If it had stayed, it
> should be ready for namespace==NULL like pgTypeNameCompare() is.  (I think
> pgTypeNameCompare() could drop that defense, because the getTypes() call to
> findNamespace() will pg_fatal() on absence.  Until it does drop that defense,
> the rest of pg_dump_sort.c should handle namespace==NULL.)
>
> > --- a/src/bin/pg_upgrade/t/004_subscription.pl
> > +++ b/src/bin/pg_upgrade/t/004_subscription.pl
>
> > -# Wait till the table tab_upgraded1 reaches 'ready' state
> > +# Wait till the tables tab_upgraded and tab_upgraded1 reaches 'ready' state
>
> s/reaches/reach/ there.

Modified

> To find other text needing edits, I searched this file for tab_upgraded.  The
> following comment still implies "all tables" encompasses just two:
>
> # ------------------------------------------------------
> # Check that pg_upgrade is successful when all tables are in ready or in
> # init state (tab_upgraded1 table is in ready state and tab_upgraded2 table is
> # in init state) along with retaining the replication origin's remote lsn,
> # subscription's running status, failover option, and retain_dead_tuples
> # option.
> # ------------------------------------------------------

Updated the comments

The attached v3 version patch has the changes for the same.

Regards,
Vignesh

Вложения