Thread: BUG #16655: pg_dump segfault when excluding postgis table

BUG #16655: pg_dump segfault when excluding postgis table

From: PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      16655
Logged by:          Cameron Daniel
Email address:      cam.daniel@gmail.com
PostgreSQL version: 13.0
Operating system:   Debian Buster
Description:

I'm getting a segfault when trying to pg_dump a database containing the
postgis extension, but only when excluding the spatial_ref_sys table. I am
using Docker but the base image is standard Debian and Postgres is being
installed from the official APT repos.

The Docker config is here for reference -
https://github.com/ccakes/nomad-pgsql-patroni/

Reproduction steps: https://paste.rs/bHR
Backtrace: https://paste.rs/Ym3

The segfault appears to be in pg_dump, the server just logs "unexpected EOF
on client connection with an open transaction" and continues running fine
otherwise.

Let me know if there's any extra info I can provide


Re: BUG #16655: pg_dump segfault when excluding postgis table

From: Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes:
> I'm getting a segfault when trying to pg_dump a database containing the
> postgis extension, but only when excluding the spatial_ref_sys table.

Yeah, I can reproduce this here.  You don't really need postgis.
What I did was

1. Install the src/test/modules/test_pg_dump module into an
empty database.

2. alter table regress_table_dumpable add check(col1 > 0);

3. pg_dump -T regress_table_dumpable d1

While I haven't traced through it in complete detail, what seems to
be happening is

A. checkExtensionMembership sees that the table belongs to an extension,
hence marks it

            dobj->dump = ext->dobj.dump_contains & (DUMP_COMPONENT_ACL |
                                                    DUMP_COMPONENT_SECLABEL |
                                                    DUMP_COMPONENT_POLICY);

(This ends up setting only the SECLABEL and POLICY bits, I didn't
check why.)

B. processExtensionTables sees that the table is excluded by an exclusion
switch, so it sets configtbl->interesting = false.

(I've not verified whether B happens before or after A.  But we definitely
end up with a TableInfo having dobj.dump = 40, interesting = false.)

C. getTableAttrs sees that the table is marked interesting = false,
so it doesn't bother loading any subsidiary data; particularly not
the checkexprs[] array.  But ncheck is positive because getTables filled
it.

D. Since dobj.dump is non-zero, we eventually wind up at dumpTableSchema,
which crashes because checkexprs is NULL.  It's a bit surprising that
it doesn't crash sooner, but whatever.  We should absolutely not be in
that code at all for a table we didn't load all the subsidiary data for.
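
For illustration, here is a minimal standalone C sketch of that failure
pattern (hypothetical struct and field names, not pg_dump's actual
code): a count that one pass fills in, an array that a later, skipped
pass was supposed to allocate, and a consumer that trusts the count
alone.

    #include <stdio.h>

    typedef struct
    {
        int          ncheck;      /* filled early, as getTables() does */
        const char **checkexprs;  /* filled later, only if "interesting" */
    } FakeTableInfo;

    int
    main(void)
    {
        /* analogue of dobj.dump != 0 but interesting == false */
        FakeTableInfo tbl = {1, NULL};

        /* analogue of dumpTableSchema(): trusts ncheck, not the pointer */
        for (int j = 0; j < tbl.ncheck; j++)
            printf("CHECK (%s)\n", tbl.checkexprs[j]); /* NULL dereference */

        return 0;
    }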

So I think the basic problem here is that checkExtensionMembership and
processExtensionTables are not on the same page.  We can't have
interesting = false for a table that any of the dobj.dump bits are set
for.

Arguably, we should get rid of the "interesting" flag entirely now that
we have dobj.dump.  I can't see that it's anything but a foot-gun.

Stephen, I think you touched this code last; any thoughts?

            regards, tom lane



RE: [EXTERNAL] BUG #16655: pg_dump segfault when excluding postgis table

From: "Godfrin, Philippe E"
Date:

Is this hosted on AWS or EKS?

Phil godfrin

 

From: PG Bug reporting form <noreply@postgresql.org>
Sent: Monday, October 5, 2020 3:55 PM
To: pgsql-bugs@lists.postgresql.org
Cc: cam.daniel@gmail.com
Subject: [EXTERNAL] BUG #16655: pg_dump segfault when excluding postgis table

 


Re: [EXTERNAL] BUG #16655: pg_dump segfault when excluding postgis table

From: Cameron Daniel
Date:
Neither - on-prem Nomad setup

Cheers
Cameron

On Tue, 6 Oct 2020 at 14:48, Godfrin, Philippe E <Philippe.Godfrin@nov.com> wrote:

Is this hosted on AWS or EKS?

Phil godfrin

 


Re: BUG #16655: pg_dump segfault when excluding postgis table

From: Stephen Frost
Date:
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> PG Bug reporting form <noreply@postgresql.org> writes:
> > I'm getting a segfault when trying to pg_dump a database containing the
> > postgis extension, but only when excluding the spatial_ref_sys table.
>
> Yeah, I can reproduce this here.  You don't really need postgis.
> What I did was
>
> 1. Install the src/test/modules/test_pg_dump module into an
> empty database.
>
> 2. alter table regress_table_dumpable add check(col1 > 0);

While we certainly shouldn't just segfault, I don't think this is
actually something that's intended or should be supported- the extension
defines the table structure and users are really only expected to change
permissions (and related bits) on the table (and, to some extent, even
then only if they're familiar enough with the extension to know that the
extension understands that a user may change the privileges
post-install).

All in all, it's not really a great situation, but it's at least better
than it had been (when we didn't try to deal with users changing
privileges on extension objects and people complained about that).

> 3. pg_dump -T regress_table_dumpable d1
>
> While I haven't traced through it in complete detail, what seems to
> be happening is
>
> A. checkExtensionMembership sees that the table belongs to an extension,
> hence marks it
>
>             dobj->dump = ext->dobj.dump_contains & (DUMP_COMPONENT_ACL |
>                                                     DUMP_COMPONENT_SECLABEL |
>                                                     DUMP_COMPONENT_POLICY);
>
> (This ends up setting only the SECLABEL and POLICY bits, I didn't
> check why.)
>
> B. processExtensionTables sees that the table is excluded by an exclusion
> switch, so it sets configtbl->interesting = false.
>
> (I've not verified whether B happens before or after A.  But we definitely
> end up with a TableInfo having dobj.dump = 40, interesting = false.)
>
> C. getTableAttrs sees that the table is marked interesting = false,
> so it doesn't bother loading any subsidiary data; particularly not
> the checkexprs[] array.  But ncheck is positive because getTables filled
> it.
>
> D. Since dobj.dump is non-zero, we eventually wind up at dumpTableSchema,
> which crashes because checkexprs is NULL.  It's a bit surprising that
> it doesn't crash sooner, but whatever.  We should absolutely not be in
> that code at all for a table we didn't load all the subsidiary data for.
>
> So I think the basic problem here is that checkExtensionMembership and
> processExtensionTables are not on the same page.  We can't have
> interesting = false for a table that any of the dobj.dump bits are set
> for.
>
> Arguably, we should get rid of the "interesting" flag entirely now that
> we have dobj.dump.  I can't see that it's anything but a foot-gun.
>
> Stephen, I think you touched this code last; any thoughts?

I had contemplated trying to get rid of the 'interesting' flag but, my
recollection anyway, is that it needed to be set sometimes even though
dobj.dump wasn't.  Has been a number of years though and either my
memory or my review at the time might be faulty.

I do agree with the general idea of trying to get rid of it though, and
instead using dobj.dump to decide when we need to load additional info.

In the end, though, with this case at least, either way we don't
currently track which constraints were installed by the extension and
which were installed by users afterwards, and so we can't possibly
recreate the object in the same state the user has it: if we include all
CHECK constraints then we'll end up with duplicates of any that the
extension created originally, and if we don't include any then the
restored table will be missing those the user added.

Ultimately, if getting rid of 'interesting' and replacing it with
checking dobj.dump works, great, but that isn't going to actually make
this case 'work' for the user; it'll just make pg_dump not crash (which
is certainly good), while the added CHECK constraint will still get
lost.

Thinking about it a bit more, I'd be inclined to argue that what we
should really do here is mark tables created by extensions in a similar
manner to system tables: basically, if you try to ALTER them, you get an
error unless you've set some extra bit that says you know what you're
doing and that you understand your change will get lost if you
dump/reload or pg_upgrade.

Thanks,

Stephen

Attachments

Re: BUG #16655: pg_dump segfault when excluding postgis table

From: Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Yeah, I can reproduce this here.  You don't really need postgis.
>> What I did was
>> 1. Install the src/test/modules/test_pg_dump module into an
>> empty database.
>> 2. alter table regress_table_dumpable add check(col1 > 0);

> While we certainly shouldn't just segfault, I don't think this is
> actually something that's intended or should be supported- the extension
> defines the table structure and users are really only expected to change
> permissions (and related bits) on the table (and, to some extent, even
> then only if they're familiar enough with the extension to know that the
> extension understands that a user may change the privileges
> post-install).

I think you misunderstand my point here.  pg_dump would still segfault
if the CHECK constraint had been created by the extension (which, indeed,
is what I'm thinking of doing to convert this into a regression test).
Presumably, the reason it's failing on postgis is that spatial_ref_sys
has some extension-defined CHECK constraints.

I'm not intending to suggest that pg_dump ought to understand this
situation enough to dump the CHECK constraint --- I'm just describing a
quick way of reproducing the crash, without having to install postgis.
I think the actual case of interest is "-T extension_table should not
result in a crash when extension_table has CHECK constraints".

>> So I think the basic problem here is that checkExtensionMembership and
>> processExtensionTables are not on the same page.  We can't have
>> interesting = false for a table that any of the dobj.dump bits are set
>> for.
>> Arguably, we should get rid of the "interesting" flag entirely now that
>> we have dobj.dump.  I can't see that it's anything but a foot-gun.

> I had contemplated trying to get rid of the 'interesting' flag but, my
> recollection anyway, is that it needed to be set sometimes even though
> dobj.dump wasn't.  Has been a number of years though and either my
> memory or my review at the time might be faulty.

> I do agree with the general idea of trying to get rid of it though, and
> instead using dobj.dump to decide when we need to load additional info.

OK, I'll study this some more.

            regards, tom lane



Re: BUG #16655: pg_dump segfault when excluding postgis table

From: Tom Lane
Date:
I wrote:
> So I think the basic problem here is that checkExtensionMembership and
> processExtensionTables are not on the same page.  We can't have
> interesting = false for a table that any of the dobj.dump bits are set
> for.

After studying this further, I've concluded that there are two independent
issues here.  I was wrong to imagine that we can't get to dumpTableSchema
for tables that we're not going to dump; the actual intent of the code is
that it's going to run through that code anyway but not emit the
ArchiveEntry unless the DUMP_COMPONENT_DEFINITION flag is set.  (This
supports cases where we only want to dump comments, for example.)  So
dumpTableSchema really needs to cope with cases where getTableAttrs
has not loaded constraint info.  There are a lot of other loops in the
code that expect that checkexprs[] is populated when ncheck > 0, too.
Maybe none of them are reachable for a not-interesting table, but
I don't think that's the way to bet.  I think the right answer is to
redefine ncheck as being zero if we haven't loaded checkexprs[], as
per 0001 below.

0001 is sufficient to get past the proposed test case in 0002.  However,
I think processExtensionTables is also buggy here.  It does need to set
the "interesting" flag to true if it decides we need to dump the table's
data, because we have to have the per-attribute data to support that,
and "interesting" might not ever get set true on the table otherwise.
But *it is not OK to set "interesting" to false just because we don't
decide to dump the data*.  If it's been set true for some other reason,
that other reason didn't just vanish.  So I think we also need 0003.

BTW, the reason we can't get rid of the "interesting" flag is that
we need to set it on parent tables of tables we need to dump,
even if we're not going to dump the parents themselves (hence their
dobj.dump is zero).  Otherwise we won't collect the subsidiary data
we need to figure out which parts of the child's definition are
inherited.
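
Roughly, that pass does something of this shape (a simplified sketch
with approximate names, not the exact pg_dump source): a dumpable child
forces its parents to be marked interesting so that getTableAttrs still
loads their column data.

    /* simplified sketch of the parent-marking step */
    if (child->dobj.dump != 0)
    {
        for (int k = 0; k < child->numParents; k++)
            child->parents[k]->interesting = true;  /* parent's own
                                                     * dobj.dump may be 0 */
    }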

            regards, tom lane

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 76320468ba..445057a00e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -6772,7 +6772,7 @@ getTables(Archive *fout, int *numTables)
             tblinfo[i].reloftype = NULL;
         else
             tblinfo[i].reloftype = pg_strdup(PQgetvalue(res, i, i_reloftype));
-        tblinfo[i].ncheck = atoi(PQgetvalue(res, i, i_relchecks));
+        tblinfo[i].relchecks = atoi(PQgetvalue(res, i, i_relchecks));
         if (PQgetisnull(res, i, i_owning_tab))
         {
             tblinfo[i].owning_tab = InvalidOid;
@@ -8728,7 +8728,7 @@ getTableAttrs(Archive *fout, TableInfo *tblinfo, int numTables)
          * Get info about table CHECK constraints.  This is skipped for a
          * data-only dump, as it is only needed for table schemas.
          */
-        if (!dopt->dataOnly && tbinfo->ncheck > 0)
+        if (!dopt->dataOnly && tbinfo->relchecks > 0)
         {
             ConstraintInfo *constrs;
             int            numConstrs;
@@ -8780,17 +8780,18 @@ getTableAttrs(Archive *fout, TableInfo *tblinfo, int numTables)
             res = ExecuteSqlQuery(fout, q->data, PGRES_TUPLES_OK);

             numConstrs = PQntuples(res);
-            if (numConstrs != tbinfo->ncheck)
+            if (numConstrs != tbinfo->relchecks)
             {
                 pg_log_error(ngettext("expected %d check constraint on table \"%s\" but found %d",
                                       "expected %d check constraints on table \"%s\" but found %d",
-                                      tbinfo->ncheck),
-                             tbinfo->ncheck, tbinfo->dobj.name, numConstrs);
+                                      tbinfo->relchecks),
+                             tbinfo->relchecks, tbinfo->dobj.name, numConstrs);
                 pg_log_error("(The system catalogs might be corrupted.)");
                 exit_nicely(1);
             }

             constrs = (ConstraintInfo *) pg_malloc(numConstrs * sizeof(ConstraintInfo));
+            tbinfo->ncheck = numConstrs;
             tbinfo->checkexprs = constrs;

             for (int j = 0; j < numConstrs; j++)
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index e0b42e8391..192388aacf 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -281,7 +281,7 @@ typedef struct _tableInfo
     Oid            toast_oid;        /* toast table's OID, or 0 if none */
     uint32        toast_frozenxid;    /* toast table's relfrozenxid, if any */
     uint32        toast_minmxid;    /* toast table's relminmxid */
-    int            ncheck;            /* # of CHECK expressions */
+    int            relchecks;        /* # of CHECK expressions per pg_class */
     char       *reloftype;        /* underlying type for typed table */
     Oid            foreign_server; /* foreign server oid, if applicable */
     /* these two are set only if table is a sequence owned by a column: */
@@ -319,6 +319,7 @@ typedef struct _tableInfo
     bool       *notnull;        /* NOT NULL constraints on attributes */
     bool       *inhNotNull;        /* true if NOT NULL is inherited */
     struct _attrDefInfo **attrdefs; /* DEFAULT expressions */
+    int            ncheck;            /* # of CHECK constraints in checkexprs[] */
     struct _constraintInfo *checkexprs; /* CHECK constraints */
     char       *partkeydef;        /* partition key definition */
     char       *partbound;        /* partition bound definition */
diff --git a/src/test/modules/test_pg_dump/t/001_base.pl b/src/test/modules/test_pg_dump/t/001_base.pl
index 78aa07ce51..714ce2368c 100644
--- a/src/test/modules/test_pg_dump/t/001_base.pl
+++ b/src/test/modules/test_pg_dump/t/001_base.pl
@@ -135,6 +135,13 @@ my %pgdump_runs = (
             "$tempdir/defaults_tar_format.tar",
         ],
     },
+    exclude_table => {
+        dump_cmd => [
+            'pg_dump',   '--exclude-table=regress_table_dumpable',
+            '--inserts', "--file=$tempdir/exclude_table.sql",
+            'postgres',
+        ],
+    },
     extension_schema => {
         dump_cmd => [
             'pg_dump', '--schema=public', '--inserts',
@@ -225,6 +232,7 @@ my %full_runs = (
     clean_if_exists => 1,
     createdb        => 1,
     defaults        => 1,
+    exclude_table   => 1,
     no_privs        => 1,
     no_owner        => 1,);

@@ -317,7 +325,8 @@ my %tests = (
         regexp => qr/^
             \QCREATE TABLE public.regress_pg_dump_table (\E
             \n\s+\Qcol1 integer NOT NULL,\E
-            \n\s+\Qcol2 integer\E
+            \n\s+\Qcol2 integer,\E
+            \n\s+\QCONSTRAINT regress_pg_dump_table_col2_check CHECK ((col2 > 0))\E
             \n\);\n/xm,
         like => { binary_upgrade => 1, },
     },
@@ -443,7 +452,8 @@ my %tests = (
         regexp => qr/^
             \QCREATE TABLE regress_pg_dump_schema.test_table (\E
             \n\s+\Qcol1 integer,\E
-            \n\s+\Qcol2 integer\E
+            \n\s+\Qcol2 integer,\E
+            \n\s+\QCONSTRAINT test_table_col2_check CHECK ((col2 > 0))\E
             \n\);\n/xm,
         like => { binary_upgrade => 1, },
     },
@@ -589,6 +599,9 @@ my %tests = (
         like => {
             extension_schema => 1,
         },
+        unlike => {
+            exclude_table => 1,
+        },
     },);

 #########################################
diff --git a/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql b/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql
index 90e461ed35..c7a35c3afa 100644
--- a/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql
+++ b/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql
@@ -5,7 +5,7 @@

 CREATE TABLE regress_pg_dump_table (
     col1 serial,
-    col2 int
+    col2 int check (col2 > 0)
 );

 CREATE SEQUENCE regress_pg_dump_seq;
@@ -14,7 +14,7 @@ CREATE SEQUENCE regress_seq_dumpable;
 SELECT pg_catalog.pg_extension_config_dump('regress_seq_dumpable', '');

 CREATE TABLE regress_table_dumpable (
-    col1 int
+    col1 int check (col1 > 0)
 );
 SELECT pg_catalog.pg_extension_config_dump('regress_table_dumpable', '');

@@ -34,7 +34,7 @@ CREATE ACCESS METHOD regress_test_am TYPE INDEX HANDLER bthandler;
 -- this extension.
 CREATE TABLE regress_pg_dump_schema.test_table (
     col1 int,
-    col2 int
+    col2 int check (col2 > 0)
 );
 GRANT SELECT ON regress_pg_dump_schema.test_table TO regress_dump_test_role;

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 76320468ba..e3d61eafd1 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -17905,6 +17906,7 @@ processExtensionTables(Archive *fout, ExtensionInfo extinfo[],

                 if (dumpobj)
                 {
+                    configtbl->interesting = true;
                     makeTableDataInfo(dopt, configtbl);
                     if (configtbl->dataObj != NULL)
                     {
@@ -17912,8 +17914,6 @@ processExtensionTables(Archive *fout, ExtensionInfo extinfo[],
                             configtbl->dataObj->filtercond = pg_strdup(extconditionarray[j]);
                     }
                 }
-
-                configtbl->interesting = dumpobj;
             }
         }
         if (extconfigarray)

Re: BUG #16655: pg_dump segfault when excluding postgis table

From: Tom Lane
Date:
I wrote:
> 0001 is sufficient to get past the proposed test case in 0002.  However,
> I think processExtensionTables is also buggy here.  It does need to set
> the "interesting" flag to true if it decides we need to dump the table's
> data, because we have to have the per-attribute data to support that,
> and "interesting" might not ever get set true on the table otherwise.
> But *it is not OK to set "interesting" to false just because we don't
> decide to dump the data*.  If it's been set true for some other reason,
> that other reason didn't just vanish.  So I think we also need 0003.

Upon excavating in the git history, I discovered that that error was
introduced in 3eb3d3e78 [1], which explains why the OP thought it was
new in v13 --- no other branch has shipped the buggy code yet.

It's not entirely clear to me why we don't get a crash on the case in
pre-v12 branches, but I'm inclined to apply these patches all the way
back anyway, as they definitely seem like robustness improvements
whether or not a given branch accidentally fails to fail.

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/18048b44-3414-b983-8c7c-9165b177900d%402ndQuadrant.com



Re: BUG #16655: pg_dump segfault when excluding postgis table

From: Tom Lane
Date:
I wrote:
> It's not entirely clear to me why we don't get a crash on the case in
> pre-v12 branches, but I'm inclined to apply these patches all the way
> back anyway, as they definitely seem like robustness improvements
> whether or not a given branch accidentally fails to fail.

OK, so after a bit of "git bisect"ing, things are clearer:

* Before 3eb3d3e78, there was no code path (or at least no known one)
that would allow control to arrive at dumpTableSchema without having
done getTableAttrs.  So my original thought that that was a logical
prerequisite was correct.

* Before v12, dumpTableData_insert would kind-of-accidentally work
without having done getTableAttrs, because it got all the needed
info from the PGresult of its "SELECT * FROM ONLY table" query,
rather than looking at the per-column data in the TableInfo.
v12 added a check on the attgenerated[] data, causing that to not
work anymore, which was the reported bug leading to 3eb3d3e78.

* However, the dumpTableData_copy code path also was only
accidentally working without getTableAttrs.  That left numattrs
zero in the TableInfo, causing fmtCopyColumnList to return an
empty string, which works almost all the time for both the
data fetching and the eventual reload.  It'd only fail obviously
in corner cases where the column order needed to be different at
reload; so it's not so surprising that we'd not noticed.
Nonetheless, *both* the COPY and INSERT code paths are dependent
on getTableAttrs, contrary to the discussion in the other thread;
and the COPY case's dependency goes back further.
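
As a self-contained illustration of that last point (a simplified
stand-in, not the real fmtCopyColumnList, which also has to deal with
dropped columns and the like): with zero known attributes the column
list silently comes out empty, so the emitted COPY depends on the column
order matching at reload time.

    #include <stdio.h>
    #include <string.h>

    /* Simplified stand-in for fmtCopyColumnList(): builds "(a, b, ...)"
     * from the attribute names getTableAttrs() would have loaded, or
     * returns "" when it knows of no columns at all. */
    static const char *
    fake_copy_column_list(char **attnames, int numatts,
                          char *buf, size_t buflen)
    {
        if (numatts == 0)
            return "";          /* the skipped-getTableAttrs() case */

        strncpy(buf, "(", buflen);
        for (int i = 0; i < numatts; i++)
        {
            if (i > 0)
                strncat(buf, ", ", buflen - strlen(buf) - 1);
            strncat(buf, attnames[i], buflen - strlen(buf) - 1);
        }
        strncat(buf, ")", buflen - strlen(buf) - 1);
        return buf;
    }

    int
    main(void)
    {
        char    buf[256];
        char   *cols[] = {"col1", "col2"};

        /* normal case: attributes were loaded by getTableAttrs() */
        printf("COPY t %s FROM stdin;\n",
               fake_copy_column_list(cols, 2, buf, sizeof(buf)));

        /* getTableAttrs() skipped: numatts stayed 0, the list vanishes */
        printf("COPY t %s FROM stdin;\n",
               fake_copy_column_list(NULL, 0, buf, sizeof(buf)));

        return 0;
    }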

I conclude that:

* My 0003 is the actually important fix.  0001 just adds some
robustness, which is not a bad thing but it doesn't seem critical.

* I should revise 0002 so that it exercises the COPY path and
checks that the correct column list is given.  I expect this will
show that there's a problem further back than v12.

* We should add Asserts in both dumpTableData and dumpTableSchema
to verify that "interesting" is set, in hopes of catching any
future mistakes of this ilk.
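
Something along these lines at the top of each of those functions (a
sketch only; exact placement and wording up to the eventual patch):

    /* must not get here unless getTableAttrs() processed this table */
    Assert(tbinfo->interesting);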

            regards, tom lane



Re: BUG #16655: pg_dump segfault when excluding postgis table

From: Stephen Frost
Date:
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> >> Yeah, I can reproduce this here.  You don't really need postgis.
> >> What I did was
> >> 1. Install the src/test/modules/test_pg_dump module into an
> >> empty database.
> >> 2. alter table regress_table_dumpable add check(col1 > 0);
>
> > While we certainly shouldn't just segfault, I don't think this is
> > actually something that's intended or should be supported- the extension
> > defines the table structure and users are really only expected to change
> > permissions (and related bits) on the table (and, to some extent, even
> > then only if they're familiar enough with the extension to know that the
> > extension understands that a user may change the privileges
> > post-install).
>
> I think you misunderstand my point here.  pg_dump would still segfault
> if the CHECK constraint had been created by the extension (which, indeed,
> is what I'm thinking of doing to convert this into a regression test).
> Presumably, the reason it's failing on postgis is that spatial_ref_sys
> has some extension-defined CHECK constraints.

Yes, I had misunderstood what the use-case being discussed here was,
having focused on just the stripped-down test case you were using.

> I'm not intending to suggest that pg_dump ought to understand this
> situation enough to dump the CHECK constraint --- I'm just describing a
> quick way of reproducing the crash, without having to install postgis.
> I think the actual case of interest is "-T extension_table should not
> result in a crash when extension_table has CHECK constraints".

I agree we shouldn't be crashing in such a case.

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> BTW, the reason we can't get rid of the "interesting" flag is that
> we need to set it on parent tables of tables we need to dump,
> even if we're not going to dump the parents themselves (hence their
> dobj.dump is zero).  Otherwise we won't collect the subsidiary data
> we need to figure out which parts of the child's definition are
> inherited.

Ah, yes, that matches my recollection.

Should we ever drop support for inheritance, perhaps then we could be
rid of it.

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> I wrote:
> > It's not entirely clear to me why we don't get a crash on the case in
> > pre-v12 branches, but I'm inclined to apply these patches all the way
> > back anyway, as they definitely seem like robustness improvements
> > whether or not a given branch accidentally fails to fail.
>
> OK, so after a bit of "git bisect"ing, things are clearer:
>
> * Before 3eb3d3e78, there was no code path (or at least no known one)
> that would allow control to arrive at dumpTableSchema without having
> done getTableAttrs.  So my original thought that that was a logical
> prerequisite was correct.
>
> * Before v12, dumpTableData_insert would kind-of-accidentally work
> without having done getTableAttrs, because it got all the needed
> info from the PGresult of its "SELECT * FROM ONLY table" query,
> rather than looking at the per-column data in the TableInfo.
> v12 added a check on the attgenerated[] data, causing that to not
> work anymore, which was the reported bug leading to 3eb3d3e78.
>
> * However, the dumpTableData_copy code path also was only
> accidentally working without getTableAttrs.  That left numattrs
> zero in the TableInfo, causing fmtCopyColumnList to return an
> empty string, which works almost all the time for both the
> data fetching and the eventual reload.  It'd only fail obviously
> in corner cases where the column order needed to be different at
> reload; so it's not so surprising that we'd not noticed.
> Nonetheless, *both* the COPY and INSERT code paths are dependent
> on getTableAttrs, contrary to the discussion in the other thread;
> and the COPY case's dependency goes back further.
>
> I conclude that:
>
> * My 0003 is the actually important fix.  0001 just adds some
> robustness, which is not a bad thing but it doesn't seem critical.
>
> * I should revise 0002 so that it exercises the COPY path and
> checks that the correct column list is given.  I expect this will
> show that there's a problem further back than v12.
>
> * We should add Asserts in both dumpTableData and dumpTableSchema
> to verify that "interesting" is set, in hopes of catching any
> future mistakes of this ilk.

This all sounds like a reasonable approach to me.  I've gone back and
looked through things a bit and agree that processExtensionTables really
should be setting interesting to true for extension config tables when
we decide we want to include them.  Your 0003 patch looks correct to me,
and it does seem like we need to go all the way back with that.

Thanks,

Stephen

Attachments

Re: BUG #16655: pg_dump segfault when excluding postgis table

From: Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> This all sounds like a reasonable approach to me.  I've gone back and
> looked through things a bit and agree that processExtensionTables really
> should be setting interesting to true for extension config tables when
> we decide we want to include them.  Your 0003 patch looks correct to me,
> and it does seem like we need to go all the way back with that.

Pushed now, thanks for looking it over.

I ended up dropping 0001 (the ncheck refactoring).  That's not really
relevant to the bug fix, and it occurred to me that dumping core if
the checkexprs data isn't there isn't such a bad thing.  0001 would
have led us to silently act as though the table has no CHECK constraints,
contrary to reality, if we reached one of those loops without having
loaded the requisite data.  Crashing is better --- think of it as a
free Assert ;-).

            regards, tom lane