Обсуждение: BUG #17756: Invalid replica indentity set order in a dump

Поиск
Список
Период
Сортировка

BUG #17756: Invalid replica indentity set order in a dump

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      17756
Logged by:          Sergey Belyashov
Email address:      sergey.belyashov@gmail.com
PostgreSQL version: 14.5
Operating system:   Debian Linux x86_64
Description:

Some database have a partitioned table with unique index used as REPLICA
IDENTITY. pg_dump places ALTER TABLE tbl REPLICA IDENTITY USING INDEX
some_idx after partial index creation. But Postgresql fails to restore such
dump because replica identity cannot be set on invalid index some_idx:
partition indices are not created.


Re: BUG #17756: Invalid replica indentity set order in a dump

От
Tom Lane
Дата:
PG Bug reporting form <noreply@postgresql.org> writes:
> Some database have a partitioned table with unique index used as REPLICA
> IDENTITY. pg_dump places ALTER TABLE tbl REPLICA IDENTITY USING INDEX
> some_idx after partial index creation. But Postgresql fails to restore such
> dump because replica identity cannot be set on invalid index some_idx:
> partition indices are not created.

Please provide a concrete example, preferably a SQL script to create
a database that triggers the problem.  There are enough variables
here that nobody is likely to be excited about trying to reverse-
engineer a test case from only this amount of detail.

            regards, tom lane



Re: BUG #17756: Invalid replica indentity set order in a dump

От
Sergey Belyashov
Дата:
SQL:
create database testdb;
\c testdb
create table tbl (id integer not null primary key) partition by list (id);
create table tbl_1 partition of tbl for values in (1);
alter table tbl replica identity using index tbl_pkey;

Next do:
$ pg_dump testdb >testdb.sql
$ psql testdb -c "drop table tbl"
$ psql testdb <testdb.sql

result:
...
ALTER TABLE
ERROR:  cannot use invalid index "tbl_pkey" as replica identity
...

Reproduced in Postgresql 15.1 too.

Sergey Belyashov

пт, 20 янв. 2023 г. в 18:42, Tom Lane <tgl@sss.pgh.pa.us>:
>
> PG Bug reporting form <noreply@postgresql.org> writes:
> > Some database have a partitioned table with unique index used as REPLICA
> > IDENTITY. pg_dump places ALTER TABLE tbl REPLICA IDENTITY USING INDEX
> > some_idx after partial index creation. But Postgresql fails to restore such
> > dump because replica identity cannot be set on invalid index some_idx:
> > partition indices are not created.
>
> Please provide a concrete example, preferably a SQL script to create
> a database that triggers the problem.  There are enough variables
> here that nobody is likely to be excited about trying to reverse-
> engineer a test case from only this amount of detail.
>
>                         regards, tom lane



Re: BUG #17756: Invalid replica indentity set order in a dump

От
Tom Lane
Дата:
Sergey Belyashov <sergey.belyashov@gmail.com> writes:
> SQL:
> create database testdb;
> \c testdb
> create table tbl (id integer not null primary key) partition by list (id);
> create table tbl_1 partition of tbl for values in (1);
> alter table tbl replica identity using index tbl_pkey;

> Next do:
> $ pg_dump testdb >testdb.sql
> $ psql testdb -c "drop table tbl"
> $ psql testdb <testdb.sql

> result:
> ...
> ALTER TABLE
> ERROR:  cannot use invalid index "tbl_pkey" as replica identity

Thanks for the test case.  So the problem occurs because pg_dump dumps
the commands in this order:

    ALTER TABLE ONLY public.tbl
        ADD CONSTRAINT tbl_pkey PRIMARY KEY (id);

    ALTER TABLE ONLY public.tbl REPLICA IDENTITY USING INDEX tbl_pkey;

    ALTER TABLE ONLY public.tbl_1
        ADD CONSTRAINT tbl_1_pkey PRIMARY KEY (id);

    ALTER INDEX public.tbl_pkey ATTACH PARTITION public.tbl_1_pkey;

but the backend won't take the ALTER REPLICA IDENTITY command
until after the ATTACH PARTITION.

We could probably make pg_dump emit things in the order that works,
but it'd be a significant amount of extra complication there
(the ALTER REPLICA IDENTITY command couldn't be treated as just
part of the index definition).

I wonder why it is that the backend rejects this sequence.
I see that you can do this:

regression=# create table tbl (id integer not null primary key) partition by list (id);
CREATE TABLE
regression=# alter table tbl replica identity using index tbl_pkey;
ALTER TABLE

and it doesn't seem like the partitioned index is notably more
valid in this state than in the one that pg_dump has created.
So I think it might be better to fix the backend to allow this
sequence of operations.

            regards, tom lane



Re: BUG #17756: Invalid replica indentity set order in a dump

От
Tom Lane
Дата:
I wrote:
> I wonder why it is that the backend rejects this sequence.
> I see that you can do this:
> regression=# create table tbl (id integer not null primary key) partition by list (id);
> CREATE TABLE
> regression=# alter table tbl replica identity using index tbl_pkey;
> ALTER TABLE
> and it doesn't seem like the partitioned index is notably more
> valid in this state than in the one that pg_dump has created.
> So I think it might be better to fix the backend to allow this
> sequence of operations.

I looked into this, and I think we basically could just drop the
restriction in ATExecReplicaIdentity that rejects marking an invalid
index as replica identity.  If it is invalid, then the relcache won't
treat it as being the live replica identity (cf RelationGetIndexList),
so nothing will happen until it becomes valid; but there's no reason
why we can't mark it as identity in advance of that.  We support
not-yet-valid primary keys, so why not replica identity?

In checking this, I found a comment in index.c claiming

             * ALTER TABLE assumes that indisreplident cannot be set for
             * invalid indexes.

which was added by Michael's 9511fb37a.  The back story for that
indicates that Michael was worried about multiple indexes of a
relation becoming marked with indisreplident.  However, as far
as I can see the only actual issue there is that
relation_mark_replica_identity is too trusting, and that looks
like either sloppy coding or premature optimization.  I think
we could just rip out its early-exit path and fix it to positively
guarantee that no more than one index is marked indisreplident.

In short, I propose the attached.  This'd need to be back-patched,
which 9511fb37a wasn't, so maybe we should also back-patch 9511fb37a?
I don't think it's really necessary though.

            regards, tom lane

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index e6579f2979..41b16cb89b 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3482,9 +3482,8 @@ index_set_state_flags(Oid indexId, IndexStateFlagsAction action)
              * CONCURRENTLY that failed partway through.)
              *
              * Note: the CLUSTER logic assumes that indisclustered cannot be
-             * set on any invalid index, so clear that flag too.  Similarly,
-             * ALTER TABLE assumes that indisreplident cannot be set for
-             * invalid indexes.
+             * set on any invalid index, so clear that flag too.  For
+             * cleanliness, also clear indisreplident.
              */
             indexForm->indisvalid = false;
             indexForm->indisclustered = false;
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7c697a285b..1293545947 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -15773,7 +15773,10 @@ ATExecDropOf(Relation rel, LOCKMODE lockmode)
  * relation_mark_replica_identity: Update a table's replica identity
  *
  * Iff ri_type = REPLICA_IDENTITY_INDEX, indexOid must be the Oid of a suitable
- * index. Otherwise, it should be InvalidOid.
+ * index. Otherwise, it must be InvalidOid.
+ *
+ * Caller had better hold an exclusive lock on the relation, as the results
+ * of running two of these concurrently wouldn't be pretty.
  */
 static void
 relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid,
@@ -15785,7 +15788,6 @@ relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid,
     HeapTuple    pg_index_tuple;
     Form_pg_class pg_class_form;
     Form_pg_index pg_index_form;
-
     ListCell   *index;

     /*
@@ -15807,29 +15809,7 @@ relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid,
     heap_freetuple(pg_class_tuple);

     /*
-     * Check whether the correct index is marked indisreplident; if so, we're
-     * done.
-     */
-    if (OidIsValid(indexOid))
-    {
-        Assert(ri_type == REPLICA_IDENTITY_INDEX);
-
-        pg_index_tuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexOid));
-        if (!HeapTupleIsValid(pg_index_tuple))
-            elog(ERROR, "cache lookup failed for index %u", indexOid);
-        pg_index_form = (Form_pg_index) GETSTRUCT(pg_index_tuple);
-
-        if (pg_index_form->indisreplident)
-        {
-            ReleaseSysCache(pg_index_tuple);
-            return;
-        }
-        ReleaseSysCache(pg_index_tuple);
-    }
-
-    /*
-     * Clear the indisreplident flag from any index that had it previously,
-     * and set it for any index that should have it now.
+     * Update the per-index indisreplident flags correctly.
      */
     pg_index = table_open(IndexRelationId, RowExclusiveLock);
     foreach(index, RelationGetIndexList(rel))
@@ -15843,19 +15823,23 @@ relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid,
             elog(ERROR, "cache lookup failed for index %u", thisIndexOid);
         pg_index_form = (Form_pg_index) GETSTRUCT(pg_index_tuple);

-        /*
-         * Unset the bit if set.  We know it's wrong because we checked this
-         * earlier.
-         */
-        if (pg_index_form->indisreplident)
+        if (thisIndexOid == indexOid)
         {
-            dirty = true;
-            pg_index_form->indisreplident = false;
+            /* Set the bit if not already set. */
+            if (!pg_index_form->indisreplident)
+            {
+                dirty = true;
+                pg_index_form->indisreplident = true;
+            }
         }
-        else if (thisIndexOid == indexOid)
+        else
         {
-            dirty = true;
-            pg_index_form->indisreplident = true;
+            /* Unset the bit if set. */
+            if (pg_index_form->indisreplident)
+            {
+                dirty = true;
+                pg_index_form->indisreplident = false;
+            }
         }

         if (dirty)
@@ -15867,7 +15851,9 @@ relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid,
             /*
              * Invalidate the relcache for the table, so that after we commit
              * all sessions will refresh the table's replica identity index
-             * before attempting any UPDATE or DELETE on the table.
+             * before attempting any UPDATE or DELETE on the table.  (If we
+             * changed the table's pg_class row above, then a relcache inval
+             * is already queued due to that; but we might not have.)
              */
             CacheInvalidateRelcache(rel);
         }
@@ -15952,12 +15938,6 @@ ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKMODE lockmode
                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                  errmsg("cannot use partial index \"%s\" as replica identity",
                         RelationGetRelationName(indexRel))));
-    /* And neither are invalid indexes. */
-    if (!indexRel->rd_index->indisvalid)
-        ereport(ERROR,
-                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-                 errmsg("cannot use invalid index \"%s\" as replica identity",
-                        RelationGetRelationName(indexRel))));

     /* Check index for nullable columns. */
     for (key = 0; key < IndexRelationGetNumberOfKeyAttributes(indexRel); key++)