Thread: Replica Identity check of partition table on subscriber
Hi hackers,

I saw a problem in logical replication: when the target table on the subscriber is a partitioned table, only the Replica Identity of the partitioned (parent) table is checked for consistency with the publisher; the Replica Identity of the partition is not checked.

For example:

-- publisher --
create table tbl (a int not null, b int);
create unique index on tbl (a);
alter table tbl replica identity using index tbl_a_idx;
create publication pub for table tbl;

-- subscriber --
-- table tbl (parent table) has an RI index, while table child has no RI index.
create table tbl (a int not null, b int) partition by range (a);
create table child partition of tbl default;
create unique index on tbl (a);
alter table tbl replica identity using index tbl_a_idx;
create subscription sub connection 'port=5432 dbname=postgres' publication pub;

-- publisher --
insert into tbl values (11,11);
update tbl set a=a+1;

This caused an assertion failure on the subscriber:
TRAP: FailedAssertion("OidIsValid(idxoid) || (remoterel->replident == REPLICA_IDENTITY_FULL)", File: "worker.c", Line: 2088, PID: 1616523)

The backtrace is attached.

We got the assertion failure because idxoid is invalid, as table child has no Replica Identity or Primary Key. There is a check in check_relation_updatable(), but what it checked is table tbl (the parent table), which passed the check.

I think one approach to fix this is to check the target partition in this case, instead of the partitioned table.

While trying to fix it, I saw some other problems with updating the partition map cache:

a) In logicalrep_partmap_invalidate_cb(), the type of the entry in LogicalRepPartMap should be LogicalRepPartMapEntry, instead of LogicalRepRelMapEntry.

b) logicalrep_partition_open() didn't check whether the entry is valid.

c) When the publisher sends a new relation mapping, only the relation map cache is updated; the partition map cache isn't. I think it should also be updated, because it contains remote relation information, too.
Attached are two patches which try to fix these issues. The 0001 patch fixes the above three problems with the partition map cache. The 0002 patch fixes the assertion failure by checking the Replica Identity of the partition when the target table is a partitioned table.

Thanks to Hou Zhijie for helping me with the first patch. I will add a test for it later if no one objects to this fix.

Regards,
Shi yu
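The shape of the proposed fix can be sketched outside PostgreSQL. The model below (Python, purely illustrative -- the real code is C in the apply worker; the dict fields are hypothetical stand-ins for catalog state) shows why the parent-only check passes while a check on the leaf partition correctly fails in the scenario above:

```python
def mark_updatable(local_rel, remote_replident):
    """Model of the updatability rule: the relation is updatable if it has a
    usable replica-identity index, or the remote side uses REPLICA IDENTITY FULL."""
    return local_rel["ri_index"] is not None or remote_replident == "full"


def check_target(rel, remote_replident):
    """Proposed fix, modeled: for a partitioned target, perform the check on
    the leaf partition that actually receives the row, not on the parent."""
    target = rel["partition"] if rel["kind"] == "partitioned" else rel
    return mark_updatable(target, remote_replident)


# Parent tbl has an RI index; leaf partition child has none (the bug scenario).
child = {"kind": "regular", "ri_index": None}
tbl = {"kind": "partitioned", "ri_index": "tbl_a_idx", "partition": child}

print(mark_updatable(tbl, "index"))  # True  -- the old parent-only check passes
print(check_target(tbl, "index"))    # False -- the leaf check correctly fails
```

In the real code, a failed check marks the relation as not updatable rather than returning a boolean; the model only captures which relation the rule is evaluated against.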
Attachments
On Wed, Jun 8, 2022 at 2:17 PM shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com> wrote:
>
> I saw a problem in logical replication, when the target table on subscriber is a
> partitioned table, it only checks whether the Replica Identity of partitioned
> table is consistent with the publisher, and doesn't check Replica Identity of
> the partition.
> ...
>
> It caused an assertion failure on subscriber:
> TRAP: FailedAssertion("OidIsValid(idxoid) || (remoterel->replident == REPLICA_IDENTITY_FULL)", File: "worker.c", Line: 2088, PID: 1616523)
>
> The backtrace is attached.
>
> We got the assertion failure because idxoid is invalid, as table child has no
> Replica Identity or Primary Key. We have a check in check_relation_updatable(),
> but what it checked is table tbl (the parent table) and it passed the check.
>

I can reproduce the problem. This seems to be a problem since commit f1ac27bf (Add logical replication support to replicate into partitioned tables), so adding Amit L. and Peter E.

> I think one approach to fix it is to check the target partition in this case,
> instead of the partitioned table.
>

This approach sounds reasonable to me. One minor point:

+/*
+ * Check that replica identity matches.
+ *
+ * We allow for stricter replica identity (fewer columns) on subscriber as
+ * that will not stop us from finding unique tuple. IE, if publisher has
+ * identity (id,timestamp) and subscriber just (id) this will not be a
+ * problem, but in the opposite scenario it will.
+ *
+ * Don't throw any error here just mark the relation entry as not updatable,
+ * as replica identity is only for updates and deletes but inserts can be
+ * replicated even without it.
+ */
+static void
+logicalrep_check_updatable(LogicalRepRelMapEntry *entry)

Can we name this function logicalrep_rel_mark_updatable, since that is what it does? If so, change the comments as well.

> When trying to fix it, I saw some other problems about updating partition map
> cache.
>
> a) In logicalrep_partmap_invalidate_cb(), the type of the entry in
> LogicalRepPartMap should be LogicalRepPartMapEntry, instead of
> LogicalRepRelMapEntry.
>
> b) In logicalrep_partition_open(), it didn't check if the entry is valid.
>
> c) When the publisher send new relation mapping, only relation map cache will be
> updated, and partition map cache wouldn't. I think it also should be updated
> because it has remote relation information, too.
>

Is there any test case that can show the problem due to your above observations?

--
With Regards,
Amit Kapila.
Hi Amit,

On Thu, Jun 9, 2022 at 8:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Jun 8, 2022 at 2:17 PM shiy.fnst@fujitsu.com
> <shiy.fnst@fujitsu.com> wrote:
> > I saw a problem in logical replication, when the target table on subscriber is a
> > partitioned table, it only checks whether the Replica Identity of partitioned
> > table is consistent with the publisher, and doesn't check Replica Identity of
> > the partition.
> > ...
> >
> > It caused an assertion failure on subscriber:
> > TRAP: FailedAssertion("OidIsValid(idxoid) || (remoterel->replident == REPLICA_IDENTITY_FULL)", File: "worker.c", Line: 2088, PID: 1616523)
> >
> > The backtrace is attached.
> >
> > We got the assertion failure because idxoid is invalid, as table child has no
> > Replica Identity or Primary Key. We have a check in check_relation_updatable(),
> > but what it checked is table tbl (the parent table) and it passed the check.
>
> I can reproduce the problem. This seems to be the problem since commit
> f1ac27bf (Add logical replication support to replicate into
> partitioned tables), so adding Amit L. and Peter E.

Thanks, I can see the problem. I have looked at the other mentioned problems with the code too, and agree they look like bugs. Both patches look to be on the right track to fix the issues, but I will look more closely to see if I have anything to add.

--
Thanks,
Amit Langote
EDB: http://www.enterprisedb.com
On Thu, June 9, 2022 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > I think one approach to fix it is to check the target partition in this case,
> > instead of the partitioned table.
> >
>
> This approach sounds reasonable to me. One minor point:
>
> +/*
> + * Check that replica identity matches.
> + *
> + * We allow for stricter replica identity (fewer columns) on subscriber as
> + * that will not stop us from finding unique tuple. IE, if publisher has
> + * identity (id,timestamp) and subscriber just (id) this will not be a
> + * problem, but in the opposite scenario it will.
> + *
> + * Don't throw any error here just mark the relation entry as not updatable,
> + * as replica identity is only for updates and deletes but inserts can be
> + * replicated even without it.
> + */
> +static void
> +logicalrep_check_updatable(LogicalRepRelMapEntry *entry)
>
> Can we name this function as logicalrep_rel_mark_updatable as we are
> doing that? If so, change the comments as well.
>

OK. Modified.

> > When trying to fix it, I saw some other problems about updating partition map
> > cache.
> >
> > a) In logicalrep_partmap_invalidate_cb(), the type of the entry in
> > LogicalRepPartMap should be LogicalRepPartMapEntry, instead of
> > LogicalRepRelMapEntry.
> >
> > b) In logicalrep_partition_open(), it didn't check if the entry is valid.
> >
> > c) When the publisher send new relation mapping, only relation map cache will be
> > updated, and partition map cache wouldn't. I think it also should be updated
> > because it has remote relation information, too.
> >
>
> Is there any test case that can show the problem due to your above
> observations?
>

Please see the following case.
-- publisher
create table tbl (a int primary key, b int);
create publication pub for table tbl;

-- subscriber
create table tbl (a int primary key, b int, c int) partition by range (a);
create table child partition of tbl default;

-- publisher, make cache
insert into tbl values (1,1);
update tbl set a=a+1;
alter table tbl add column c int;
update tbl set c=1 where a=2;

-- subscriber
postgres=# select * from tbl;
 a | b | c
---+---+---
 2 | 1 |
(1 row)

The update of column c failed on the subscriber. After applying the first patch, it works fine. I have added this case to the first patch, and also added a test case for the second patch.

Attached are the new patches.

Regards,
Shi yu
Attachments
Hello,

On Wed, Jun 8, 2022 at 5:47 PM shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com> wrote:
> Hi hackers,
>
> I saw a problem in logical replication, when the target table on subscriber is a
> partitioned table, it only checks whether the Replica Identity of partitioned
> table is consistent with the publisher, and doesn't check Replica Identity of
> the partition.
>
> For example:
> -- publisher --
> create table tbl (a int not null, b int);
> create unique INDEX ON tbl (a);
> alter table tbl replica identity using INDEX tbl_a_idx;
> create publication pub for table tbl;
>
> -- subscriber --
> -- table tbl (parent table) has RI index, while table child has no RI index.
> create table tbl (a int not null, b int) partition by range (a);
> create table child partition of tbl default;
> create unique INDEX ON tbl (a);
> alter table tbl replica identity using INDEX tbl_a_idx;
> create subscription sub connection 'port=5432 dbname=postgres' publication pub;
>
> -- publisher --
> insert into tbl values (11,11);
> update tbl set a=a+1;
>
> It caused an assertion failure on subscriber:
> TRAP: FailedAssertion("OidIsValid(idxoid) || (remoterel->replident == REPLICA_IDENTITY_FULL)", File: "worker.c", Line: 2088, PID: 1616523)
>
> The backtrace is attached.
>
> We got the assertion failure because idxoid is invalid, as table child has no
> Replica Identity or Primary Key. We have a check in check_relation_updatable(),
> but what it checked is table tbl (the parent table) and it passed the check.
>
> I think one approach to fix it is to check the target partition in this case,
> instead of the partitioned table.

That makes sense. A normal user update of a partitioned table will only perform CheckCmdReplicaIdentity() for the leaf partitions, and logical replication updates should do the same.
I may have been confused at the time into thinking that ALTER TABLE REPLICA IDENTITY ensures that the replica identities of all relations in a partition tree are forced to be the same at all times, though it seems that the patch to do so [1] didn't actually get in. I guess adding a test case would have helped.

> When trying to fix it, I saw some other problems about updating partition map
> cache.
>
> a) In logicalrep_partmap_invalidate_cb(), the type of the entry in
> LogicalRepPartMap should be LogicalRepPartMapEntry, instead of
> LogicalRepRelMapEntry.

Indeed.

> b) In logicalrep_partition_open(), it didn't check if the entry is valid.

Yeah, that's bad. Actually, it seems that the localrelvalid stuff for LogicalRepRelMapEntry came in 3d65b0593c5, but it's likely we missed in that commit that LogicalRepPartMapEntrys contain copies of, not pointers to, the relevant parent's entry. This patch fixes that oversight.

> c) When the publisher send new relation mapping, only relation map cache will be
> updated, and partition map cache wouldn't. I think it also should be updated
> because it has remote relation information, too.

Yes, again a result of forgetting that the partition map entries have copies of relation map entries.

+logicalrep_partmap_invalidate

I wonder why not call this logicalrep_partmap_update() to go with logicalrep_relmap_update()? It seems confusing to have logicalrep_partmap_invalidate() right next to logicalrep_partmap_invalidate_cb().

+/*
+ * Invalidate the existing entry in the partition map.

I think logicalrep_partmap_invalidate() may update *multiple* entries, because the hash table scan may find multiple PartMapEntrys containing a copy of the RelMapEntry with the given remoteid, that is, for multiple partitions of a given local parent table mapped to that remote relation.
So, please fix the comment as:

Invalidate/Update the entries in the partition map that refer to 'remoterel'

Likewise:

+	/* Invalidate the corresponding partition map as well */

Maybe, this should say:

Also update all entries in the partition map that refer to 'remoterel'.

In 0002:

+logicalrep_check_updatable

+1 to Amit's suggestion to use "mark" instead of "check".

@@ -1735,6 +1735,13 @@ apply_handle_insert_internal(ApplyExecutionData *edata,
 static void
 check_relation_updatable(LogicalRepRelMapEntry *rel)
 {
+	/*
+	 * If it is a partitioned table, we don't check it, we will check its
+	 * partition later.
+	 */
+	if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		return;

Why do this? I mean, logicalrep_check_updatable() doesn't care whether the relation is partitioned or not -- it does all the work regardless. I suggest we don't add this check in check_relation_updatable().

+	/* Check if we can do the update or delete. */

Maybe mention "leaf partition", as: Check if we can do the update or delete on the leaf partition.

BTW, the following hunk in patch 0002 should really be a part of 0001.

@@ -584,7 +594,6 @@ logicalrep_partition_open(LogicalRepRelMapEntry *root,
 	Oid			partOid = RelationGetRelid(partrel);
 	AttrMap    *attrmap = root->attrmap;
 	bool		found;
-	int			i;
 	MemoryContext oldctx;

> Thanks to Hou Zhijie for helping me in the first patch.

Thank you both for the fixes.

> I will add a test for it later

That would be very welcome.

--
Thanks,
Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/flat/201902041630.gpadougzab7v%40alvherre.pgsql
On Fri, Jun 10, 2022 at 2:26 PM Amit Langote <amitlangote09@gmail.com> wrote:
>
> +logicalrep_partmap_invalidate
>
> I wonder why not call this logicalrep_partmap_update() to go with
> logicalrep_relmap_update()? It seems confusing to have
> logicalrep_partmap_invalidate() right next to
> logicalrep_partmap_invalidate_cb().
>

I am wondering why we need to update the relmap in this new function logicalrep_partmap_invalidate(). I think it may be better to do it in logicalrep_partition_open() when actually required; otherwise, we end up doing a lot of work that may not be of use unless the corresponding partition is accessed. Also, it seems awkward to me that we do the same thing in this new function logicalrep_partmap_invalidate() and then also in logicalrep_partition_open() under different conditions.

One more point about 0001: it doesn't seem to have a test that validates the logicalrep_partmap_invalidate_cb() functionality. I think for that we need to alter the local table (the table on the subscriber side). Can we try to write a test for it?

--
With Regards,
Amit Kapila.
On Saturday, June 11, 2022 9:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jun 10, 2022 at 2:26 PM Amit Langote <amitlangote09@gmail.com>
> wrote:
> >
> > +logicalrep_partmap_invalidate
> >
> > I wonder why not call this logicalrep_partmap_update() to go with
> > logicalrep_relmap_update()? It seems confusing to have
> > logicalrep_partmap_invalidate() right next to
> > logicalrep_partmap_invalidate_cb().
> >
>
> I am thinking about why we need to update the relmap in this new function
> logicalrep_partmap_invalidate()? I think it may be better to do it in
> logicalrep_partition_open() when actually required, otherwise, we end up doing
> a lot of work that may not be of use unless the corresponding partition is
> accessed. Also, it seems awkward to me that we do the same thing in this new
> function logicalrep_partmap_invalidate() and then also in
> logicalrep_partition_open() under different conditions.
>
> One more point about the 0001, it doesn't seem to have a test that validates
> logicalrep_partmap_invalidate_cb() functionality. I think for that we need to
> Alter the local table (table on the subscriber side). Can we try to write a
> test for it?

Thanks to Amit L. and Amit K. for your comments! I agree with this point. Here is a new version of the patch set which tries to address all these comments.

In addition, when reviewing the code, I found some other related problems:

1)
	entry->attrmap = make_attrmap(map->maplen);
	for (attno = 0; attno < entry->attrmap->maplen; attno++)
	{
		AttrNumber	root_attno = map->attnums[attno];

		entry->attrmap->attnums[attno] = attrmap->attnums[root_attno - 1];
	}

In this case, it's possible that 'attno' points to a dropped column, in which case root_attno would be '0'. I think in this case we should just set entry->attrmap->attnums[attno] to '-1' instead of accessing attrmap->attnums[].
I included this change in 0001 because the test cases that can reproduce these problems are related (we need to ALTER the partition on the subscriber to reproduce it).

2)
	if (entry->attrmap)
		pfree(entry->attrmap);

I think we should use free_attrmap instead of pfree here to avoid a memory leak. We also need to check the attrmap in logicalrep_rel_open() and logicalrep_partition_open() and free it if needed. I am not sure whether we should put this in the 0001 patch, so I attached a separate patch for it. We can merge it later if needed.

Best regards,
Hou zj
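The dropped-column hazard in item 1) can be illustrated with a small standalone model (Python here purely for illustration; the real code is C). Following the convention described above, a 0 in the partition-to-root map marks a dropped/missing column, and -1 in the resulting map means "no remote column":

```python
def compose_attrmaps(part_to_root, root_to_remote):
    """Compose the partition->root attribute map with the root entry's
    remote-column map, guarding against dropped columns (root attno 0)."""
    composed = []
    for root_attno in part_to_root:
        if root_attno == 0:
            # Dropped/missing column in the partition: record "no remote
            # counterpart" instead of indexing root_to_remote[root_attno - 1],
            # which for 0 would read root_to_remote[-1] (out of bounds in C).
            composed.append(-1)
        else:
            composed.append(root_to_remote[root_attno - 1])
    return composed


# Partition columns map to root attnos (a -> 1, a dropped column -> 0, c -> 3);
# root columns a, b, c map to remote columns 0, 1, 2.
print(compose_attrmaps([1, 0, 3], [0, 1, 2]))  # [0, -1, 2]
```

Without the guard, the dropped column would silently pick up the last element of the remote map in Python (and read out-of-bounds memory in C), which is exactly the failure mode the fix avoids.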
Attachments
On Sat, Jun 11, 2022 at 2:36 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
>
> On Saturday, June 11, 2022 9:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jun 10, 2022 at 2:26 PM Amit Langote <amitlangote09@gmail.com>
> > wrote:
> > >
> > > +logicalrep_partmap_invalidate
> > >
> > > I wonder why not call this logicalrep_partmap_update() to go with
> > > logicalrep_relmap_update()? It seems confusing to have
> > > logicalrep_partmap_invalidate() right next to
> > > logicalrep_partmap_invalidate_cb().
> > >
> >
> > I am thinking about why we need to update the relmap in this new function
> > logicalrep_partmap_invalidate()? I think it may be better to do it in
> > logicalrep_partition_open() when actually required, otherwise, we end up doing
> > a lot of work that may not be of use unless the corresponding partition is
> > accessed. Also, it seems awkward to me that we do the same thing in this new
> > function logicalrep_partmap_invalidate() and then also in
> > logicalrep_partition_open() under different conditions.
> >
> > One more point about the 0001, it doesn't seem to have a test that validates
> > logicalrep_partmap_invalidate_cb() functionality. I think for that we need to
> > Alter the local table (table on the subscriber side). Can we try to write a
> > test for it?
>
> Thanks for Amit. L and Amit. K for your comments ! I agree with this point.
> Here is the version patch set which try to address all these comments.
>
> In addition, when reviewing the code, I found some other related
> problems in the code.
>
> 1)
> 	entry->attrmap = make_attrmap(map->maplen);
> 	for (attno = 0; attno < entry->attrmap->maplen; attno++)
> 	{
> 		AttrNumber	root_attno = map->attnums[attno];
>
> 		entry->attrmap->attnums[attno] = attrmap->attnums[root_attno - 1];
> 	}
>
> In this case, It’s possible that 'attno' points to a dropped column in which
> case the root_attno would be '0'.
> I think in this case we should just set the
> entry->attrmap->attnums[attno] to '-1' instead of accessing the
> attrmap->attnums[]. I included this change in 0001 because the testcase which
> can reproduce these problems are related (we need to ALTER the partition on
> subscriber to reproduce it).
>

Hmm, this appears to be a different issue. Can we separate out the bug-fix code for the subscriber-side issue caused by the DDL on the subscriber?

Few other comments:

+ * Note that we don't update the remoterel information in the entry here,
+ * we will update the information in logicalrep_partition_open to save
+ * unnecessary work.
+ */
+void
+logicalrep_partmap_invalidate(LogicalRepRelation *remoterel)

/to save/to avoid

Also, I agree with Amit L. that it is confusing to have logicalrep_partmap_invalidate() right next to logicalrep_partmap_invalidate_cb(), and both have somewhat different kinds of logic. So, we can either name it logicalrep_partmap_reset_relmap() or logicalrep_partmap_update(), unless you have any better suggestions. Accordingly, change the comment atop this function.

--
With Regards,
Amit Kapila.
On Monday, June 13, 2022 1:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Jun 11, 2022 at 2:36 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Saturday, June 11, 2022 9:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Jun 10, 2022 at 2:26 PM Amit Langote
> > > <amitlangote09@gmail.com> wrote:
> > > >
> > > > +logicalrep_partmap_invalidate
> > > >
> > > > I wonder why not call this logicalrep_partmap_update() to go with
> > > > logicalrep_relmap_update()? It seems confusing to have
> > > > logicalrep_partmap_invalidate() right next to
> > > > logicalrep_partmap_invalidate_cb().
> > > >
> > >
> > > I am thinking about why we need to update the relmap in this new
> > > function logicalrep_partmap_invalidate()? I think it may be better
> > > to do it in logicalrep_partition_open() when actually required,
> > > otherwise, we end up doing a lot of work that may not be of use
> > > unless the corresponding partition is accessed. Also, it seems awkward
> > > to me that we do the same thing in this new function
> > > logicalrep_partmap_invalidate() and then also in
> > > logicalrep_partition_open() under different conditions.
> > >
> > > One more point about the 0001, it doesn't seem to have a test that
> > > validates logicalrep_partmap_invalidate_cb() functionality. I think
> > > for that we need to Alter the local table (table on the subscriber
> > > side). Can we try to write a test for it?
> >
> > Thanks for Amit. L and Amit. K for your comments ! I agree with this point.
> > Here is the version patch set which try to address all these comments.
> >
> > In addition, when reviewing the code, I found some other related
> > problems in the code.
> >
> > 1)
> > 	entry->attrmap = make_attrmap(map->maplen);
> > 	for (attno = 0; attno < entry->attrmap->maplen; attno++)
> > 	{
> > 		AttrNumber	root_attno = map->attnums[attno];
> >
> > 		entry->attrmap->attnums[attno] = attrmap->attnums[root_attno - 1];
> > 	}
> >
> > In this case, It’s possible that 'attno' points to a dropped column in
> > which case the root_attno would be '0'. I think in this case we should
> > just set the entry->attrmap->attnums[attno] to '-1' instead of accessing
> > the attrmap->attnums[]. I included this change in 0001 because the
> > testcase which can reproduce these problems are related (we need to ALTER
> > the partition on subscriber to reproduce it).
> >
>
> Hmm, this appears to be a different issue. Can we separate out the bug-fix
> code for the subscriber-side issue caused by the DDL on the subscriber?
>
> Few other comments:
> + * Note that we don't update the remoterel information in the entry here,
> + * we will update the information in logicalrep_partition_open to save
> + * unnecessary work.
> + */
> +void
> +logicalrep_partmap_invalidate(LogicalRepRelation *remoterel)
>
> /to save/to avoid
>
> Also, I agree with Amit L. that it is confusing to have
> logicalrep_partmap_invalidate() right next to
> logicalrep_partmap_invalidate_cb() and both have somewhat different kinds of
> logic. So, we can either name it as logicalrep_partmap_reset_relmap() or
> logicalrep_partmap_update() unless you have any other better suggestions?
> Accordingly, change the comment atop this function.

Thanks for the comments. I have separated out the bug-fix for the subscriber-side issue, and fixed the typo and function name.

Attached is the new version of the patch set.

Best regards,
Hou zj
Attachments
On Sat, Jun 11, 2022 at 10:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Jun 10, 2022 at 2:26 PM Amit Langote <amitlangote09@gmail.com> wrote:
> >
> > +logicalrep_partmap_invalidate
> >
> > I wonder why not call this logicalrep_partmap_update() to go with
> > logicalrep_relmap_update()? It seems confusing to have
> > logicalrep_partmap_invalidate() right next to
> > logicalrep_partmap_invalidate_cb().
> >
>
> I am thinking about why we need to update the relmap in this new
> function logicalrep_partmap_invalidate()? I think it may be better to
> do it in logicalrep_partition_open() when actually required,
> otherwise, we end up doing a lot of work that may not be of use unless
> the corresponding partition is accessed. Also, it seems awkward to me
> that we do the same thing in this new function
> logicalrep_partmap_invalidate() and then also in
> logicalrep_partition_open() under different conditions.

Both logicalrep_rel_open() and logicalrep_partition_open() only ever touch the local Relation, never the LogicalRepRelation. Updating the latter is the responsibility of logicalrep_relmap_update(), which is there to support handling of the RELATION message by apply_handle_relation(). Given that we make a separate copy of the parent's LogicalRepRelMapEntry for each partition to put into the corresponding LogicalRepPartMapEntry, those copies must be updated as well when a RELATION message targeting the parent's entry arrives. So it seems fine that the patch is making it logicalrep_relmap_update()'s responsibility to update the partition copies using the new logicalrep_partition_invalidate/update() subroutine.

--
Thanks,
Amit Langote
EDB: http://www.enterprisedb.com
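The copy-vs-pointer point above can be made concrete with a toy model (Python, illustrative only; the dict fields and function names are made-up stand-ins for the C structures). Because each partition entry holds its own copy of the remote-rel info, handling a RELATION message must sweep the partition map as well as the relation map:

```python
def relmap_update(relmap, partmap, new_remoterel):
    """Model of handling a RELATION message: refresh the parent's entry and
    reset every partition entry that holds a copy of the same remote rel.
    The reset copies are then refilled lazily on the next partition open."""
    rid = new_remoterel["remoteid"]
    relmap[rid] = {"remoterel": dict(new_remoterel), "localrelvalid": False}
    for entry in partmap.values():
        if entry["remoterel"] and entry["remoterel"]["remoteid"] == rid:
            entry["remoterel"] = None      # stale copy dropped
            entry["localrelvalid"] = False


relmap = {100: {"remoterel": {"remoteid": 100, "natts": 2}, "localrelvalid": True}}
partmap = {
    "child1": {"remoterel": {"remoteid": 100, "natts": 2}, "localrelvalid": True},
    "other":  {"remoterel": {"remoteid": 200, "natts": 1}, "localrelvalid": True},
}

relmap_update(relmap, partmap, {"remoteid": 100, "natts": 3})
print(partmap["child1"]["remoterel"])      # None -- stale copy invalidated
print(partmap["other"]["localrelvalid"])   # True -- unrelated entry untouched
```

Had the partition entries stored a pointer (shared reference) to the parent's remote-rel info instead of a copy, the sweep would be unnecessary, which is the alternative design discussed later in the thread.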
On Mon, Jun 13, 2022 at 2:20 PM Amit Langote <amitlangote09@gmail.com> wrote:
>
> On Sat, Jun 11, 2022 at 10:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Fri, Jun 10, 2022 at 2:26 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > >
> > > +logicalrep_partmap_invalidate
> > >
> > > I wonder why not call this logicalrep_partmap_update() to go with
> > > logicalrep_relmap_update()? It seems confusing to have
> > > logicalrep_partmap_invalidate() right next to
> > > logicalrep_partmap_invalidate_cb().
> > >
> >
> > I am thinking about why we need to update the relmap in this new
> > function logicalrep_partmap_invalidate()? I think it may be better to
> > do it in logicalrep_partition_open() when actually required,
> > otherwise, we end up doing a lot of work that may not be of use unless
> > the corresponding partition is accessed. Also, it seems awkward to me
> > that we do the same thing in this new function
> > logicalrep_partmap_invalidate() and then also in
> > logicalrep_partition_open() under different conditions.
>
> Both logicalrep_rel_open() and logicalrep_partition_open() only ever
> touch the local Relation, never the LogicalRepRelation.
>

We do make the copy of the remote rel in logicalrep_partition_open() when the entry is not found. I feel the same should happen when the remote relation is reset/invalidated by the RELATION message.

> Updating the
> latter is the responsibility of logicalrep_relmap_update(), which is
> there to support handling of the RELATION message by
> apply_handle_relation(). Given that we make a separate copy of the
> parent's LogicalRepRelMapEntry for each partition to put into the
> corresponding LogicalRepPartMapEntry, those copies must be updated as
> well when a RELATION message targeting the parent's entry arrives. So
> it seems fine that the patch is making it the
> logicalrep_relmap_update()'s responsibility to update the partition
> copies using the new logicalrep_partition_invalidate/update()
> subroutine.
>

I think we can do it that way as well, but do you see any benefit in it? The way I am suggesting will avoid the effort of updating the remote rel copy until we try to access that particular partition.

--
With Regards,
Amit Kapila.
On Mon, Jun 13, 2022 at 1:03 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
>
> On Monday, June 13, 2022 1:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I have separated out the bug-fix for the subscriber-side.
> And fix the typo and function name.
> Attach the new version patch set.
>

The first patch looks good to me. I have slightly modified one of the comments and the commit message. I think we need to backpatch this through 13, where we introduced support to replicate into partitioned tables (commit f1ac27bf). If you guys are fine, I'll push this once the work for PG14.4 is done.

--
With Regards,
Amit Kapila.
Attachments
On Mon, Jun 13, 2022 at 9:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Mon, Jun 13, 2022 at 1:03 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> > On Monday, June 13, 2022 1:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > I have separated out the bug-fix for the subscriber-side.
> > And fix the typo and function name.
> > Attach the new version patch set.
> >
>
> The first patch looks good to me. I have slightly modified one of the
> comments and the commit message. I think we need to backpatch this
> through 13 where we introduced support to replicate into partitioned
> tables (commit f1ac27bf). If you guys are fine, I'll push this once
> the work for PG14.4 is done.

Both the code changes and test cases look good to me. Just a couple of minor nitpicks with the test changes:

+	CREATE UNIQUE INDEX tab5_a_idx ON tab5 (a);
+	ALTER TABLE tab5 REPLICA IDENTITY USING INDEX tab5_a_idx;
+	ALTER TABLE tab5_1 REPLICA IDENTITY USING INDEX tab5_1_a_idx;});

Not sure if we follow it elsewhere, but should we maybe avoid using the internally generated index name as in the partition's case above?

+# Test the case that target table on subscriber is a partitioned table and
+# check that the changes are replicated correctly after changing the schema of
+# table on subcriber.

The first sentence makes it sound like the tests that follow are the first ones in the file where the target table is partitioned, which is not true, so I think we should drop that part. Also, how about being more specific about the test intent, say: Test that replication continues to work correctly after altering the partition of a partitioned target table.

--
Thanks,
Amit Langote
EDB: http://www.enterprisedb.com
On Mon, Jun 13, 2022 at 6:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Mon, Jun 13, 2022 at 2:20 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > On Sat, Jun 11, 2022 at 10:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > On Fri, Jun 10, 2022 at 2:26 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > > >
> > > > +logicalrep_partmap_invalidate
> > > >
> > > > I wonder why not call this logicalrep_partmap_update() to go with
> > > > logicalrep_relmap_update()? It seems confusing to have
> > > > logicalrep_partmap_invalidate() right next to
> > > > logicalrep_partmap_invalidate_cb().
> > > >
> > >
> > > I am thinking about why we need to update the relmap in this new
> > > function logicalrep_partmap_invalidate()? I think it may be better to
> > > do it in logicalrep_partition_open() when actually required,
> > > otherwise, we end up doing a lot of work that may not be of use unless
> > > the corresponding partition is accessed. Also, it seems awkward to me
> > > that we do the same thing in this new function
> > > logicalrep_partmap_invalidate() and then also in
> > > logicalrep_partition_open() under different conditions.
> >
> > Both logicalrep_rel_open() and logicalrel_partition_open() only ever
> > touch the local Relation, never the LogicalRepRelation.
>
> We do make the copy of remote rel in logicalrel_partition_open() when
> the entry is not found. I feel the same should happen when remote
> relation is reset/invalidated by the RELATION message.

Hmm, the problem is that a RELATION message will only invalidate the LogicalRepRelation portion of the target parent's entry, while any copies that have been made for partitions that were opened till that point will continue to have the old LogicalRepRelation information. As things stand, logicalrep_partition_open() won't know that the parent entry's LogicalRepRelation may have been modified due to a RELATION message.
It will reconstruct the entry only if the partition itself was modified locally, that is, if logicalrep_partmap_invalidate_cb() was called on the partition. > > Updating the > > latter is the responsibility of logicalrep_relmap_update(), which is > > there to support handling of the RELATION message by > > apply_handle_relation(). Given that we make a separate copy of the > > parent's LogicalRepRelMapEntry for each partition to put into the > > corresponding LogicalRepPartMapEntry, those copies must be updated as > > well when a RELATION message targeting the parent's entry arrives. So > > it seems fine that the patch is making it the > > logicalrep_relmap_update()'s responsibility to update the partition > > copies using the new logicalrep_partition_invalidate/update() > > subroutine. > > I think we can do that way as well but do you see any benefit in it? > The way I am suggesting will avoid the effort of updating the remote > rel copy till we try to access that particular partition. I don't see any benefit as such to doing it the way the patch does, it's just that that seems to be the only way to go given the way things are. This would have been unnecessary, for example, if the relation map entry had contained a LogicalRepRelation pointer instead of the struct. The partition entries would point to the same entry as the parent's if that were the case and there would be no need to modify the partitions' copies explicitly. Am I missing something? -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
On Tue, Jun 14, 2022 at 3:31 PM Amit Langote <amitlangote09@gmail.com> wrote: > On Mon, Jun 13, 2022 at 6:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I think we can do that way as well but do you see any benefit in it? > > The way I am suggesting will avoid the effort of updating the remote > > rel copy till we try to access that particular partition. > > I don't see any benefit as such to doing it the way the patch does, > it's just that that seems to be the only way to go given the way > things are. Oh, I see that v4-0002 has this: +/* + * Reset the entries in the partition map that refer to remoterel + * + * Called when new relation mapping is sent by the publisher to update our + * expected view of incoming data from said publisher. + * + * Note that we don't update the remoterel information in the entry here, + * we will update the information in logicalrep_partition_open to avoid + * unnecessary work. + */ +void +logicalrep_partmap_reset_relmap(LogicalRepRelation *remoterel) +{ + HASH_SEQ_STATUS status; + LogicalRepPartMapEntry *part_entry; + LogicalRepRelMapEntry *entry; + + if (LogicalRepPartMap == NULL) + return; + + hash_seq_init(&status, LogicalRepPartMap); + while ((part_entry = (LogicalRepPartMapEntry *) hash_seq_search(&status)) != NULL) + { + entry = &part_entry->relmapentry; + + if (entry->remoterel.remoteid != remoterel->remoteid) + continue; + + logicalrep_relmap_free_entry(entry); + + memset(entry, 0, sizeof(LogicalRepRelMapEntry)); + } +} The previous versions would also call logicalrep_relmap_update() on the entry after the memset, which is no longer done, so that is indeed saving useless work. 
Though, I thought maybe you were saying that we shouldn't need this function for resetting partitions in the first place, which I guess you weren't. v4-0002 looks good btw, except the nitpick about the test comment similar to my earlier comment regarding v5-0001: +# Change the column order of table on publisher I think it might be better to say something specific to describe the test intent, like: Test that replication into partitioned target table continues to work correctly when the published table is altered -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
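The staleness problem discussed above can be reduced to a small stand-alone sketch (the struct names and helper below are invented for illustration only; they are not PostgreSQL's real types or functions): each partition entry carries its own *copy* of the parent's remote-relation info, so rebuilding only the parent's entry on a RELATION message leaves the copies stale until something resets them — which is the job logicalrep_partmap_reset_relmap() performs in the patch.

```c
#include <string.h>

/* Invented stand-ins for LogicalRepRelation / LogicalRepPartMapEntry,
 * keeping only the fields needed for the illustration. */
typedef struct { unsigned remoteid; char attname[16]; } RemoteRel;
typedef struct { RemoteRel remoterel; int valid; } PartEntry;

/*
 * Sketch of what the patch's reset pass does: walk the per-partition
 * copies and zero out every one that refers to the remote relation
 * just rebuilt by a RELATION message.  The entry is then refilled
 * lazily on the next partition open, per the discussion above.
 * Returns how many copies were reset.
 */
static int
partmap_reset_copies(PartEntry *copies, int ncopies, unsigned remoteid)
{
    int nreset = 0;

    for (int i = 0; i < ncopies; i++)
    {
        if (copies[i].remoterel.remoteid != remoteid)
            continue;           /* copy of some other remote relation */

        memset(&copies[i], 0, sizeof(PartEntry));   /* valid becomes 0 */
        nreset++;
    }
    return nreset;
}
```

Without such a pass, a partition's copy of remoterel would keep describing the publisher's old schema even after the parent's entry was updated, which is exactly the stale-copy scenario being described.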
On Tue, Jun 14, 2022 2:18 PM Amit Langote <amitlangote09@gmail.com> wrote: > > On Mon, Jun 13, 2022 at 9:26 PM Amit Kapila <amit.kapila16@gmail.com> > wrote: > > On Mon, Jun 13, 2022 at 1:03 PM houzj.fnst@fujitsu.com > > <houzj.fnst@fujitsu.com> wrote: > > > On Monday, June 13, 2022 1:53 PM Amit Kapila > <amit.kapila16@gmail.com> wrote: > > > I have separated out the bug-fix for the subscriber-side. > > > And fix the typo and function name. > > > Attach the new version patch set. > > > > > > > The first patch looks good to me. I have slightly modified one of the > > comments and the commit message. I think we need to backpatch this > > through 13 where we introduced support to replicate into partitioned > > tables (commit f1ac27bf). If you guys are fine, I'll push this once > > the work for PG14.4 is done. > > Both the code changes and test cases look good to me. Just a couple > of minor nitpicks with test changes: Thanks for your comments. > > + CREATE UNIQUE INDEX tab5_a_idx ON tab5 (a); > + ALTER TABLE tab5 REPLICA IDENTITY USING INDEX tab5_a_idx; > + ALTER TABLE tab5_1 REPLICA IDENTITY USING INDEX tab5_1_a_idx;}); > > Not sure if we follow it elsewhere, but should we maybe avoid using > the internally generated index name as in the partition's case above? > I saw some existing tests also use internally generated index name (e.g. replica_identity.sql, ddl.sql and 031_column_list.pl), so maybe it's better to fix them all in a separate patch. I didn't change this. > +# Test the case that target table on subscriber is a partitioned table and > +# check that the changes are replicated correctly after changing the schema > of > +# table on subcriber. > > The first sentence makes it sound like the tests that follow are the > first ones in the file where the target table is partitioned, which is > not true, so I think we should drop that part. 
Also how about being > more specific about the test intent, say: > > Test that replication continues to work correctly after altering the > partition of a partitioned target table. > OK, modified. Attached the new version of the patch set, and the patches for pg14 and pg13. Regards, Shi yu
Attachments
- v6-0001-Fix-cache-look-up-failures-while-applying-changes.patch
- v6-0002-Reset-partition-map-cache-when-receiving-new-rela.patch
- v6-0003-Check-partition-table-replica-identity-on-subscri.patch
- v6-0004-fix-memory-leak-about-attrmap.patch
- v6-pg13-0001-Fix-cache-look-up-failures-while-applying-chang_patch
- v6-pg14-0001-Fix-cache-look-up-failures-while-applying-chang_patch
On Tue, Jun 14, 2022 at 1:02 PM Amit Langote <amitlangote09@gmail.com> wrote: > > On Tue, Jun 14, 2022 at 3:31 PM Amit Langote <amitlangote09@gmail.com> wrote: > > On Mon, Jun 13, 2022 at 6:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > I think we can do that way as well but do you see any benefit in it? > > > The way I am suggesting will avoid the effort of updating the remote > > > rel copy till we try to access that particular partition. > > > > I don't see any benefit as such to doing it the way the patch does, > > it's just that that seems to be the only way to go given the way > > things are. > > Oh, I see that v4-0002 has this: > > +/* > + * Reset the entries in the partition map that refer to remoterel > + * > + * Called when new relation mapping is sent by the publisher to update our > + * expected view of incoming data from said publisher. > + * > + * Note that we don't update the remoterel information in the entry here, > + * we will update the information in logicalrep_partition_open to avoid > + * unnecessary work. > + */ > +void > +logicalrep_partmap_reset_relmap(LogicalRepRelation *remoterel) > +{ > + HASH_SEQ_STATUS status; > + LogicalRepPartMapEntry *part_entry; > + LogicalRepRelMapEntry *entry; > + > + if (LogicalRepPartMap == NULL) > + return; > + > + hash_seq_init(&status, LogicalRepPartMap); > + while ((part_entry = (LogicalRepPartMapEntry *) > hash_seq_search(&status)) != NULL) > + { > + entry = &part_entry->relmapentry; > + > + if (entry->remoterel.remoteid != remoterel->remoteid) > + continue; > + > + logicalrep_relmap_free_entry(entry); > + > + memset(entry, 0, sizeof(LogicalRepRelMapEntry)); > + } > +} > > The previous versions would also call logicalrep_relmap_update() on > the entry after the memset, which is no longer done, so that is indeed > saving useless work. 
I also see that both logicalrep_relmap_update() > and the above function basically invalidate the whole > LogicalRepRelMapEntry before setting the new remote relation info so > that the next logicalrep_rel_open() or logicalrep_partition_open() > have to refill the other members too. > > Though, I thought maybe you were saying that we shouldn't need this > function for resetting partitions in the first place, which I guess > you weren't. > Right. > v4-0002 looks good btw, except the nitpick about the test comment similar > to my earlier comment regarding v5-0001: > > +# Change the column order of table on publisher > > I think it might be better to say something specific to describe the > test intent, like: > > Test that replication into partitioned target table continues to work > correctly when the published table is altered > Okay, changed this and slightly modified the comments and commit message. I am just attaching the HEAD patches for the first two issues. -- With Regards, Amit Kapila.
Attachments
On Tue, Jun 14, 2022 at 9:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > On Tue, Jun 14, 2022 at 1:02 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > +# Change the column order of table on publisher > > I think it might be better to say something specific to describe the > > test intent, like: > > > > Test that replication into partitioned target table continues to work > > correctly when the published table is altered > > Okay, changed this and slightly modified the comments and commit message. > I am just attaching the HEAD patches for the first two issues. LGTM, thanks. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
On Tue, Jun 14, 2022 8:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > v4-0002 looks good btw, except the nitpick about the test comment similar > > to my earlier comment regarding v5-0001: > > > > +# Change the column order of table on publisher > > > > I think it might be better to say something specific to describe the > > test intent, like: > > > > Test that replication into partitioned target table continues to work > > correctly when the published table is altered > > > > Okay, changed this and slightly modified the comments and commit message. > I am just attaching the HEAD patches for the first two issues. > Thanks for updating the patch. Attached the new patch set after running pgindent, and the patches for pg14 and pg13. (Only the first two patches of the patch set.) Regards, Shi yu
Attachments
- v8-pg14-0002-Fix-data-inconsistency-between-publisher-and-su_patch
- v8-pg14-0001-Fix-cache-look-up-failures-while-applying-chang_patch
- v8-pg13-0002-Fix-data-inconsistency-between-publisher-and-su_patch
- v8-pg13-0001-Fix-cache-look-up-failures-while-applying-chang_patch
- v8-0002-Fix-data-inconsistency-between-publisher-and-subs.patch
- v8-0001-Fix-cache-look-up-failures-while-applying-changes.patch
On Wed, Jun 15, 2022 at 8:52 AM shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com> wrote: > > On Tue, Jun 14, 2022 8:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > v4-0002 looks good btw, except the nitpick about the test comment similar > > > to my earlier comment regarding v5-0001: > > > > > > +# Change the column order of table on publisher > > > > > > I think it might be better to say something specific to describe the > > > test intent, like: > > > > > > Test that replication into partitioned target table continues to work > > > correctly when the published table is altered > > > > > > > Okay, changed this and slightly modified the comments and commit message. > > I am just attaching the HEAD patches for the first two issues. > > > > Thanks for updating the patch. > > Attached the new patch set after running pgindent, and the patches for pg14 and > pg13. (Only the first two patches of the patch set.) > I have pushed the first bug-fix patch today. -- With Regards, Amit Kapila.
On Wed, Jun 15, 2022 8:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I have pushed the first bug-fix patch today. > Thanks. Attached the remaining patches which are rebased. Regards, Shi yu
Attachments
Hi, On Thu, Jun 16, 2022 at 2:07 PM shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com> wrote: > On Wed, Jun 15, 2022 8:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I have pushed the first bug-fix patch today. > > Attached the remaining patches which are rebased. Thanks. Comments on v9-0001: + * Don't throw any error here just mark the relation entry as not updatable, + * as replica identity is only for updates and deletes but inserts can be + * replicated even without it. I know you're simply copying the old comment, but I think we can rewrite it to be slightly more useful: We just mark the relation entry as not updatable here if the local replica identity is found to be insufficient and leave it to check_relation_updatable() to throw the actual error if needed. + /* Check that replica identity matches. */ + logicalrep_rel_mark_updatable(entry); Maybe the comment (there are 2 instances) should say: Set if the table's replica identity is enough to apply update/delete. Finally, +# Alter REPLICA IDENTITY on subscriber. +# No REPLICA IDENTITY in the partitioned table on subscriber, but what we check +# is the partition, so it works fine. For consistency with other recently added comments, I'd suggest the following wording: Test that replication works correctly as long as the leaf partition has the necessary REPLICA IDENTITY, even though the actual target partitioned table does not. On v9-0002: + /* cleanup the invalid attrmap */ It seems that "invalid" here really means no-longer-useful, so we should use that phrase as a nearby comment does: Release the no-longer-useful attrmap, if any. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
On Thu, Jun 16, 2022 at 11:43 AM Amit Langote <amitlangote09@gmail.com> wrote: > > On Thu, Jun 16, 2022 at 2:07 PM shiy.fnst@fujitsu.com > <shiy.fnst@fujitsu.com> wrote: > > On Wed, Jun 15, 2022 8:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > I have pushed the first bug-fix patch today. > > > > Attached the remaining patches which are rebased. > > Thanks. > > Comments on v9-0001: > > + * Don't throw any error here just mark the relation entry as not updatable, > + * as replica identity is only for updates and deletes but inserts can be > + * replicated even without it. > > I know you're simply copying the old comment, but I think we can > rewrite it to be slightly more useful: > > We just mark the relation entry as not updatable here if the local > replica identity is found to be insufficient and leave it to > check_relation_updatable() to throw the actual error if needed. > I am fine with improving this comment but it would be better if in some way we keep the following part of the comment: "as replica identity is only for updates and deletes but inserts can be replicated even without it." as that makes it more clear why it is okay to just mark the entry as not updatable. One idea could be: "We just mark the relation entry as not updatable here if the local replica identity is found to be insufficient and leave it to check_relation_updatable() to throw the actual error if needed. This is because replica identity is only for updates and deletes but inserts can be replicated even without it.". Feel free to suggest if you have any better ideas? -- With Regards, Amit Kapila.
On Thu, Jun 16, 2022 at 3:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > On Thu, Jun 16, 2022 at 11:43 AM Amit Langote <amitlangote09@gmail.com> wrote: > > + * Don't throw any error here just mark the relation entry as not updatable, > > + * as replica identity is only for updates and deletes but inserts can be > > + * replicated even without it. > > > > I know you're simply copying the old comment, but I think we can > > rewrite it to be slightly more useful: > > > > We just mark the relation entry as not updatable here if the local > > replica identity is found to be insufficient and leave it to > > check_relation_updatable() to throw the actual error if needed. > > I am fine with improving this comment but it would be better if in > some way we keep the following part of the comment: "as replica > identity is only for updates and deletes but inserts can be replicated > even without it." as that makes it more clear why it is okay to just > mark the entry as not updatable. One idea could be: "We just mark the > relation entry as not updatable here if the local replica identity is > found to be insufficient and leave it to check_relation_updatable() to > throw the actual error if needed. This is because replica identity is > only for updates and deletes but inserts can be replicated even > without it.". Feel free to suggest if you have any better ideas? I thought mentioning check_relation_updatable() would make it clear that only updates (and deletes) care about a valid local replica identity, because only apply_handle_{update|delete}() call that function. Anyway, how about this: We just mark the relation entry as not updatable here if the local replica identity is found to be insufficient for applying updates/deletes (inserts don't care!) and leave it to check_relation_updatable() to throw the actual error if needed. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
On Thu, Jun 16, 2022 at 12:30 PM Amit Langote <amitlangote09@gmail.com> wrote: > > On Thu, Jun 16, 2022 at 3:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Jun 16, 2022 at 11:43 AM Amit Langote <amitlangote09@gmail.com> wrote: > > > + * Don't throw any error here just mark the relation entry as not updatable, > > > + * as replica identity is only for updates and deletes but inserts can be > > > + * replicated even without it. > > > > > > I know you're simply copying the old comment, but I think we can > > > rewrite it to be slightly more useful: > > > > > > We just mark the relation entry as not updatable here if the local > > > replica identity is found to be insufficient and leave it to > > > check_relation_updatable() to throw the actual error if needed. > > > > I am fine with improving this comment but it would be better if in > > some way we keep the following part of the comment: "as replica > > identity is only for updates and deletes but inserts can be replicated > > even without it." as that makes it more clear why it is okay to just > > mark the entry as not updatable. One idea could be: "We just mark the > > relation entry as not updatable here if the local replica identity is > > found to be insufficient and leave it to check_relation_updatable() to > > throw the actual error if needed. This is because replica identity is > > only for updates and deletes but inserts can be replicated even > > without it.". Feel free to suggest if you have any better ideas? > > I thought mentioning check_relation_updatable() would make it clear > that only updates (and deletes) care about a valid local replica > identity, because only apply_handle_{update|delete}() call that > function. Anyway, how about this: > > We just mark the relation entry as not updatable here if the local > replica identity is found to be insufficient for applying > updates/deletes (inserts don't care!) 
and leave it to > check_relation_updatable() to throw the actual error if needed. > This sounds better to me than the previous text. -- With Regards, Amit Kapila.
On Fri, Jun 10, 2022 at 2:26 PM Amit Langote <amitlangote09@gmail.com> wrote: > > @@ -1735,6 +1735,13 @@ apply_handle_insert_internal(ApplyExecutionData *edata, > static void > check_relation_updatable(LogicalRepRelMapEntry *rel) > { > + /* > + * If it is a partitioned table, we don't check it, we will check its > + * partition later. > + */ > + if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) > + return; > > Why do this? I mean why if logicalrep_check_updatable() doesn't care > if the relation is partitioned or not -- it does all the work > regardless. > > I suggest we don't add this check in check_relation_updatable(). > I think based on this suggestion the patch has moved this check to logicalrep_rel_mark_updatable(). For a partitioned table, it won't even validate whether it can mark updatable as false, which seems odd to me even though there might not be any bug due to that. Was your suggestion actually intended to move it to logicalrep_rel_mark_updatable? If so, why do you think that is a better place? I think it is important to have this check to avoid giving an error via check_relation_updatable() when partitioned tables don't have RI, but it is not clear which is the right place. I think check_relation_updatable() is a better place than logicalrep_rel_mark_updatable(), but maybe there is a reason why that is not a good idea. -- With Regards, Amit Kapila.
On Thu, Jun 16, 2022 at 6:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > On Fri, Jun 10, 2022 at 2:26 PM Amit Langote <amitlangote09@gmail.com> wrote: > > @@ -1735,6 +1735,13 @@ apply_handle_insert_internal(ApplyExecutionData *edata, > > static void > > check_relation_updatable(LogicalRepRelMapEntry *rel) > > { > > + /* > > + * If it is a partitioned table, we don't check it, we will check its > > + * partition later. > > + */ > > + if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) > > + return; > > > > Why do this? I mean why if logicalrep_check_updatable() doesn't care > > if the relation is partitioned or not -- it does all the work > > regardless. > > > > I suggest we don't add this check in check_relation_updatable(). > > I think based on this suggestion patch has moved this check to > logicalrep_rel_mark_updatable(). For a partitioned table, it won't > even validate whether it can mark updatable as false which seems odd > to me even though there might not be any bug due to that. Was your > suggestion actually intended to move it to > logicalrep_rel_mark_updatable? No, I didn't intend to suggest that we move this check to logicalrep_rel_mark_updatable(); didn't notice that that's what the latest patch did. What I said is that we shouldn't ignore the updatable flag for a partitioned table in check_relation_updatable(), because logicalrep_rel_mark_updatable() would have set the updatable flag correctly even for partitioned tables. IOW, we should not special-case partitioned tables anywhere. I guess the point of adding the check is to allow the case where a leaf partition's replica identity can be used to apply an update originally targeting its ancestor that doesn't itself have one. I wonder if it wouldn't be better to move the check_relation_updatable() call to apply_handle_{update|delete}_internal()? We know for sure that we only ever get there for leaf tables. If we do that, we won't need the relkind check. 
-- Thanks, Amit Langote EDB: http://www.enterprisedb.com
On Thu, Jun 16, 2022 at 5:24 PM Amit Langote <amitlangote09@gmail.com> wrote: > > On Thu, Jun 16, 2022 at 6:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Jun 10, 2022 at 2:26 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > @@ -1735,6 +1735,13 @@ apply_handle_insert_internal(ApplyExecutionData *edata, > > > static void > > > check_relation_updatable(LogicalRepRelMapEntry *rel) > > > { > > > + /* > > > + * If it is a partitioned table, we don't check it, we will check its > > > + * partition later. > > > + */ > > > + if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) > > > + return; > > > > > > Why do this? I mean why if logicalrep_check_updatable() doesn't care > > > if the relation is partitioned or not -- it does all the work > > > regardless. > > > > > > I suggest we don't add this check in check_relation_updatable(). > > > > I think based on this suggestion patch has moved this check to > > logicalrep_rel_mark_updatable(). For a partitioned table, it won't > > even validate whether it can mark updatable as false which seems odd > > to me even though there might not be any bug due to that. Was your > > suggestion actually intended to move it to > > logicalrep_rel_mark_updatable? > > No, I didn't intend to suggest that we move this check to > logicalrep_rel_mark_updatable(); didn't notice that that's what the > latest patch did. > > What I said is that we shouldn't ignore the updatable flag for a > partitioned table in check_relation_updatable(), because > logicalrep_rel_mark_updatable() would have set the updatable flag > correctly even for partitioned tables. IOW, we should not > special-case partitioned tables anywhere. > > I guess the point of adding the check is to allow the case where a > leaf partition's replica identity can be used to apply an update > originally targeting its ancestor that doesn't itself have one. 
> > I wonder if it wouldn't be better to move the > check_relation_updatable() call to > apply_handle_{update|delete}_internal()? We know for sure that we > only ever get there for leaf tables. If we do that, we won't need the > relkind check. > I think this won't work for updates via apply_handle_tuple_routing() unless we call it from some other place(s) as well. It will do FindReplTupleInLocalRel() before doing update/delete for CMD_UPDATE case and will lead to assertion failure. -- With Regards, Amit Kapila.
On Thu, Jun 16, 2022 at 9:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > On Thu, Jun 16, 2022 at 5:24 PM Amit Langote <amitlangote09@gmail.com> wrote: > > On Thu, Jun 16, 2022 at 6:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > On Fri, Jun 10, 2022 at 2:26 PM Amit Langote <amitlangote09@gmail.com> wrote: > > > > @@ -1735,6 +1735,13 @@ apply_handle_insert_internal(ApplyExecutionData *edata, > > > > static void > > > > check_relation_updatable(LogicalRepRelMapEntry *rel) > > > > { > > > > + /* > > > > + * If it is a partitioned table, we don't check it, we will check its > > > > + * partition later. > > > > + */ > > > > + if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) > > > > + return; > > > > > > > > Why do this? I mean why if logicalrep_check_updatable() doesn't care > > > > if the relation is partitioned or not -- it does all the work > > > > regardless. > > > > > > > > I suggest we don't add this check in check_relation_updatable(). > > > > > > I think based on this suggestion patch has moved this check to > > > logicalrep_rel_mark_updatable(). For a partitioned table, it won't > > > even validate whether it can mark updatable as false which seems odd > > > to me even though there might not be any bug due to that. Was your > > > suggestion actually intended to move it to > > > logicalrep_rel_mark_updatable? > > > > No, I didn't intend to suggest that we move this check to > > logicalrep_rel_mark_updatable(); didn't notice that that's what the > > latest patch did. > > > > What I said is that we shouldn't ignore the updatable flag for a > > partitioned table in check_relation_updatable(), because > > logicalrep_rel_mark_updatable() would have set the updatable flag > > correctly even for partitioned tables. IOW, we should not > > special-case partitioned tables anywhere. 
> > > > I guess the point of adding the check is to allow the case where a > > leaf partition's replica identity can be used to apply an update > > originally targeting its ancestor that doesn't itself have one. > > > > I wonder if it wouldn't be better to move the > > check_relation_updatable() call to > > apply_handle_{update|delete}_internal()? We know for sure that we > > only ever get there for leaf tables. If we do that, we won't need the > > relkind check. > > I think this won't work for updates via apply_handle_tuple_routing() > unless we call it from some other place(s) as well. It will do > FindReplTupleInLocalRel() before doing update/delete for CMD_UPDATE > case and will lead to assertion failure. You're right. I guess it's fine then to check the relkind in check_relation_updatable() the way the original patch did, even though it would've been nice if it didn't need to. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
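The consensus reached above — keep the relkind special case in check_relation_updatable() — can be modeled with a self-contained sketch (the enum, struct, and bool return below are invented stubs; the real function raises an error rather than returning a value): a partitioned parent is never rejected at this point, because, as discussed, it is the target leaf partition's own entry whose replica identity ultimately matters for applying the update or delete.

```c
#include <stdbool.h>

/* Invented stubs standing in for relkind and LogicalRepRelMapEntry. */
typedef enum { KIND_LEAF_TABLE, KIND_PARTITIONED_TABLE } RelKind;
typedef struct { RelKind relkind; bool updatable; } RelEntry;

/*
 * Sketch of check_relation_updatable()'s control flow: for partitioned
 * tables we only need to care whether the target leaf partition is
 * updatable, so the parent always "passes" here and the leaf's entry
 * is checked instead.  Returns false where the real code would raise
 * the "cannot update/delete" error.
 */
static bool
relation_update_allowed(const RelEntry *rel)
{
    if (rel->relkind == KIND_PARTITIONED_TABLE)
        return true;            /* decided later, on the leaf's entry */

    return rel->updatable;      /* set from the local replica identity */
}
```

This models the fix scenario from the start of the thread: a parent without a usable replica identity is not an error by itself, but a leaf partition without one still correctly blocks updates and deletes.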
On Thu, Jun 16, 2022 2:13 PM Amit Langote <amitlangote09@gmail.com> wrote: > > Hi, > > On Thu, Jun 16, 2022 at 2:07 PM shiy.fnst@fujitsu.com > <shiy.fnst@fujitsu.com> wrote: > > On Wed, Jun 15, 2022 8:30 PM Amit Kapila <amit.kapila16@gmail.com> > wrote: > > > I have pushed the first bug-fix patch today. > > > > Attached the remaining patches which are rebased. > > Thanks. > > Comments on v9-0001: Thanks for your comments. > > + * Don't throw any error here just mark the relation entry as not updatable, > + * as replica identity is only for updates and deletes but inserts can be > + * replicated even without it. > > I know you're simply copying the old comment, but I think we can > rewrite it to be slightly more useful: > > We just mark the relation entry as not updatable here if the local > replica identity is found to be insufficient and leave it to > check_relation_updatable() to throw the actual error if needed. > Modified as you suggested in another mail [1]. > + /* Check that replica identity matches. */ > + logicalrep_rel_mark_updatable(entry); > > Maybe the comment (there are 2 instances) should say: > > Set if the table's replica identity is enough to apply update/delete. > Modified as suggested. > Finally, > > +# Alter REPLICA IDENTITY on subscriber. > +# No REPLICA IDENTITY in the partitioned table on subscriber, but what we > check > +# is the partition, so it works fine. > > For consistency with other recently added comments, I'd suggest the > following wording: > > Test that replication works correctly as long as the leaf partition > has the necessary REPLICA IDENTITY, even though the actual target > partitioned table does not. > Modified as suggested. > On v9-0002: > > + /* cleanup the invalid attrmap */ > > It seems that "invalid" here really means no-longer-useful, so we > should use that phrase as a nearby comment does: > > Release the no-longer-useful attrmap, if any. > Modified as suggested. Attached the new version of patch set. 
I also moved the partitioned table check in logicalrep_rel_mark_updatable() to check_relation_updatable() as discussed [2]. [1] https://www.postgresql.org/message-id/CA%2BHiwqG3Xi%3DwH4rBHm61ku-j0gm%2B-rc5VmDHxf%3DTeFkUsHtooA%40mail.gmail.com [2] https://www.postgresql.org/message-id/CA%2BHiwqHfN789ekiYVE%2B0xsLswMosMrWBwv4cPvYgWREWejw7HA%40mail.gmail.com Regards, Shi yu
Attachments
On Fri Jun 17, 2022 11:06 AM shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com> wrote: > > Attached the new version of patch set. I also moved the partitioned table > check > in logicalrep_rel_mark_updatable() to check_relation_updatable() as > discussed > [2]. > Attached back-branch patches of the first patch. Regards, Shi yu
Attachments
On Fri, Jun 17, 2022 at 11:22 AM shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com> wrote: > > On Fri Jun 17, 2022 11:06 AM shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com> wrote: > > > > Attached the new version of patch set. I also moved the partitioned table > > check > > in logicalrep_rel_mark_updatable() to check_relation_updatable() as > > discussed > > [2]. > > > > Attached back-branch patches of the first patch. > One minor comment: + /* + * If it is a partitioned table, we don't check it, we will check its + * partition later. + */ Can we change the above comment to: "For partitioned tables, we only need to care if the target partition is updatable (aka has PK or RI defined for it)."? Apart from this looks good to me. I'll push this tomorrow unless there are any more suggestions/comments. -- With Regards, Amit Kapila.
On Mon, Jun 20, 2022 1:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Jun 17, 2022 at 11:22 AM shiy.fnst@fujitsu.com > <shiy.fnst@fujitsu.com> wrote: > > > > On Fri Jun 17, 2022 11:06 AM shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com> > wrote: > > > > > > Attached the new version of patch set. I also moved the partitioned table > > > check > > > in logicalrep_rel_mark_updatable() to check_relation_updatable() as > > > discussed > > > [2]. > > > > > > > Attached back-branch patches of the first patch. > > > > One minor comment: > + /* > + * If it is a partitioned table, we don't check it, we will check its > + * partition later. > + */ > > Can we change the above comment to: "For partitioned tables, we only > need to care if the target partition is updatable (aka has PK or RI > defined for it)."? > Thanks for your comment. Modified in the attached patches. Regards, Shi yu
Attachments
On Mon, Jun 20, 2022 at 3:46 PM shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com> wrote:
> On Mon, Jun 20, 2022 1:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > One minor comment:
> > +	/*
> > +	 * If it is a partitioned table, we don't check it, we will check its
> > +	 * partition later.
> > +	 */
> >
> > Can we change the above comment to: "For partitioned tables, we only
> > need to care if the target partition is updatable (aka has PK or RI
> > defined for it)."?
> >
> Thanks for your comment. Modified in the attached patches.

How about: ...target "leaf" partition is updatable

Regarding the commit message's top line, which is this:

    Fix partition table's RI checking on the subscriber.

I think it should spell out REPLICA IDENTITY explicitly to avoid the commit
being confused to have to do with "Referential Integrity checking".

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
On Tue, Jun 21, 2022 at 7:49 AM Amit Langote <amitlangote09@gmail.com> wrote:
>
> On Mon, Jun 20, 2022 at 3:46 PM shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com> wrote:
> > On Mon, Jun 20, 2022 1:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > One minor comment:
> > > +	/*
> > > +	 * If it is a partitioned table, we don't check it, we will check its
> > > +	 * partition later.
> > > +	 */
> > >
> > > Can we change the above comment to: "For partitioned tables, we only
> > > need to care if the target partition is updatable (aka has PK or RI
> > > defined for it)."?
> > >
> > Thanks for your comment. Modified in the attached patches.
>
> How about: ...target "leaf" partition is updatable
>

I am not very sure if this is an improvement over the current wording.

> Regarding the commit message's top line, which is this:
>
>     Fix partition table's RI checking on the subscriber.
>
> I think it should spell out REPLICA IDENTITY explicitly to avoid the
> commit being confused to have to do with "Referential Integrity
> checking".
>

This makes sense. I'll take care of this.

--
With Regards,
Amit Kapila.
On Tue, Jun 21, 2022 at 8:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jun 21, 2022 at 7:49 AM Amit Langote <amitlangote09@gmail.com> wrote:
> >
> > I think it should spell out REPLICA IDENTITY explicitly to avoid the
> > commit being confused to have to do with "Referential Integrity
> > checking".
> >
>
> This makes sense. I'll take care of this.
>

After pushing this patch, buildfarm member prion has failed.
https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=prion&br=HEAD

It seems to me that the problem could be that the entry returned by
logicalrep_partition_open() may not have the correct value for localrel when
we found the entry and localrelvalid is also true. The point is that before
this commit we never used the localrel value from the rel entry returned by
logicalrep_partition_open(). I think we need to always update the localrel
value in logicalrep_partition_open().

--
With Regards,
Amit Kapila.
On Tuesday, June 21, 2022 1:29 PM Amit Kapila <amit.kapila16@gmail.com>:
>
> After pushing this patch, buildfarm member prion has failed.
> https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=prion&br=HEAD
>
> It seems to me that the problem could be that the entry returned by
> logicalrep_partition_open() may not have the correct value for localrel
> when we found the entry and localrelvalid is also true. The point is that
> before this commit we never used the localrel value from the rel entry
> returned by logicalrep_partition_open(). I think we need to always update
> the localrel value in logicalrep_partition_open().

Agreed.

And I have confirmed that the failure is due to a segmentation violation when
accessing the cached relation. I reproduced this by building with
-DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE, as hinted by Tom.

Stack:
#0  check_relation_updatable (rel=0x1cf4548) at worker.c:1745
#1  0x0000000000909cbb in apply_handle_tuple_routing (edata=0x1cbf4e8, remoteslot=0x1cbf908, newtup=0x0, operation=CMD_DELETE) at worker.c:2181
#2  0x00000000009097a5 in apply_handle_delete (s=0x7ffcef7fd730) at worker.c:2005
#3  0x000000000090a794 in apply_dispatch (s=0x7ffcef7fd730) at worker.c:2503
#4  0x000000000090ad43 in LogicalRepApplyLoop (last_received=22299920) at worker.c:2775
#5  0x000000000090c2ab in start_apply (origin_startpos=0) at worker.c:3549
#6  0x000000000090ca8d in ApplyWorkerMain (main_arg=0) at worker.c:3805
#7  0x00000000008c4c64 in StartBackgroundWorker () at bgworker.c:858
#8  0x00000000008ceaeb in do_start_bgworker (rw=0x1c3c6b0) at postmaster.c:5815
#9  0x00000000008cee97 in maybe_start_bgworkers () at postmaster.c:6039
#10 0x00000000008cdf4e in sigusr1_handler (postgres_signal_arg=10) at postmaster.c:5204
#11 <signal handler called>
#12 0x00007fd8fbe0d4ab in select () from /lib64/libc.so.6
#13 0x00000000008c9cfb in ServerLoop () at postmaster.c:1770
#14 0x00000000008c96e4 in PostmasterMain (argc=4, argv=0x1c110a0) at postmaster.c:1478
#15 0x00000000007c665b in main (argc=4, argv=0x1c110a0) at main.c:202
(gdb) p rel->localrel->rd_rel
$5 = (Form_pg_class) 0x7f7f7f7f7f7f7f7f

We didn't hit this problem before because we only accessed that relation when
we planned to report an error[1], in which case the worker restarts and the
cache is rebuilt, so everything seemed OK.

The problem seems to have already existed; we hit it now because we started
to access the cached relation in more places.

I think we should update the relation every time, as the relation is opened
and closed by the caller, and here is a patch to do that.

[1]
	/*
	 * We are in error mode so it's fine this is somewhat slow. It's better to
	 * give user correct error.
	 */
	if (OidIsValid(GetRelationIdentityOrPK(rel->localrel)))

Best regards,
Hou zj
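[Editor's note: the stale-pointer hazard discussed in this thread can be illustrated with a small standalone sketch. All names below are hypothetical, not PostgreSQL code; it only mimics the pattern: a map entry caches a pointer to a resource the caller opens and closes, while the entry's validity flag covers derived metadata (it is cleared by an invalidation callback), not the pointer itself.]

```c
#include <assert.h>
#include <stddef.h>

typedef struct Resource { int id; } Resource;

typedef struct MapEntry
{
	Resource   *localres;	/* caller-managed pointer; may go stale */
	int			valid;		/* derived metadata is up to date */
} MapEntry;

/* Buggy variant: trusts the cached pointer whenever the entry is valid. */
static Resource *
entry_open_buggy(MapEntry *entry, Resource *freshly_opened)
{
	if (entry->valid)
		return entry->localres;		/* may be dangling! */
	entry->localres = freshly_opened;
	entry->valid = 1;
	return entry->localres;
}

/* Fixed variant: unconditionally refresh the pointer on every open. */
static Resource *
entry_open_fixed(MapEntry *entry, Resource *freshly_opened)
{
	entry->localres = freshly_opened;	/* always overwrite */
	if (entry->valid)
		return entry->localres;
	/* ... only here rebuild the derived metadata from freshly_opened ... */
	entry->valid = 1;
	return entry->localres;
}
```

Because the validity flag is never cleared when the caller merely closes the resource, only the unconditional overwrite keeps the cached pointer trustworthy.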
Attachments
On Tue, Jun 21, 2022 at 3:35 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> On Tuesday, June 21, 2022 1:29 PM Amit Kapila <amit.kapila16@gmail.com>:
> > After pushing this patch, buildfarm member prion has failed.
> > https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=prion&br=HEAD
> >
> > It seems to me that the problem could be that the entry returned by
> > logicalrep_partition_open() may not have the correct value for localrel
> > when we found the entry and localrelvalid is also true. The point is that
> > before this commit we never used the localrel value from the rel entry
> > returned by logicalrep_partition_open(). I think we need to always update
> > the localrel value in logicalrep_partition_open().
>
> Agreed.
>
> And I have confirmed that the failure is due to a segmentation violation
> when accessing the cached relation. I reproduced this by building with
> -DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE, as hinted by Tom.
>
> Stack:
> #0  check_relation_updatable (rel=0x1cf4548) at worker.c:1745
> #1  0x0000000000909cbb in apply_handle_tuple_routing (edata=0x1cbf4e8, remoteslot=0x1cbf908, newtup=0x0, operation=CMD_DELETE) at worker.c:2181
> #2  0x00000000009097a5 in apply_handle_delete (s=0x7ffcef7fd730) at worker.c:2005
> #3  0x000000000090a794 in apply_dispatch (s=0x7ffcef7fd730) at worker.c:2503
> #4  0x000000000090ad43 in LogicalRepApplyLoop (last_received=22299920) at worker.c:2775
> #5  0x000000000090c2ab in start_apply (origin_startpos=0) at worker.c:3549
> #6  0x000000000090ca8d in ApplyWorkerMain (main_arg=0) at worker.c:3805
> #7  0x00000000008c4c64 in StartBackgroundWorker () at bgworker.c:858
> #8  0x00000000008ceaeb in do_start_bgworker (rw=0x1c3c6b0) at postmaster.c:5815
> #9  0x00000000008cee97 in maybe_start_bgworkers () at postmaster.c:6039
> #10 0x00000000008cdf4e in sigusr1_handler (postgres_signal_arg=10) at postmaster.c:5204
> #11 <signal handler called>
> #12 0x00007fd8fbe0d4ab in select () from /lib64/libc.so.6
> #13 0x00000000008c9cfb in ServerLoop () at postmaster.c:1770
> #14 0x00000000008c96e4 in PostmasterMain (argc=4, argv=0x1c110a0) at postmaster.c:1478
> #15 0x00000000007c665b in main (argc=4, argv=0x1c110a0) at main.c:202
> (gdb) p rel->localrel->rd_rel
> $5 = (Form_pg_class) 0x7f7f7f7f7f7f7f7f
>
> We didn't hit this problem before because we only accessed that relation
> when we planned to report an error[1], in which case the worker restarts
> and the cache is rebuilt, so everything seemed OK.
>
> The problem seems to have already existed; we hit it now because we started
> to access the cached relation in more places.
>
> I think we should update the relation every time, as the relation is opened
> and closed by the caller, and here is a patch to do that.

Thanks for the patch.

I agree it's an old bug. A partition map entry's localrel may point to a
stale Relation pointer, because once the caller has closed the relation, the
relcache subsystem is free to "clear" it, as in the case of a
RELCACHE_FORCE_RELEASE build.

Fixing it the way the patch does seems fine, though it feels like
localrelvalid will lose some of its meaning for the partition map entries --
we will now overwrite localrel even if localrelvalid is true.

+	/*
+	 * Relation is opened and closed by caller, so we need to always update the
+	 * partrel in case the cached relation was closed.
+	 */
+	entry->localrel = partrel;
+
 	if (entry->localrelvalid)
 		return entry;

Maybe we should add a comment here about why it's okay to overwrite localrel
even if localrelvalid is true. How about the following hunk:

@@ -596,8 +596,20 @@ logicalrep_partition_open(LogicalRepRelMapEntry *root,

 	entry = &part_entry->relmapentry;

+	/*
+	 * We must always overwrite entry->localrel with the latest partition
+	 * Relation pointer, because the Relation pointed to by the old value may
+	 * have been cleared after the caller would have closed the partition
+	 * relation after the last use of this entry. Note that localrelvalid is
+	 * only updated by the relcache invalidation callback, so it may still be
+	 * true irrespective of whether the Relation pointed to by localrel has
+	 * been cleared or not.
+	 */
 	if (found && entry->localrelvalid)
+	{
+		entry->localrel = partrel;
 		return entry;
+	}

Attached a patch containing the above to consider as an alternative.

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Attachments
On Tuesday, June 21, 2022 3:21 PM Amit Langote <amitlangote09@gmail.com> wrote:
>
> On Tue, Jun 21, 2022 at 3:35 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> > On Tuesday, June 21, 2022 1:29 PM Amit Kapila <amit.kapila16@gmail.com>:
> > > After pushing this patch, buildfarm member prion has failed.
> > > https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=prion&br=HEAD
> > >
> > > It seems to me that the problem could be that the entry returned by
> > > logicalrep_partition_open() may not have the correct value for localrel
> > > when we found the entry and localrelvalid is also true. The point is
> > > that before this commit we never used the localrel value from the rel
> > > entry returned by logicalrep_partition_open(). I think we need to
> > > always update the localrel value in logicalrep_partition_open().
> >
> > Agreed.
> >
> > And I have confirmed that the failure is due to a segmentation violation
> > when accessing the cached relation. I reproduced this by building with
> > -DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE, as hinted by Tom.
> >
> > Stack:
> > #0  check_relation_updatable (rel=0x1cf4548) at worker.c:1745
> > #1  0x0000000000909cbb in apply_handle_tuple_routing (edata=0x1cbf4e8, remoteslot=0x1cbf908, newtup=0x0, operation=CMD_DELETE) at worker.c:2181
> > #2  0x00000000009097a5 in apply_handle_delete (s=0x7ffcef7fd730) at worker.c:2005
> > #3  0x000000000090a794 in apply_dispatch (s=0x7ffcef7fd730) at worker.c:2503
> > #4  0x000000000090ad43 in LogicalRepApplyLoop (last_received=22299920) at worker.c:2775
> > #5  0x000000000090c2ab in start_apply (origin_startpos=0) at worker.c:3549
> > #6  0x000000000090ca8d in ApplyWorkerMain (main_arg=0) at worker.c:3805
> > #7  0x00000000008c4c64 in StartBackgroundWorker () at bgworker.c:858
> > #8  0x00000000008ceaeb in do_start_bgworker (rw=0x1c3c6b0) at postmaster.c:5815
> > #9  0x00000000008cee97 in maybe_start_bgworkers () at postmaster.c:6039
> > #10 0x00000000008cdf4e in sigusr1_handler (postgres_signal_arg=10) at postmaster.c:5204
> > #11 <signal handler called>
> > #12 0x00007fd8fbe0d4ab in select () from /lib64/libc.so.6
> > #13 0x00000000008c9cfb in ServerLoop () at postmaster.c:1770
> > #14 0x00000000008c96e4 in PostmasterMain (argc=4, argv=0x1c110a0) at postmaster.c:1478
> > #15 0x00000000007c665b in main (argc=4, argv=0x1c110a0) at main.c:202
> > (gdb) p rel->localrel->rd_rel
> > $5 = (Form_pg_class) 0x7f7f7f7f7f7f7f7f
> >
> > We didn't hit this problem before because we only accessed that relation
> > when we planned to report an error[1], in which case the worker restarts
> > and the cache is rebuilt, so everything seemed OK.
> >
> > The problem seems to have already existed; we hit it now because we
> > started to access the cached relation in more places.
> >
> > I think we should update the relation every time, as the relation is
> > opened and closed by the caller, and here is a patch to do that.
>
> Thanks for the patch.
>
> I agree it's an old bug. A partition map entry's localrel may point to a
> stale Relation pointer, because once the caller has closed the relation,
> the relcache subsystem is free to "clear" it, as in the case of a
> RELCACHE_FORCE_RELEASE build.

Hi,

Thanks for replying.

> Fixing it the way the patch does seems fine, though it feels like
> localrelvalid will lose some of its meaning for the partition map entries
> -- we will now overwrite localrel even if localrelvalid is true.

To me, it seems localrelvalid doesn't mean that the cached relation pointer
is valid. In logicalrep_rel_open(), we also reopen and update the relation
even if localrelvalid is true.

> +	/*
> +	 * Relation is opened and closed by caller, so we need to always update the
> +	 * partrel in case the cached relation was closed.
> +	 */
> +	entry->localrel = partrel;
> +
>  	if (entry->localrelvalid)
>  		return entry;
>
> Maybe we should add a comment here about why it's okay to overwrite
> localrel even if localrelvalid is true. How about the following hunk:
>
> @@ -596,8 +596,20 @@ logicalrep_partition_open(LogicalRepRelMapEntry *root,
>
>  	entry = &part_entry->relmapentry;
>
> +	/*
> +	 * We must always overwrite entry->localrel with the latest partition
> +	 * Relation pointer, because the Relation pointed to by the old value may
> +	 * have been cleared after the caller would have closed the partition
> +	 * relation after the last use of this entry. Note that localrelvalid is
> +	 * only updated by the relcache invalidation callback, so it may still be
> +	 * true irrespective of whether the Relation pointed to by localrel has
> +	 * been cleared or not.
> +	 */
>  	if (found && entry->localrelvalid)
> +	{
> +		entry->localrel = partrel;
>  		return entry;
> +	}
>
> Attached a patch containing the above to consider as an alternative.

This looks fine to me as well.

Best regards,
Hou zj
On Tue, Jun 21, 2022 at 12:50 PM Amit Langote <amitlangote09@gmail.com> wrote:
>
> On Tue, Jun 21, 2022 at 3:35 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
>
> Attached a patch containing the above to consider as an alternative.
>

Thanks, the patch looks good to me. I'll push this after doing some testing.

--
With Regards,
Amit Kapila.
On Tuesday, June 21, 2022 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jun 21, 2022 at 12:50 PM Amit Langote <amitlangote09@gmail.com> wrote:
> >
> > On Tue, Jun 21, 2022 at 3:35 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> >
> > Attached a patch containing the above to consider as an alternative.
> >
>
> Thanks, the patch looks good to me. I'll push this after doing some testing.

The patch looks good to me as well.

I also verified that the patch applies cleanly on the back branches, and I
confirmed that the bug exists on the back branches before this patch and is
fixed after applying it. The regression tests also passed with and without
the RELCACHE_FORCE_RELEASE option on my machine.

Regards,
Shi yu
On Tue, Jun 21, 2022 at 5:08 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> On Tuesday, June 21, 2022 3:21 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > Thanks for the patch.
> >
> > I agree it's an old bug. A partition map entry's localrel may point to a
> > stale Relation pointer, because once the caller has closed the relation,
> > the relcache subsystem is free to "clear" it, as in the case of a
> > RELCACHE_FORCE_RELEASE build.
>
> Hi,
>
> Thanks for replying.
>
> > Fixing it the way the patch does seems fine, though it feels like
> > localrelvalid will lose some of its meaning for the partition map entries
> > -- we will now overwrite localrel even if localrelvalid is true.
>
> To me, it seems localrelvalid doesn't mean that the cached relation pointer
> is valid. In logicalrep_rel_open(), we also reopen and update the relation
> even if localrelvalid is true.

Ah, right. I guess only the localrelvalid=false case is really interesting
then. Only in that case do we need to (re-)build the other fields that are
computed using localrel. In the localrelvalid=true case, we don't need to
worry about the other fields, but we still need to make sure that localrel
points to an up-to-date relcache entry for the relation.

> > +	/*
> > +	 * Relation is opened and closed by caller, so we need to always update the
> > +	 * partrel in case the cached relation was closed.
> > +	 */
> > +	entry->localrel = partrel;
> > +
> >  	if (entry->localrelvalid)
> >  		return entry;
> >
> > Maybe we should add a comment here about why it's okay to overwrite
> > localrel even if localrelvalid is true. How about the following hunk:
> >
> > @@ -596,8 +596,20 @@ logicalrep_partition_open(LogicalRepRelMapEntry *root,
> >
> >  	entry = &part_entry->relmapentry;
> >
> > +	/*
> > +	 * We must always overwrite entry->localrel with the latest partition
> > +	 * Relation pointer, because the Relation pointed to by the old value may
> > +	 * have been cleared after the caller would have closed the partition
> > +	 * relation after the last use of this entry. Note that localrelvalid is
> > +	 * only updated by the relcache invalidation callback, so it may still be
> > +	 * true irrespective of whether the Relation pointed to by localrel has
> > +	 * been cleared or not.
> > +	 */
> >  	if (found && entry->localrelvalid)
> > +	{
> > +		entry->localrel = partrel;
> >  		return entry;
> > +	}
> >
> > Attached a patch containing the above to consider as an alternative.
>
> This looks fine to me as well.

Thank you.

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
On Tuesday, June 21, 2022 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jun 21, 2022 at 12:50 PM Amit Langote <amitlangote09@gmail.com> wrote:
> >
> > On Tue, Jun 21, 2022 at 3:35 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> >
> > Attached a patch containing the above to consider as an alternative.
> >
>
> Thanks, the patch looks good to me. I'll push this after doing some testing.

Since the patch has been committed, attached is the last patch, which fixes
the memory leak.

The bug exists on PG10 ~ PG15 (HEAD).

For HEAD, PG14, and PG13, to fix the memory leak, I think we should use
free_attrmap() instead of pfree(), and release the no-longer-useful attrmap
when rebuilding the map info.

For PG12, PG11, and PG10, we only need to add the code to release the
no-longer-useful attrmap when rebuilding the map info. We can still use
pfree() because the attrmap in the back branches is a single array like:

	entry->attrmap = palloc(desc->natts * sizeof(AttrNumber));

Best regards,
Hou zj
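[Editor's note: the leak pattern being fixed here can be sketched in standalone C. The names are hypothetical and malloc/free stand in for palloc/pfree (on HEAD the real code must use free_attrmap() since the map is an AttrMap struct, not a bare array); the point is simply that rebuilding a cached map must release the previous allocation first.]

```c
#include <assert.h>
#include <stdlib.h>

typedef int AttrNumber;

typedef struct MapEntry
{
	AttrNumber *attrmap;	/* cached column mapping, rebuilt on invalidation */
	int			natts;
} MapEntry;

static void
rebuild_attrmap(MapEntry *entry, int new_natts)
{
	/*
	 * The fix: release the no-longer-useful map before building a new one.
	 * Without this, every cache invalidation leaks one array.
	 */
	if (entry->attrmap)
	{
		free(entry->attrmap);	/* stands in for pfree()/free_attrmap() */
		entry->attrmap = NULL;
	}

	entry->attrmap = calloc(new_natts, sizeof(AttrNumber));
	entry->natts = new_natts;
}
```

Since the apply worker is long-lived and invalidations can arrive repeatedly, even a small per-rebuild leak accumulates, which is why backpatching was considered worthwhile.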
Attachments
Hi,

On Wed, Jun 22, 2022 at 12:02 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> Since the patch has been committed, attached is the last patch, which fixes
> the memory leak.
>
> The bug exists on PG10 ~ PG15 (HEAD).
>
> For HEAD, PG14, and PG13, to fix the memory leak, I think we should use
> free_attrmap() instead of pfree(), and release the no-longer-useful attrmap
> when rebuilding the map info.
>
> For PG12, PG11, and PG10, we only need to add the code to release the
> no-longer-useful attrmap when rebuilding the map info. We can still use
> pfree() because the attrmap in the back branches is a single array like:
>
> 	entry->attrmap = palloc(desc->natts * sizeof(AttrNumber));

LGTM, thank you.

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
On Wed, Jun 22, 2022 at 10:09 AM Amit Langote <amitlangote09@gmail.com> wrote:
>
> On Wed, Jun 22, 2022 at 12:02 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> > Since the patch has been committed, attached is the last patch, which
> > fixes the memory leak.
> >
> > The bug exists on PG10 ~ PG15 (HEAD).
> >
> > For HEAD, PG14, and PG13, to fix the memory leak, I think we should use
> > free_attrmap() instead of pfree(), and release the no-longer-useful
> > attrmap when rebuilding the map info.
> >
> > For PG12, PG11, and PG10, we only need to add the code to release the
> > no-longer-useful attrmap when rebuilding the map info. We can still use
> > pfree() because the attrmap in the back branches is a single array like:
> >
> > 	entry->attrmap = palloc(desc->natts * sizeof(AttrNumber));
>
> LGTM, thank you.
>

LGTM as well. One thing I am not completely sure about is whether to make
this change in PG10, for which the final release is in Nov? AFAICS, the leak
can only occur after a relcache invalidation on the subscriber, which may or
may not be a very frequent case. What do you guys think?

Personally, I feel it is good to fix it in all branches including PG10.

--
With Regards,
Amit Kapila.
On Wed, Jun 22, 2022 at 8:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Jun 22, 2022 at 10:09 AM Amit Langote <amitlangote09@gmail.com> wrote:
> > On Wed, Jun 22, 2022 at 12:02 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> > > Since the patch has been committed, attached is the last patch, which
> > > fixes the memory leak.
> > >
> > > The bug exists on PG10 ~ PG15 (HEAD).
> > >
> > > For HEAD, PG14, and PG13, to fix the memory leak, I think we should use
> > > free_attrmap() instead of pfree(), and release the no-longer-useful
> > > attrmap when rebuilding the map info.
> > >
> > > For PG12, PG11, and PG10, we only need to add the code to release the
> > > no-longer-useful attrmap when rebuilding the map info. We can still use
> > > pfree() because the attrmap in the back branches is a single array like:
> > >
> > > 	entry->attrmap = palloc(desc->natts * sizeof(AttrNumber));
> >
> > LGTM, thank you.
>
> LGTM as well. One thing I am not completely sure about is whether to make
> this change in PG10, for which the final release is in Nov? AFAICS, the
> leak can only occur after a relcache invalidation on the subscriber, which
> may or may not be a very frequent case. What do you guys think?

Agree that the leak does not seem very significant, though...

> Personally, I feel it is good to fix it in all branches including PG10.

...yes, why not.

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
On Wednesday, June 22, 2022 7:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jun 22, 2022 at 10:09 AM Amit Langote <amitlangote09@gmail.com> wrote:
> >
> > On Wed, Jun 22, 2022 at 12:02 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> > > Since the patch has been committed, attached is the last patch, which
> > > fixes the memory leak.
> > >
> > > The bug exists on PG10 ~ PG15 (HEAD).
> > >
> > > For HEAD, PG14, and PG13, to fix the memory leak, I think we should use
> > > free_attrmap() instead of pfree(), and release the no-longer-useful
> > > attrmap when rebuilding the map info.
> > >
> > > For PG12, PG11, and PG10, we only need to add the code to release the
> > > no-longer-useful attrmap when rebuilding the map info. We can still use
> > > pfree() because the attrmap in the back branches is a single array like:
> > >
> > > 	entry->attrmap = palloc(desc->natts * sizeof(AttrNumber));
> >
> > LGTM, thank you.
> >
>
> LGTM as well. One thing I am not completely sure about is whether to make
> this change in PG10, for which the final release is in Nov? AFAICS, the
> leak can only occur after a relcache invalidation on the subscriber, which
> may or may not be a very frequent case. What do you guys think?
>
> Personally, I feel it is good to fix it in all branches including PG10.

+1

Best regards,
Hou zj
On Wed, Jun 22, 2022 at 5:05 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
>
> On Wednesday, June 22, 2022 7:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Jun 22, 2022 at 10:09 AM Amit Langote <amitlangote09@gmail.com> wrote:
> > >
> > > On Wed, Jun 22, 2022 at 12:02 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> > > > Since the patch has been committed, attached is the last patch, which
> > > > fixes the memory leak.
> > > >
> > > > The bug exists on PG10 ~ PG15 (HEAD).
> > > >
> > > > For HEAD, PG14, and PG13, to fix the memory leak, I think we should
> > > > use free_attrmap() instead of pfree(), and release the
> > > > no-longer-useful attrmap when rebuilding the map info.
> > > >
> > > > For PG12, PG11, and PG10, we only need to add the code to release the
> > > > no-longer-useful attrmap when rebuilding the map info. We can still
> > > > use pfree() because the attrmap in the back branches is a single
> > > > array like:
> > > >
> > > > 	entry->attrmap = palloc(desc->natts * sizeof(AttrNumber));
> > >
> > > LGTM, thank you.
> >
> > LGTM as well. One thing I am not completely sure about is whether to make
> > this change in PG10, for which the final release is in Nov? AFAICS, the
> > leak can only occur after a relcache invalidation on the subscriber,
> > which may or may not be a very frequent case. What do you guys think?
> >
> > Personally, I feel it is good to fix it in all branches including PG10.
>
> +1
>

Pushed!

--
With Regards,
Amit Kapila.