Обсуждение: More tests with USING INDEX replident and dropped indexes
Hi all, While working on some other logical decoding patch recently, I bumped into the fact that we have special handling for the case of REPLICA IDENTITY USING INDEX when the dependent index is dropped, where the code handles that case as an equivalent of NOTHING. Attached is a patch to add more coverage for that. Any thoughts? -- Michael
Вложения
On Fri, 22 May 2020 at 12:50, Michael Paquier <michael@paquier.xyz> wrote: > > Hi all, > > While working on some other logical decoding patch recently, I bumped > into the fact that we have special handling for the case of REPLICA > IDENTITY USING INDEX when the dependent index is dropped, where the > code handles that case as an equivalent of NOTHING. > > Attached is a patch to add more coverage for that. Any thoughts? How about avoiding such an inconsistent situation? In that case, replica identity works as NOTHING, but pg_class.relreplident is still ‘i’, confusing users. It seems to me that dropping an index specified by REPLICA IDENTITY USING INDEX is not a valid operation. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Jun 02, 2020 at 04:46:55PM +0900, Masahiko Sawada wrote: > How about avoiding such an inconsistent situation? In that case, > replica identity works as NOTHING, but pg_class.relreplident is still > ‘i’, confusing users. It seems to me that dropping an index specified > by REPLICA IDENTITY USING INDEX is not a valid operation. This looks first like complicating RemoveRelations() or the internal object removal APIs with a dedicated lookup at this index's pg_index tuple, but you could just put that in index_drop when REINDEX CONCURRENTLY is not used. Still, I am not sure if it is worth complicating those code paths. It would be better to get more opinions about that first. -- Michael
Вложения
On Tue, Jun 02, 2020 at 04:46:55PM +0900, Masahiko Sawada wrote:
> How about avoiding such an inconsistent situation? In that case,
> replica identity works as NOTHING, but pg_class.relreplident is still
> ‘i’, confusing users. It seems to me that dropping an index specified
> by REPLICA IDENTITY USING INDEX is not a valid operation.
This looks first like complicating RemoveRelations() or the internal
object removal APIs with a dedicated lookup at this index's pg_index
tuple, but you could just put that in index_drop when REINDEX
CONCURRENTLY is not used. Still, I am not sure if it is worth
complicating those code paths. It would be better to get more
opinions about that first.
relation while dropping the index? relation_mark_replica_identity() already
does that but do other things too. Let's move the first code block from
relation_mark_replica_identity to another function and call this new function
while dropping the index.
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jun 03, 2020 at 12:08:56PM -0300, Euler Taveira wrote: > Consistency is a good goal. Why don't we clear the relreplident from the > relation while dropping the index? relation_mark_replica_identity() already > does that but do other things too. Let's move the first code block from > relation_mark_replica_identity to another function and call this new > function > while dropping the index. I have looked at your suggestion, and finished with the attached. There are a couple of things to be aware of: - For DROP INDEX CONCURRENTLY, pg_class.relreplident of the parent table is set in the last transaction doing the drop. It would be tempting to reset the flag in the same transaction as the one marking the index as invalid, but there could be a point in reindexing the invalid index whose drop has failed, and this adds more complexity to the patch. - relreplident is switched to REPLICA_IDENTITY_NOTHING, which is the behavior we have now after an index is dropped, even if there is a primary key. - The CCI done even if ri_type is not updated in index_drop() may look useless, but the CCI will happen in any case as switching the replica identity of a table to NOTHING resets pg_index.indisreplident for an index previously used. Thoughts? -- Michael
Вложения
Hi,I couldn't test the patch as it does not apply cleanly on master.
Please find below some review comments:
1. Would it better to throw a warning at the time of dropping the REPLICA IDENTITY
index that it would also drop the REPLICA IDENTITY of the parent table?
2. CCI is used after calling SetRelationReplIdent from index_drop() but not
after following callrelation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid,
bool is_internal)
+ /* update relreplident if necessary */
+ SetRelationReplIdent(RelationGetRelid(rel), ri_type);
3. I think the former variable names `pg_class_tuple` and `pg_class_form`provided better clarity
+ HeapTuple tuple;
+ Form_pg_class classtuple;
- relreplident is switched to REPLICA_IDENTITY_NOTHING, which is the behavior we have now after an index is dropped, even if there is a primary key.
4. I am not aware of the reason of the current behavior, however it seems better
to switch to REPLICA_IDENTITY_DEFAULT. Can't think of a reason why user would not be
fine with using primary key as replica identity when replica identity index is dropped
Thank you,
On Wed, Aug 19, 2020 at 05:33:36PM +0530, Rahila Syed wrote: > I couldn't test the patch as it does not apply cleanly on master. The CF bot is green, and I can apply v2 cleanly FWIW: http://commitfest.cputube.org/michael-paquier.html > Please find below some review comments: > > 1. Would it better to throw a warning at the time of dropping the REPLICA > IDENTITY index that it would also drop the REPLICA IDENTITY of the > parent table? Not sure that's worth it. Updating relreplident is a matter of consistency. > 2. CCI is used after calling SetRelationReplIdent from index_drop() but not > after following call > > relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid, > bool is_internal) > > + /* update relreplident if necessary */ > + SetRelationReplIdent(RelationGetRelid(rel), ri_type); Yes, not having a CCI here is the intention, because it is not necessary. That's not the case in index_drop() where the pg_class entry of the parent table gets updated afterwards. > 3. I think the former variable names `pg_class_tuple` and > `pg_class_form`provided better clarity > + HeapTuple tuple; > > + Form_pg_class classtuple; This is chosen to be consistent with SetRelationHasSubclass(). > 4. I am not aware of the reason of the current behavior, however it seems > better > > to switch to REPLICA_IDENTITY_DEFAULT. Can't think of a reason why user > would not be fine with using primary key as replica identity when > replica identity index is dropped Using NOTHING is more consistent with the current behavior we have since 9.4 now. There could be an argument for switching back to the default, but that could be surprising to change an old behavior. -- Michael
Вложения
Hi, >> I couldn't test the patch as it does not apply cleanly on master. > The CF bot is green, and I can apply v2 cleanly FWIW: > http://commitfest.cputube.org/michael-paquier.html Sorry, I didn't apply correctly. The tests pass for me. In addition, I tested with partitioned tables. It works as expected and makes the REPLICA IDENTITY 'n' for the partitions as well when an index on a partitioned table is dropped. Thank you,
On Tue, Aug 25, 2020 at 08:59:37PM +0530, Rahila wrote: > Sorry, I didn't apply correctly. The tests pass for me. In addition, I > tested with partitioned tables. It works as expected and makes the REPLICA > IDENTITY 'n' for the partitions as well when an index on a partitioned > table is dropped. Indeed. I have added a test for that. While looking at this patch again, I noticed that the new tests for contrib/test_decoding/ and the improvements for relreplident are really two different bullet points, and that the new tests should not be impacted as long as we switch to NOTHING the replica identity once its index is dropped. So, for now, I have applied the new decoding tests with a first commit, and attached is an updated patch which includes tests in the main regression test suite for replica_identity.sql, which is more consistent with the rest as that's the main place where we look at the state of pg_class.relreplident. I'll go through this part again tomorrow, it is late here. -- Michael
Вложения
On Wed, Aug 26, 2020 at 09:08:51PM +0900, Michael Paquier wrote: > I'll go through this part again tomorrow, it is late here. There is actually a huge gotcha here that exists since replica identities exist: relation_mark_replica_identity() only considers valid indexes! So, on HEAD, if DROP INDEX CONCURRENTLY fails in the second transaction doing the physical drop, then we would finish with a catalog entry that has indisreplident and !indisinvalid. If you reindex it afterwards and switch the index back to be valid, there can be more than one valid index marked with indisreplident, which messes up with the logic in tablecmds.c because we use RelationGetIndexList(), that discards invalid indexes. This case is actually rather similar to the handling of indisclustered. So we have a set of two issues here: 1) indisreplident should be switched to false when we clear the valid flag of an index for INDEX_DROP_CLEAR_VALID. This issue exists since 9.4. 2) To never finish in a state where we have REPLICA IDENTITY INDEX without an index marked as indisreplident, we need to reset the replident of the parent table in the same transaction as the one clearing indisvalid for the concurrent case. That's a problem of the patch proposed upthread. Attached is a patch for 1) and 2) grouped together, to ease review for now. I think that we had better fix 1) separately though, so I am going to start a new thread about that with a separate patch as the current thread is misleading. -- Michael
Вложения
On Thu, Aug 27, 2020 at 11:28:35AM +0900, Michael Paquier wrote: > Attached is a patch for 1) and 2) grouped together, to ease review for > now. I think that we had better fix 1) separately though, so I am > going to start a new thread about that with a separate patch as the > current thread is misleading. A fix for consistency problem with indisreplident and invalid indexes has been committed as of 9511fb37. Attached is a rebased patch, where I noticed two incorrect things: - The comment of REPLICA_IDENTITY_INDEX is originally incorrect. If the index is dropped, the replica index switches back to nothing. - relreplident was getting reset one transaction too late, when the old index is marked as dead. -- Michael
Вложения
Hi Michael,
Thanks for updating the patch.
Please see following comments,
1.
*/
if (resetreplident)
{
SetRelationReplIdent(heapId, REPLICA_IDENTITY_NOTHING);
/* make the update visible, only for the non-concurrent case */
if (!concurrent)
CommandCounterIncrement();
}
/* continue the concurrent process */
if (concurrent)
{
PopActiveSnapshot();
CommitTransactionCommand();
StartTransactionCommand();
Now, the relreplident is being set in the transaction previous to
the one marking index as invalid for concurrent drop. Won't
it cause issues with relreplident cleared but index not dropped,
if drop index concurrently fails in the small window after
commit transaction in above snippet and before the start transaction above?
Although, it seems like a really small window.
2. I have following suggestion for the test. To the existing partitioned table test, can we add a test to demonstrate
that relreplident is set to 'n' for even the individual partitions.
I mean explicitly setting replica identity index for individual partitions
ALTER TABLE part1 REPLICA IDENTITY
USING INDEX test_replica_part_index_part_1;
and checking for relreplident for individual partition after parent index is dropped.
SELECT relreplident FROM pg_class WHERE oid = 'part1'::regclass;
3.
+* Again, commit the transaction to make the pg_index and pg_class
+ * (in the case where indisreplident is set) updates are visible to
+ * other sessions.
Typo, s/updates are visible/updates visible
4. I am wondering with the recent change to move the SetRelationRepldent
together with invalidating index, whether your initial concern stated as follows
becomes valid.
- For DROP INDEX CONCURRENTLY, pg_class.relreplident of the parent
table is set in the last transaction doing the drop. It would be
tempting to reset the flag in the same transaction as the one marking
the index as invalid, but there could be a point in reindexing the
invalid index whose drop has failed, and this adds more complexity to
the patch.
I will try to test that.
Thank you,
Rahila Syed
On Mon, Aug 31, 2020 at 10:36:13AM +0530, Rahila wrote: > Now, the relreplident is being set in the transaction previous to > the one marking index as invalid for concurrent drop. Won't > it cause issues with relreplident cleared but index not dropped, > if drop index concurrently fails in the small window after > commit transaction in above snippet and before the start transaction above? Argh. I missed your point that this stuff uses heap_inplace_update(), so the update of indisvalid flag is not transactional. The thing is that we won't be able to update the flag consistently with relreplident except if we switch index_set_state_flags() to use a transactional operation here. So, with the current state of affairs, it looks like we could just call SetRelationReplIdent() in the last transaction, after the transaction marking the index as dead has committed (see the top of index_set_state_flags() saying that this should be the last step of a transaction), but that leaves a window open as you say. On the other hand, it looks appealing to make index_set_state_flags() transactional. This would solve the consistency problem, and looking at the code scanning pg_index, I don't see a reason why we could not do that. What do you think? > To the existing partitioned table test, can we add a test to demonstrate > that relreplident is set to 'n' for even the individual partitions. Makes sense. It is still important to test the case where a partitioned table without leaves is correctly reset IMO. > Typo, s/updates are visible/updates visible Thanks. > - For DROP INDEX CONCURRENTLY, pg_class.relreplident of the parent > table is set in the last transaction doing the drop. It would be > tempting to reset the flag in the same transaction as the one marking > the index as invalid, but there could be a point in reindexing the > invalid index whose drop has failed, and this adds morecomplexity to > the patch. That's a point I studied, but it makes actually more sense to just reset the flag once the index is marked as invalid, similarly to indisclustered. Reindexing an invalid index can have value in some cases, but we would have much more problems with the relcache if we finish with two indexes marked as indreplident :( -- Michael
Вложения
On the other hand, it looks appealing to make index_set_state_flags()
transactional. This would solve the consistency problem, and looking
at the code scanning pg_index, I don't see a reason why we could not
do that. What do you think?
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Sep 02, 2020 at 09:27:33AM +0530, Rahila Syed wrote: > TBH, I am not sure. I think it is a reasonable change. It is even > indicated in the > comment above index_set_state_flags() that it can be made transactional. > At the same time, probably we can just go ahead with current > inconsistent update of relisreplident and indisvalid flags. Can't see what > will break with that. Yeah, that's true as well. Still, I would like to see first if people are fine with changing this code path to be transactional. This way, we will have zero history in the tree where there was a risk for an inconsistent window. -- Michael
Вложения
On Wed, Sep 02, 2020 at 03:00:44PM +0900, Michael Paquier wrote: > Yeah, that's true as well. Still, I would like to see first if people > are fine with changing this code path to be transactional. This way, > we will have zero history in the tree where there was a risk for an > inconsistent window. So, I have begun a thread about that part with a dedicated CF entry: https://commitfest.postgresql.org/30/2725/ While considering more this point, I think that making this code path transactional is the correct way to go, as a first step. Also, as this thread is about adding more tests and that this has been done with fe7fd4e9, I am marking the CF entry as committed. Let's discuss the reset of relreplident for the parent relation when its replica identity index is dropped once the transactional part is fully discussed, on a new thread. -- Michael