Re: BUG #18815: Logical replication worker Segmentation fault
От | Tom Lane |
---|---|
Тема | Re: BUG #18815: Logical replication worker Segmentation fault |
Дата | |
Msg-id | 1147221.1739843510@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: BUG #18815: Logical replication worker Segmentation fault (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-bugs |
I wrote: > The problem seems to be that apply_handle_insert_internal does > ExecOpenIndices and then ExecCloseIndices, and then > ExecCleanupTupleRouting does ExecCloseIndices again, which nicely > explains why brininsertcleanup blows up if you happen to have a BRIN > index involved. What it doesn't explain is how come we don't see > other symptoms from the duplicate index_close calls, regardless of > index type. Hah, I see it: this bug is specific to the apply_handle_tuple_routing code path. The ResultRelInfo we're dealing with is made by ExecFindPartition, which does ExecOpenIndices on it. Then apply_handle_tuple_routing calls apply_handle_insert_internal, which does ExecOpenIndices *again* on the same ResultRelInfo. That gets a second refcount and second lock on the index(es), and leaks all the IndexInfo data structures made by the first call. When done, apply_handle_insert_internal does ExecCloseIndices, releasing one refcount and lock. Then, back in apply_handle_tuple_routing, we do ExecCloseIndices again. So the refcounts and locks balance out, and it very accidentally fails to crash, so long as nobody is expecting the contents of the IndexInfos to match up. The "Move the tuple into the new partition" path in apply_handle_tuple_routing's UPDATE case is even squirrelier. That does another ExecFindPartition and then apply_handle_insert_internal, but there's no ExecCloseIndices call balancing the ExecFindPartition, meaning it looks like it's leaking refcount and lock. Experimentation shows that it's not, because the matching ExecCloseIndices is eventually done by the ExecCleanupTupleRouting call in finish_edata after returning to the outer apply_handle_update. So if you ask me, this is impossibly convoluted and in need of a significant rewrite. It's not okay to be opening/closing the ResultRelInfo's indexes twice (not least because "speculative" is different in the two open calls). And it's not okay to have such radically different paths for cleaning up the index-opening done by ExecFindPartition. It's even more not okay that there's not one word of commentary about this complexity, strongly suggesting that the original author didn't quite understand how it worked either. regards, tom lane
В списке pgsql-bugs по дате отправления: