Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5
От | vignesh C |
---|---|
Тема | Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 |
Дата | |
Msg-id | CALDaNm0=JNjLqFzcAhujHLH1XQWcu+Jh_WJ6CCuqpjNb-fLnzg@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 (Amit Kapila <amit.kapila16@gmail.com>) |
Список | pgsql-bugs |
On Sat, 31 May 2025 at 10:20, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, May 30, 2025 at 11:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > Thank you for updating the patch. Here are some review comments: > > > > --- > > + /* > > + * Make sure there's no cache pollution. Unlike the PG_TRY part, > > + * this must be done unconditionally because the processing might > > + * fail before we reach invalidation messages. > > + */ > > + if (rbtxn_inval_all_cache(txn)) > > + InvalidateSystemCaches(); > > + else > > + ReorderBufferExecuteInvalidations(txn->ninvalidations_distr, > > + > > txn->distributed_invalidations); > > + > > > > If we don't need to execute the distributed inval message in an error > > path other than detecting concurrent abort, we should describe the > > reason. > > > > The concurrent abort handling is done for streaming and prepared > transactions, where we send the transaction changes to the client > before we read COMMIT/COMMIT PREPARED/ROLLBACK/ROLLBACK PREPARED. Now, > among these COMMIT/ROLLBACK PREPARED cases are handled as part of a > prepared transaction case. For ROLLBACK, we will never perform any > changes from the current transaction, so we don't need distributed > invalidations to be executed. For COMMIT, if we encounter any errors > while processing changes (this is when we reach the ERROR path, which > is not a concurrent abort), then we will reprocess all changes and, at > the end, execute both the current transaction and distributed > invalidations. Now, one possibility is that if, after ERROR, the > caller does slot_advance to skip the ERROR, then we will probably miss > executing the distributed invalidations, leading to data loss > afterwards. If the above theory is correct, then it is better to > execute distributed invalidation even in non-concurrent-abort cases in > the ERROR path. One possible reason this scenario may not occur is that pg_logical_slot_get_changes_guts uses a PG_CATCH block to handle exceptions, during which it calls InvalidateSystemCaches to clear the system cache. Because of this, I believe the scenario might not actually happen. @Sawada-san / others — Are there any other cases where this could still occur? Regards, Vignesh
В списке pgsql-bugs по дате отправления: