RE: [bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber
От | Zhijie Hou (Fujitsu) |
---|---|
Тема | RE: [bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber |
Дата | |
Msg-id | OS0PR01MB5716AB0BE911D79120DDE28D94B92@OS0PR01MB5716.jpnprd01.prod.outlook.com обсуждение исходный текст |
Ответ на | Re: [bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber (shveta malik <shveta.malik@gmail.com>) |
Ответы |
Re: [bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber
|
Список | pgsql-hackers |
On Thursday, August 8, 2024 6:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Thu, Aug 8, 2024 at 12:03 PM Amit Kapila <amit.kapila16@gmail.com> > wrote: > > > > On Thu, Aug 8, 2024 at 10:37 AM Hayato Kuroda (Fujitsu) > > <kuroda.hayato@fujitsu.com> wrote: > > > > > ... > > > > > > An easiest fix is to reset session replication origin before calling > > > the RecordTransactionAbort(). I think this can happen when 1) > > > LogicalRepApplyLoop() raises an ERROR or 2) apply worker exits. > Attached patch can fix the issue on HEAD. > > > > > > > Few comments: > > ============= > > * > > @@ -4409,6 +4409,14 @@ start_apply(XLogRecPtr origin_startpos) > > } > > PG_CATCH(); > > { > > + /* > > + * Reset the origin data to prevent the advancement of origin > > + progress > > + * if the transaction failed to apply. > > + */ > > + replorigin_session_origin = InvalidRepOriginId; > > + replorigin_session_origin_lsn = InvalidXLogRecPtr; > > + replorigin_session_origin_timestamp = 0; > > > > Can't we call replorigin_reset() instead here? > > > > * > > + /* > > + * Register a callback to reset the origin state before aborting the > > + * transaction in ShutdownPostgres(). This is to prevent the > > + advancement > > + * of origin progress if the transaction failed to apply. > > + */ > > + before_shmem_exit(replorigin_reset, (Datum) 0); > > > > I think we need this despite resetting the origin-related variables in > > PG_CATCH block to handle FATAL error cases, right? If so, can we use > > PG_ENSURE_ERROR_CLEANUP() instead of PG_CATCH()? > > +1 > > Basic tests work fine on this patch. Just thinking out loud, > SetupApplyOrSyncWorker() is called for table-sync worker as well and IIUC > tablesync worker does not deal with 2PC txns. So do we even need to register > replorigin_reset() for tablesync worker as well? If we may hit such an issue in > general, then perhaps we need it in table-sync worker otherwise not. It > needs some investigation. Thoughts? I think this is a general issue that can occur not only due to 2PC. IIUC, this problem should arise if any ERROR arises after setting the replorigin_session_origin_lsn but before the CommitTransactionCommand is completed. If so, I think we should register it for tablesync worker as well. Best Regards, Hou zj
В списке pgsql-hackers по дате отправления: