Re: [BUG] Panic due to incorrect missingContrecPtr after promotion

Поиск
Список
Период
Сортировка
От Imseih (AWS), Sami
Тема Re: [BUG] Panic due to incorrect missingContrecPtr after promotion
Дата
Msg-id EF51690F-5006-47C4-9473-9DFAFC0B1778@amazon.com
обсуждение исходный текст
Ответ на Re: [BUG] Panic due to incorrect missingContrecPtr after promotion  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Ответы Re: [BUG] Panic due to incorrect missingContrecPtr after promotion
Список pgsql-hackers
>    After some investigation, I finally concluded that we should reset
>    abortedRecPtr and missingContrecPtr at processing
>    XLOG_OVERWRITE_CONTRECORD.


 >   --- a/src/backend/access/transam/xlogrecovery.c
 >   +++ b/src/backend/access/transam/xlogrecovery.c
 >   @@ -1953,6 +1953,11 @@ xlogrecovery_redo(XLogReaderState *record, TimeLineID replayTLI)
                                                    LSN_FORMAT_ARGS(xlrec.overwritten_lsn),
                                                    timestamptz_to_str(xlrec.overwrite_time))));

 >   +               /* We have safely skipped the aborted record */
 >   +               abortedRecPtr = InvalidXLogRecPtr;
 >   +               missingContrecPtr = InvalidXLogRecPtr;
 >   +
 >                   /* Verifying the record should only happen once */
 >                   record->overwrittenRecPtr = InvalidXLogRecPtr;
 >           }

>    The last check in the test against "resetting aborted record" is no
>    longer useful since it is already checked by
>    026_verwrite_contrecord.pl.

>    regards.

+1 for this. Resetting abortedRecPtr and missingContrecPtr after the broken record is skipped in is cleaner. I also
thinkit would make sense to move the " successfully skipped missing contrecord" to after we invalidate the aborted and
missingrecord pointers, like below.
 

+++ b/src/backend/access/transam/xlogrecovery.c
@@ -1948,15 +1948,15 @@ xlogrecovery_redo(XLogReaderState *record, TimeLineID replayTLI)
                                 LSN_FORMAT_ARGS(xlrec.overwritten_lsn),
                                 LSN_FORMAT_ARGS(record->overwrittenRecPtr));

+               /* We have safely skipped the aborted record */
+               abortedRecPtr = InvalidXLogRecPtr;
+               missingContrecPtr = InvalidXLogRecPtr;
+
                ereport(LOG,
                                (errmsg("successfully skipped missing contrecord at %X/%X, overwritten at %s",
                                                LSN_FORMAT_ARGS(xlrec.overwritten_lsn),
                                                timestamptz_to_str(xlrec.overwrite_time))));

Also, instead of a new test, the current 026_overwrite_contrecord.pl test can include a promotion test at the end.

Attaching a new patch for review.

 --
Sami Imseih
Amazon Web Services


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: [PATCH] Expose port->authn_id to extensions and triggers
Следующее
От: Melanie Plageman
Дата:
Сообщение: Re: why do hash index builds use smgrextend() for new splitpoint pages