pgsql: Fix timeline assignment in checkpoints with 2PC transactions

From: Michael Paquier
Subject: pgsql: Fix timeline assignment in checkpoints with 2PC transactions
Msg-id: E1lO7Xt-0007QE-Vq@gemulon.postgresql.org
List: pgsql-committers
Fix timeline assignment in checkpoints with 2PC transactions

Any transactions found as still prepared by a checkpoint have their
state data read from the WAL records generated by PREPARE TRANSACTION
before being moved into their new location within pg_twophase/.  While
reading such records, the WAL reader uses read_local_xlog_page() to
read a page, a callback that is shared across various parts of the
system.  Since 1148e22a, this callback updates ThisTimeLineID when
reading a record while in recovery, which is potentially helpful in the
context of cascading WAL senders.

This update of ThisTimeLineID interacts badly with the checkpointer if a
promotion happens while some 2PC data is read from its record: once
ThisTimeLineID has been changed, any follow-up WAL records would be
written to a timeline older than the promoted one.  This results in
consistency issues.  For instance, a subsequent server restart would
fail to find a valid checkpoint record, resulting in a PANIC.

This commit changes the code reading the 2PC data to reset the timeline
once the 2PC record has been read, to avoid disturbing the static state
of the checkpointer.  It would be tempting to do the same thing
directly in read_local_xlog_page().  However, based on the discussion
that led to 1148e22a, users may rely on the updates of ThisTimeLineID
when a WAL record page is read in recovery, so changing this callback
could break some cases that currently work.
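
The save-and-restore idea described above can be illustrated with a
small standalone model.  This is only a sketch, not the code from
twophase.c: the global variable here is a toy stand-in for the backend's
ThisTimeLineID, and toy_read_page(), toy_read_2pc_unsafe() and
toy_read_2pc_safe() are hypothetical names invented for illustration.
The page-read stand-in is assumed to overwrite the global timeline with
the replayed one whenever it is called "in recovery".

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Toy stand-in for the backend-global ThisTimeLineID (assumption). */
TimeLineID ThisTimeLineID = 0;

/*
 * Hypothetical stand-in for the page-read callback: when called while
 * "in recovery", it overwrites the global timeline with the timeline
 * being replayed, as a side effect of reading.
 */
void
toy_read_page(bool in_recovery, TimeLineID replay_tli)
{
    if (in_recovery)
        ThisTimeLineID = replay_tli;
}

/*
 * Read 2PC data without protection: the side effect of the callback
 * leaks into the caller's global state.
 */
void
toy_read_2pc_unsafe(bool in_recovery, TimeLineID replay_tli)
{
    toy_read_page(in_recovery, replay_tli);
}

/*
 * Read 2PC data with the timeline saved beforehand and restored right
 * after the read, so the caller's global state is left untouched.
 */
void
toy_read_2pc_safe(bool in_recovery, TimeLineID replay_tli)
{
    TimeLineID  save_tli = ThisTimeLineID;

    toy_read_page(in_recovery, replay_tli);

    /* Undo whatever the callback did to the global timeline. */
    ThisTimeLineID = save_tli;
}
```

In this model, a caller that has already switched to timeline 2 and then
reads a record replayed from timeline 1 ends up back on timeline 1 with
the unsafe variant, but stays on timeline 2 with the safe one.  Keeping
the restore in the caller rather than in the callback leaves the
callback's behavior unchanged for any other user that depends on it.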

A TAP test reproducing the issue is added, relying on a PITR to
precisely trigger a promotion with a prepared transaction still
tracked.

Per discussion with Heikki Linnakangas, Kyotaro Horiguchi, Fujii Masao
and myself.

Author: Soumyadeep Chakraborty, Jimmy Yih, Kevin Yeap
Discussion: https://postgr.es/m/CAE-ML+_EjH_fzfq1F3RJ1=XaaNG=-Jz-i3JqkNhXiLAsM3z-Ew@mail.gmail.com
Backpatch-through: 10

Branch
------
REL_13_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/6e5ce888ad1e7b7da5de507d89d03bc83d954923

Modified Files
--------------
src/backend/access/transam/twophase.c         | 15 ++++-
src/test/recovery/t/023_pitr_prepared_xact.pl | 89 +++++++++++++++++++++++++++
2 files changed, 103 insertions(+), 1 deletion(-)

