Re: [HACKERS] SERIALIZABLE on standby servers

Поиск
Список
Период
Сортировка
От Thomas Munro
Тема Re: [HACKERS] SERIALIZABLE on standby servers
Дата
Msg-id CA+hUKGLRUwKNs0w+YzfsNF62O0L_gfbTFOJND1yMTorg95rJOw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] SERIALIZABLE on standby servers  (Michael Paquier <michael@paquier.xyz>)
Список pgsql-hackers
On Mon, Jun 15, 2020 at 5:00 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Fri, Dec 28, 2018 at 02:21:44PM +1300, Thomas Munro wrote:
> > Just to be clear, although this patch is registered in the commitfest
> > and currently applies and tests pass, it is prototype/WIP code with
> > significant problems that remain to be resolved.  I sort of wish there
> > were a way to indicate that in the CF but since there isn't, I'm
> > saying that here.  What I hope to get from Kevin, Simon or other
> > reviewers is some feedback on the general approach and problems
> > discussed upthread (and other problems and better ideas I might have
> > missed).  So it's not seriously proposed for commit in this CF.
>
> No feedback has actually come, so I have moved it to next CF.

Having been nerd-sniped by SSI again, I spent some time this weekend
rebasing this old patch, making a few improvements, and reformulating
the problems to be solved as I see them.  It's very roughly based on
Kevin Grittner and Dan Ports' description of how you could give
SERIALIZABLE a useful meaning on hot standbys.  The short version of
the theory is that you can make it work like SERIALIZABLE READ ONLY
DEFERRABLE by adding a bit of extra information into the WAL stream.

Problems:

1.  As a prerequisite, we'd need to teach primary servers to make
transactions visible in the same order that they log commits.
Otherwise, we permit nonsense like seeing TX1 but not TX2 on the
primary, and TX2 but not TX1 on the replica.  You can probably argue
that our read replicas don't satisfy the lower isolation levels, let
alone serializable.

2.  Similarly, it's probably not OK that
PreCommit_CheckForSerializationFailure() determines
MySerializableXact->snapshotSafetyAfterThisCommit.  That may not
happen in exactly the same order as commits are logged.  Or maybe
there is some argument for why that is OK, based on what we're doing
with prepareSeqNo, or maybe we can do something with that to detect
disorder.

3.  The patch doesn't yet attempt to checkpoint the snapshot safety
state.  That's needed to start up in a sane state, without having to
wait for WAL activity.

4.  XactLogSnapshotSafetyRecord() flushes the WAL an extra time after
a commit is flushed, which I put in for testing; that's silly...
somehow it needs to be better integrated so we don't generate two sync
I/Os in a row.

5.  You probably want a way to turn off the extra WAL records and
SERIALIZABLEXACT consumption if you're using SERIALIZABLE on a primary
but not on the standby.  Or maybe there is some way to make it come on
automatically.

I think I have cleared up the matter of xmin tracking for
"hypothetical" SERIALIZABLEXACTs mentioned earlier.  It's not needed,
so should be set to InvalidTransactionId, and I added a comment to
explain why.

I also wrote a TAP test to exercise this thing.  It is the same
schedule as src/test/isolation/specs/read-only-anomaly-3.spec, except
that transaction 3 runs on a streaming replica.

One thing to point out is that this patch only aims to make it so that
streaming replicas can't observe a state that would have caused a
transaction to abort if it had been observed on the primary.  The TAP
test still has to insert its own wait-for-LSN loop to make sure step
"s1c" is replayed before "s3r" runs.  We could use
synchronous_commit=remote_apply, and that'd probably work just as well
for this particular test, but I'm not sure how to square that with
fixing problem #1 above.

The perl hackery I used to do overlapping transactions in a TAP test
is pretty crufty.  I guess we'd ideally have the isolation tester
support per-session connection strings, and somehow get some perl code
to orchestrate the cluster setup but then run the real isolation
tester.  Or something like that.

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kyotaro Horiguchi
Дата:
Сообщение: Re: min_safe_lsn column in pg_replication_slots view
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: doc examples for pghandler