Re: [HACKERS] logical decoding of two-phase transactions

From: Robert Haas
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Date:
Msg-id: CA+Tgmoa1c7DL6ANf1Z3YfneVSnALMJW0soMXkVq0dSakGDG0DQ@mail.gmail.com
In reply to: Re: [HACKERS] logical decoding of two-phase transactions  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses: Re: [HACKERS] logical decoding of two-phase transactions  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List: pgsql-hackers
On Mon, Jul 16, 2018 at 1:28 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> I'm not sure I understand. Are you suggesting the process might get killed
> or something, thanks to the CHECK_FOR_INTERRUPTS() call?

Yes.  CHECK_FOR_INTERRUPTS() can certainly lead to a non-local
transfer of control.
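
To see why, here is the relevant machinery, heavily abridged from
src/include/miscadmin.h and src/backend/tcop/postgres.c (the real
ProcessInterrupts() handles many more cases):

    /* Simplified form of the real macro in miscadmin.h. */
    #define CHECK_FOR_INTERRUPTS() \
    do { \
        if (InterruptPending) \
            ProcessInterrupts(); \
    } while (0)

    /*
     * ProcessInterrupts(), heavily abridged.  ereport(ERROR) siglongjmp()s
     * to the backend's error handler and ereport(FATAL) terminates the
     * process; either way, control never returns to the code that invoked
     * CHECK_FOR_INTERRUPTS().
     */
    void
    ProcessInterrupts(void)
    {
        if (ProcDiePending)
            ereport(FATAL,
                    (errcode(ERRCODE_ADMIN_SHUTDOWN),
                     errmsg("terminating connection due to administrator command")));
        if (QueryCancelPending)
            ereport(ERROR,
                    (errcode(ERRCODE_QUERY_CANCELED),
                     errmsg("canceling statement due to user request")));
        /* ... many other cases elided ... */
    }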

> But BackendXidGetProc() internally acquires ProcArrayLock, of course. It's
> true there are a few places where we do != NULL checks on the result without
> holding any lock, but I don't see why that would be a problem? And before
> actually inspecting the contents, the code always does
> LockHashPartitionLockByProc.

I think at least some of those cases are a problem.  See below...
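
For concreteness, the pattern I'd expect every caller of
BackendXidGetProc() to follow looks roughly like this (a sketch, not the
patch's actual code; the re-validation step is the part that matters):

    PGPROC     *proc;
    LWLock     *leader_lwlock;

    /*
     * BackendXidGetProc() takes and releases ProcArrayLock internally, so
     * the pointer it returns is only known to have been right at that
     * instant; the backend could exit or move on to another transaction
     * immediately afterwards.
     */
    proc = BackendXidGetProc(xid);
    if (proc == NULL)
        return;                 /* nobody had that XID -- at that moment */

    /* Take the lock that protects the fields we intend to touch ... */
    leader_lwlock = LockHashPartitionLockByProc(proc);
    LWLockAcquire(leader_lwlock, LW_EXCLUSIVE);

    /*
     * ... and re-verify, under that lock, that this PGPROC still belongs
     * to the transaction we looked up (exactly how depends on the patch)
     * before reading or writing any decode-group fields.
     */

    LWLockRelease(leader_lwlock);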

> I don't follow. How could one process put another process into a decoding
> group? I don't think that's possible.

Isn't that exactly what AssignDecodeGroupLeader() is doing?  It looks
up the process that currently has that XID, then turns that process
into a decode group leader.  Then after that function returns, the
caller adds itself to the decode group as well.  So it seems entirely
possible for somebody to swing the decodeGroupLeader pointer for a
PGPROC from NULL to some other value at an arbitrary point in time.
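
Here's a hypothetical shape of that function, just to show where the
pointer gets swung from another process (my sketch based on the
description above, not the patch's actual code):

    PGPROC *
    AssignDecodeGroupLeader(TransactionId xid)
    {
        PGPROC     *proc = BackendXidGetProc(xid);
        LWLock     *leader_lwlock;

        if (proc == NULL)
            return NULL;

        leader_lwlock = LockHashPartitionLockByProc(proc);
        LWLockAcquire(leader_lwlock, LW_EXCLUSIVE);

        /*
         * Another backend turns proc into a decode group leader here.  From
         * proc's own point of view, its decodeGroupLeader field goes from
         * NULL to non-NULL at an arbitrary moment -- possibly while it is
         * inside CHECK_FOR_INTERRUPTS() or already aborting -- so any
         * unlocked test of that field elsewhere can see either value.
         */
        if (proc->decodeGroupLeader == NULL)
            proc->decodeGroupLeader = proc;

        LWLockRelease(leader_lwlock);

        return proc;
    }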

> I'm not sure about the 'unsalvageable' part, but it needs more work, that's
> for sure. Unfortunately, all previous attempts to make this work in various
> other ways failed (see past discussions in this thread), so this is the only
> approach left :-( So let's see if we can make it work.

I think that's probably not going to work out, but of course it's up
to you how you want to spend your time!

After thinking about it a bit more, if you want to try to stick with
this design, I don't think that this decode group leader/members thing
has much to recommend it.  In the case of parallel query, the point of
the lock group stuff is to treat all of those processes as one for
purposes of heavyweight lock acquisition.  There's no similar need
here, so the design that makes sure the "leader" is in the list of
processes that are members of the "group" is, AFAICS, just wasted
code.  All you really need is a list of processes hung off of the
PGPROC that must abort before the leader is allowed to abort; the
leader itself doesn't need to be in the list, and there's no need to
consider it as a "group".  It's just a list of waiters.
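
Something like this is all I mean, with field and function names invented
purely for illustration (they are not from the patch):

    /*
     * Fields that would be added to PGPROC:
     *
     *   dlist_head  decodeWaiters;     decoding processes that must finish
     *                                  before this transaction may abort
     *   dlist_node  decodeWaiterLink;  our own link when we are the waiter
     *
     * The transaction's backend never appears on its own list, so there is
     * no "group" and no leader bookkeeping to maintain.
     */
    static void
    RegisterAsDecodeWaiter(PGPROC *xactProc)
    {
        LWLock     *partition_lock = LockHashPartitionLockByProc(xactProc);

        LWLockAcquire(partition_lock, LW_EXCLUSIVE);
        dlist_push_tail(&xactProc->decodeWaiters, &MyProc->decodeWaiterLink);
        LWLockRelease(partition_lock);
    }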

That having been said, I still don't see how that's really going to
work.  Just to take one example, suppose that the leader is trying to
ERROR out, and the decoding workers are blocked waiting for a lock
held by the leader.  The system has no way of detecting this deadlock
and resolving it automatically, which certainly seems unacceptable.
The only way that's going to work is if the leader waits for the
worker by trying to acquire a lock held by the worker.  Then the
deadlock detector would know to abort some transaction.  But that
doesn't really work either - the deadlock was created by the
foreground process trying to abort, and if the deadlock detector
chooses that process as its victim, what then?  We're already trying
to abort, and the abort code isn't supposed to throw further errors,
or fail in any way, lest we break all kinds of other things.  Not to
mention the fact that running the deadlock detector in the abort path
isn't really safe to begin with, again because we can't throw errors
when we're already in an abort path.

If we're only ever talking about decoding prepared transactions, we
could probably work around all of these problems: have the decoding
process take a heavyweight lock before it begins decoding.  Have a
process that wants to execute ROLLBACK PREPARED take a conflicting
heavyweight lock on the same object.  The net effect would be that
ROLLBACK PREPARED would simply wait for decoding to finish.  That
might be rather lousy from a latency point of view since the
transaction could take an arbitrarily long time to decode, but it
seems safe enough.  Possibly you could also design a mechanism for the
ROLLBACK PREPARED command to SIGTERM the processes that are blocking
its lock acquisition, if they are decoding processes.  The difference
between this and what the current patch is doing is that nothing
complex or fragile is happening in the abort pathway itself.  The
complicated stuff in both the worker and in the main backend happens
while the transaction is still good and can still be rolled back at
need.  This kind of approach won't work if you want to decode
transactions that aren't yet prepared, so if that is the long term
goal then we need to think harder.  I'm honestly not sure that problem
has any reasonable solution.  The assumption that a running process
can abort at any time is deeply baked into many parts of the system
and for good reasons.  Trying to undo that is going to be like trying
to push water up a hill.  I think we need to install interlocks in
such a way that any waiting happens before we enter the abort path,
not while we're actually trying to perform the abort.  But I don't
know how to do that for a foreground task that's still actively doing
stuff.
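
To be concrete about the interlock for the prepared-transaction-only case,
what I have in mind is roughly this, using the existing lock manager (the
lock tag is invented for illustration -- a real patch would probably add a
dedicated locktag type -- and prepared_xid stands for the prepared
transaction's XID):

    LOCKTAG     tag;

    #define DECODE_LOCK_CLASS  9999     /* hypothetical object class */
    SET_LOCKTAG_OBJECT(tag, MyDatabaseId, DECODE_LOCK_CLASS, prepared_xid, 0);

    /*
     * Decoding process: take the lock in a weak mode before decoding
     * starts; any number of decoders can hold it concurrently.
     */
    (void) LockAcquire(&tag, ShareLock, false, false);
    /* ... decode the prepared transaction ... */
    LockRelease(&tag, ShareLock, false);

    /*
     * ROLLBACK PREPARED path, before any irreversible abort work begins:
     * a conflicting mode makes the command wait until all decoders are
     * done.  (Alternatively, pass dontWait = true and SIGTERM the blockers
     * if they turn out to be decoding processes.)
     */
    (void) LockAcquire(&tag, AccessExclusiveLock, false, false);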

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

