Re: [HACKERS] logical decoding of two-phase transactions

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: [HACKERS] logical decoding of two-phase transactions
Дата
Msg-id 9cac2837-7e58-1cad-3940-27ac0dd7c198@2ndquadrant.com
обсуждение исходный текст
Ответ на Re: [HACKERS] logical decoding of two-phase transactions  (Nikhil Sontakke <nikhils@2ndquadrant.com>)
Ответы Re: [HACKERS] logical decoding of two-phase transactions  (Nikhil Sontakke <nikhils@2ndquadrant.com>)
Список pgsql-hackers
Hi,

I think the patch looks mostly fine. I'm about to do a bit more testing
on it, but a few comments. Attached diff shows which the discussed
places / comments more closely.


1) There's a race condition in LogicalLockTransaction. The code does
roughly this:

  if (!BecomeDecodeGroupMember(...))
     ... bail out ...

  Assert(MyProc->decodeGroupLeader);
  lwlock = LockHashPartitionLockByProc(MyProc->decodeGroupLeader);
  ...

but AFAICS there is no guarantee that the transaction does not commit
(or even abort) right after the become decode group member. In which
case LogicalDecodeRemoveTransaction might have already reset our pointer
to a leader to NULL. In which case the Assert() and lock will fail.

I've initially thought this can be fixed by setting decodeLocked=true in
BecomeDecodeGroupMember, but that's not really true - that would fix the
race for aborts, but not commits. LogicalDecodeRemoveTransaction skips
the wait for commits entirely, and just resets the flags anyway.

So this needs a different fix, I think. BecomeDecodeGroupMember also
needs the leader PGPROC pointer, but it does not have the issue because
it gets it as a parameter. I think the same thing would work for here
too - that is, use the AssignDecodeGroupLeader() result instead.


2) BecomeDecodeGroupMember sets the decodeGroupLeader=NULL when the
leader does not match the parameters, despite enforcing it by Assert()
at the beginning. Let's remove that assignment.


3) I don't quite understand why BecomeDecodeGroupMember does the
cross-check using PID. In which case would it help?


4) AssignDecodeGroupLeader still sets pid, which is never read. Remove.


5) ReorderBufferCommitInternal does elog(LOG) about interrupting the
decoding of aborted transaction only in one place. There are about three
other places where we check LogicalLockTransaction. Seems inconsistent.


6) The comment before LogicalLockTransaction is somewhat inaccurate,
because it talks about adding/removing the backend to the group, but
that's not what's happening. We join the group on the first call and
then we only tweak the decodeLocked flag.


7) I propose minor changes to a couple of comments.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Add support for printing/reading MergeAction nodes
Следующее
От: Teodor Sigaev
Дата:
Сообщение: Re: [HACKERS] GUC for cleanup indexes threshold.