Re: Transactions involving multiple postgres foreign servers, take 2

From Kyotaro Horiguchi
Subject Re: Transactions involving multiple postgres foreign servers, take 2
Date
Msg-id 20201014.171056.1853173364418725135.horikyota.ntt@gmail.com
In reply to Re: Transactions involving multiple postgres foreign servers, take 2 (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
Responses Re: Transactions involving multiple postgres foreign servers, take 2 (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
List pgsql-hackers
(v26 fails on the current master)

At Wed, 14 Oct 2020 13:52:49 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in 
> On Wed, 14 Oct 2020 at 13:19, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> >
> > At Wed, 14 Oct 2020 12:09:34 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in
> > > On Wed, 14 Oct 2020 at 10:16, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> > > > There are cases of commit-failure of a local transaction caused by
> > > > too-many notifications or by serialization failure.
> > >
> > > Yes, even if that happens we are still able to rollback all foreign
> > > transactions.
> >
> > Mmm. I'm confused. If this is about the 2pc-commit-request (or prepare)
> > phase, we can roll back the remote transactions. But I think we're
> > focusing on the 2pc-commit phase. Remote transactions that have already
> > been 2pc-committed can no longer be rolled back.
> 
> You mentioned a failure of local commit, right? With the current
> approach, we prepare all foreign transactions first and then commit
> the local transaction. After committing the local transaction we
> commit the prepared foreign transactions. So suppose a serialization
> failure happens during committing the local transaction, we still are
> able to roll back foreign transactions. The check of serialization
> failure of the foreign transactions has already been done at the
> prepare phase.

Understood.
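
To check my understanding of that ordering, here is a minimal sketch in Python, purely for illustration. Participant and commit_distributed are hypothetical names, not anything in the patch set; the only thing taken from the quoted text is the sequence "prepare all foreign transactions, commit the local transaction, then commit the prepared foreign transactions".

```python
class Participant:
    """Hypothetical stand-in for one transaction participant."""

    def __init__(self, name, fail_on=None):
        self.name = name
        self.fail_on = fail_on  # step name at which this participant fails
        self.state = "active"

    def _step(self, step, new_state):
        if self.fail_on == step:
            raise RuntimeError(f"{self.name}: {step} failed")
        self.state = new_state

    def prepare(self):
        self._step("prepare", "prepared")

    def commit(self):
        self._step("commit", "committed")

    def rollback(self):
        self.state = "aborted"


def commit_distributed(local, foreign):
    """Prepare all foreign, commit local, then commit the prepared ones."""
    try:
        for f in foreign:
            f.prepare()   # remote-side checks are assumed to happen here
        local.commit()    # can still fail, e.g. on serialization failure
    except RuntimeError:
        # Nothing has been committed anywhere yet, so every participant,
        # including the already-prepared remotes, can still be rolled back.
        for p in [local, *foreign]:
            p.rollback()
        return False
    for f in foreign:
        f.commit()        # second phase: commit the prepared transactions
    return True
```

With a local participant built as Participant("local", fail_on="commit"), the function returns False and all participants end up "aborted", which is the case discussed above. A failure in the last loop is deliberately not handled; that is the case the rest of this mail is about.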

> > > > > to commit the local transaction without preparation, the local
> > > > > transaction must be committed at last. But since the above sequence
> > > > > doesn’t follow this protocol, we will have such problems. I think if
> > > > > we follow the 2pc properly, such basic failures don't happen.
> > > >
> > > > True. But I haven't suggested that sequence.
> > >
> > > Okay, I might have missed your point. Could you elaborate on the idea
> > > you mentioned before, "I think remote-commits should be performed
> > > before local commit passes the point-of-no-return"?
> >
> > It is simply the condition that we can ERROR-out from
> > CommitTransaction. I thought that when you say like "we cannot
> > ERROR-out" you meant "since that is raised to FATAL", but it seems to
> > me that both of you are looking at another aspect.
> >
> > If the aspect is "what to do to complete the all-prepared 2pc transaction
> > at all costs", I'd say "there's a fundamental limitation".  Although
> > I'm not sure what you mean exactly by prohibiting errors from fdw
> > routines, if that meant "the API can fail, but must not raise an
> > exception", that policy is enforced by setting a critical
> > section. However, if it were "the API mustn't fail", that cannot be
> > realized, I believe.
> 
> When I say "we cannot error-out" it means it's too late. What I'd like
> to prevent is that the backend process returns an error to the client
> after committing the local transaction. Because it will mislead the
> user.

Anyway, we don't do anything that can fail after changing the state to
TRANS_COMMIT. So we cannot run fdw-2pc-commit after that point, since it
cannot be made failure-proof. If we do them before that point, we cannot
ERROR-out after the local commit completes.
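
A rough Python sketch of what I mean by "the point", under stated assumptions: commit_transaction, PastPointOfNoReturn and failing_fdw_commit are made-up names for illustration, and only the state names mirror the ones in xact.c. A step that fails before the state change can be reported as a plain ERROR with the transaction aborted; the same failure after the state change can no longer be reported that way.

```python
class PastPointOfNoReturn(Exception):
    """Stand-in for the escalation that replaces a plain ERROR."""


def commit_transaction(before, after):
    """Run fallible steps before the state change, the rest after it."""
    state = "TRANS_INPROGRESS"
    try:
        for step in before:
            step()
    except Exception as exc:
        # Still before the point of no return: abort and report an error.
        return "TRANS_ABORT", f"ERROR: {exc}"

    state = "TRANS_COMMIT"  # point of no return
    for step in after:
        try:
            step()
        except Exception as exc:
            # Too late for a plain ERROR: the local commit already stands.
            raise PastPointOfNoReturn(str(exc)) from exc
    return state, "COMMIT"


def failing_fdw_commit():
    """Hypothetical FDW commit that fails, e.g. on a cancel request."""
    raise RuntimeError("canceling statement due to user request")
```

Calling commit_transaction([failing_fdw_commit], []) returns an abort with the error message, while commit_transaction([], [failing_fdw_commit]) raises PastPointOfNoReturn. The sketch says nothing about atomicity across servers; it only shows where an error can still be turned into an abort.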

> > > > I thought that we are discussing on fdw-errors during the 2pc-commit
> > > > phase.
> > > >
> > >
> > > Yes, I'm also discussing on fdw-errors during the 2pc-commit phase
> > > that happens after committing the local transaction.
> > >
> > > Even if FDW-commit raises an error due to the user's cancel request or
> > > whatever reason during committing the prepared foreign transactions,
> > > it's too late. The client will get an error like "ERROR:  canceling
> > > statement due to user request" and would think the transaction is
> > > aborted but it's not true, the local transaction is already committed.
> >
> > By the way, I found that I misread the patch. In v26-0002,
> > AtEOXact_FdwXact() is actually called after the
> > point-of-no-return. What is the reason for that placement?  We can
> > error-out before changing the state to TRANS_COMMIT.
> >
> 
> Are you referring to
> v26-0002-Introduce-transaction-manager-for-foreign-transa.patch? If
> so, the patch doesn't implement 2pc. I think we can commit the foreign

Ah, I guessed that the trigger points of PREPARE and COMMIT that are
inserted by 0002 won't be moved by the following patches. So that
fact doesn't change the direction of my discussion.

> transaction before changing the state to TRANS_COMMIT but in any case
> it cannot ensure atomic commit. It just adds both commit and rollback

I guess that you have the local-commit-failure case in mind? Couldn't
we internally prepare the local transaction and then follow the correct
2pc protocol involving the local transaction? (I'm looking at v26-0008)

> transaction APIs so that FDW can control transactions by using these
> API, not by XactCallback.

> > And if any of the remotes ended with 2pc-commit (not prepare phase)
> > failure, consistency of the commit is no longer guaranteed so we have
> > no choice other than shutting down the server, or continuing to run
> > while allowing the inconsistency.  What do we want in that case?
> 
> I think it depends on the failure. If 2pc-commit failed due to network
> connection failure or the server crash, we would need to try again
> later. We normally expect the prepared transaction is able to be
> committed with no issue but in case it could not, I think we can leave
> the choice for the user: resolve it manually after recovered, give up
> etc.

Understood.
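
For the "try again later, or leave it to the user" part, a minimal Python sketch, assuming a hypothetical resolve_prepared helper; none of these names come from the patch set. The commit of a prepared foreign transaction is retried while the failure looks like a connection problem, and after a bounded number of attempts the decision is handed back to the caller. A real resolver would wait between attempts; the sketch omits that.

```python
def resolve_prepared(commit_prepared, max_attempts, on_give_up):
    """Retry a prepared-transaction commit; hand over after max_attempts.

    commit_prepared: callable that commits the prepared transaction and
                     raises ConnectionError when the remote is unreachable.
    on_give_up:      callable invoked when all attempts have failed, e.g.
                     to record the transaction for manual resolution.
    Returns the attempt number that succeeded, or None if it gave up.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            commit_prepared()
            return attempt
        except ConnectionError:
            continue  # remote still unreachable, try again
    on_give_up()
    return None
```

The transaction stays prepared on the remote side the whole time, so giving up here does not lose it; it only defers the resolution.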

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
