Re: 2-phase commit

From: Bruce Momjian
Subject: Re: 2-phase commit
Msg-id: 200309261720.h8QHKhq10420@candle.pha.pa.us
In response to: Re: 2-phase commit  ("Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at>)
Responses: Re: 2-phase commit  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

Zeugswetter Andreas SB SD wrote:
> 
> > > From our previous discussion of 2-phase commit, there was concern that
> > > the failure modes of 2-phase commit were not solvable.  However, I think
> > > multi-master replication is going to have similar non-solvable failure
> > > modes, yet people still want multi-master replication.
> > 
> > No.  The real problem with 2PC in my mind is that its failure modes
> > occur *after* you have promised commit to one or more parties.  In
> > multi-master, if you fail you know it before you have told the client
> > his data is committed.
> 
> Hmm ? The appl cannot take the first phase commit as its commit info. It 
> needs to wait for the second phase commit. The second phase is only finished
> when all coservers have reported back. 2PC is synchronous.
> 
> The problems with 2PC are when after second phase commit was sent to all
> servers and before all report back one of them becomes unreachable/down ...
> (did it receive and do the 2nd commit or not) Such a transaction must stay
> open until the coserver is reachable again or an administrator committed/aborted it. 
> 
> It is multi master replication that usually has an asynchronous mode for
> performance, and there the trouble starts.

Let me diagram this so we can see the issues.  Normal operation is:

    Master                  Slave
    ------                  -----
    commit ready-->
                            <--OK
    commit done--->
                            <--OK
    completed
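
The normal exchange diagrammed above can be sketched as a small Python model. This is only an illustration under assumptions: the Slave class, its method names, and master_commit() are hypothetical and are not PostgreSQL APIs; the message strings are taken from the diagram.

```python
# Hypothetical model of the normal two-phase exchange; the class and
# function names are illustrative only, not PostgreSQL APIs.

class Slave:
    def __init__(self):
        self.state = "active"

    def on_commit_ready(self):
        # Phase 1: promise to commit, then wait for the master's decision.
        self.state = "prepared"
        return "OK"

    def on_commit_done(self):
        # Phase 2: make the commit final.
        self.state = "committed"
        return "OK"


def master_commit(slave):
    """Run both phases against one slave and return the message trace."""
    trace = []
    trace.append(("master", "commit ready"))
    trace.append(("slave", slave.on_commit_ready()))
    trace.append(("master", "commit done"))
    trace.append(("slave", slave.on_commit_done()))
    trace.append(("master", "completed"))
    return trace


slave = Slave()
for sender, message in master_commit(slave):
    print(f"{sender:>6}: {message}")
```

Each failure case discussed in this message corresponds to one side disappearing between two adjacent entries of that trace.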

One possible failure is:

    Master                  Slave
    ------                  -----
    commit ready-->
                            <--OK
    commit done--->
                            dies here
    stuck waiting

Another possible failure is:

    Master                  Slave
    ------                  -----
    commit ready-->
                            <--OK
    dies here
                            stuck waiting

Are these the issues?  Can't we just add GUC timeouts to cause the
commit to fail, and the slave to stop waiting?  I suppose a problem is:

    Master                  Slave
    ------                  -----
    commit ready-->
                            <--OK
    sleep
                            stuck waiting, times out
    commit done
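
To make the timeout problem concrete, here is a rough simulation using a logical clock instead of real sleeping. It is a sketch under assumptions: run(), master_delay, and slave_timeout are hypothetical names, and slave_timeout merely stands in for the proposed GUC; no such setting is implied to exist.

```python
# Hypothetical simulation with logical ticks (no real sleeping).
# slave_timeout stands in for the proposed GUC; it is not a real setting.

def run(master_delay, slave_timeout):
    """Return (master_outcome, slave_outcome) for one transaction.

    master_delay:  ticks the master sleeps between receiving OK and
                   sending "commit done".
    slave_timeout: ticks the slave waits after answering OK before it
                   gives up and aborts on its own.
    """
    # Phase 1 finished at tick 0: the slave answered OK and is waiting.
    if master_delay < slave_timeout:
        # "commit done" arrives while the slave is still waiting.
        return ("committed", "committed")
    # The slave timed out first and aborted; the master wakes up later
    # and commits anyway, so the two sides now disagree.
    return ("committed", "aborted")


print(run(master_delay=1, slave_timeout=5))   # ('committed', 'committed')
print(run(master_delay=10, slave_timeout=5))  # ('committed', 'aborted')
```

The second call is the case in the diagram: no timeout value is safe, because a master that is merely slow looks the same to the slave as one that has died.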

Could we allow slaves to check whether the master backend is still alive,
perhaps by asking the postmaster, similar to what we do with the cancel
signal?  That way, the slave would never time out and would always wait
as long as the master was alive.
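
A minimal sketch of that liveness-check idea, again with logical ticks. Everything here is an assumption for illustration: slave_wait() and its parameters are hypothetical, and the "is the master alive" test is modelled as a simple comparison rather than an actual query to the postmaster. What the slave should do once it learns the master is dead is left open, so the sketch only reports that it stopped waiting.

```python
# Hypothetical sketch: instead of a fixed timeout, the slave periodically
# checks whether the master is still alive and keeps waiting if it is.

def slave_wait(commit_done_at=None, master_dies_at=None,
               check_interval=5, max_ticks=100):
    """Return (outcome, tick) for a slave that has already answered OK.

    commit_done_at: tick at which the master sends "commit done", if ever.
    master_dies_at: tick at which the master dies, if ever.
    max_ticks:      only bounds the simulation; it is not a timeout.
    """
    for tick in range(0, max_ticks + 1, check_interval):
        # "commit done" is only sent if the master was alive at that tick.
        done_sent = (
            commit_done_at is not None
            and tick >= commit_done_at
            and (master_dies_at is None or commit_done_at < master_dies_at)
        )
        if done_sent:
            return ("committed", tick)
        # Stand-in for asking the postmaster whether the backend exists.
        master_dead = master_dies_at is not None and tick >= master_dies_at
        if master_dead:
            return ("stopped waiting", tick)
    return ("still waiting", max_ticks)


print(slave_wait(commit_done_at=42))   # slow master: slave keeps waiting
print(slave_wait(master_dies_at=12))   # dead master: slave stops waiting
```

Unlike the fixed-timeout case, a master that sleeps for a long time but stays alive still ends with both sides committed.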

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 

