Re: Transactions involving multiple postgres foreign servers, take 2

Поиск
Список
Период
Сортировка
От Ashutosh Bapat
Тема Re: Transactions involving multiple postgres foreign servers, take 2
Дата
Msg-id CAExHW5u_8DcnYApXXLRovpcG33B3k=BcP0oLs+5FDRDNyn2fBQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Transactions involving multiple postgres foreign servers, take 2  (Bruce Momjian <bruce@momjian.us>)
Список pgsql-hackers
On Thu, Jun 18, 2020 at 6:49 PM Bruce Momjian <bruce@momjian.us> wrote:
>
> On Thu, Jun 18, 2020 at 04:09:56PM +0530, Amit Kapila wrote:
> > You are right and we are not going to claim that after this feature is
> > committed.  This feature has independent use cases like it can allow
> > parallel copy when foreign tables are involved once we have parallel
> > copy and surely there will be more.  I think it is clear that we need
> > atomic visibility (some way to ensure global consistency) to avoid the
> > data inconsistency problems you and I are worried about and we can do
> > that as a separate patch but at this stage, it would be good if we can
> > have some high-level design of that as well so that if we need some
> > adjustments in the design/implementation of this patch then we can do
> > it now.  I think there is some discussion on the other threads (like
> > [1]) about the kind of stuff we are worried about which I need to
> > follow up on to study the impact.
> >
> > Having said that, I don't think that is a reason to stop reviewing or
> > working on this patch.
>
> I think our first step is to allow sharding to work on read-only
> databases, e.g. data warehousing.  Read/write will require global
> snapshots.  It is true that 2PC is limited usefulness without global
> snapshots, because, by definition, systems using 2PC are read-write
> systems.   However, I can see cases where you are loading data into a
> data warehouse but want 2PC so the systems remain consistent even if
> there is a crash during loading.
>

For sharding, just implementing 2PC without global consistency
provides limited functionality. But for general purpose federated
databases 2PC serves an important functionality - atomic visibility.
When PostgreSQL is used as one of the coordinators in a heterogeneous
federated database system, it's not expected to have global
consistency or even atomic visibility. But it needs a guarantee that
once a transaction commit, all its legs are committed. 2PC provides
that guarantee as long as the other databases keep their promise that
prepared transactions will always get committed when requested so.
Subtle to this is HA requirement from these databases as well. So the
functionality provided by this patch is important outside the sharding
case as well.

As you said, even for a data warehousing application, there is some
write in the form of loading/merging data. If that write happens
across multiple servers, we need  atomic commit to be guaranteed. Some
of these applications can work even if global consistency and atomic
visibility is guaranteed eventually.

-- 
Best Wishes,
Ashutosh Bapat



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: snowball ASCII stemmer configuration
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: doing something about the broken dynloader.h symlink