Re: Global snapshots

From: Masahiko Sawada
Subject: Re: Global snapshots
Msg-id: CA+fd4k5d9Jjt2zwCFwm9FmFF1m6J1_ixaU_X7noM9qkDWCbAzA@mail.gmail.com
In response to: Re: Global snapshots  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses: Re: Global snapshots  (Fujii Masao <masao.fujii@oss.nttdata.com>)
List: pgsql-hackers
On Thu, 15 Oct 2020 at 01:41, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
>
>
> On 2020/09/17 15:56, Amit Kapila wrote:
> > On Thu, Sep 10, 2020 at 4:20 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> >>
> >>>> One alternative is to add only hooks into PostgreSQL core so that we can
> >>>> implement the global transaction management outside. This idea was
> >>>> discussed before under the title "eXtensible Transaction Manager API".
> >>>
> >>> Yeah, I read that discussion.  And I remember Robert Haas and Postgres Pro people said it's not good...
> >>
> >> But it may be worth revisiting this idea if we cannot avoid the patent issue.
> >>
> >
> > It is not very clear what exactly we can do about the point raised by
> > Tsunakawa-San related to patent in this technology as I haven't seen
> > that discussed during other development but maybe we can try to study
> > a bit. One more thing I would like to bring up here is that there seem to
> > have been some concerns about this idea when it was originally
> > discussed [1]. It is not very clear to me if all the concerns are
> > addressed or not. If one can summarize the concerns discussed and how
> > the latest patch is able to address those then it will be great.
>
> I have one concern about Clock-SI (sorry if this concern was already
> discussed in the past). As far as I read the paper about Clock-SI, ISTM that
> Tx2 that starts after Tx1's commit can fail to see the results by Tx1,
> due to clock skew. Please see the following example:
>
> 1. Tx1 starts at the server A.
>
> 2. Tx1 writes some records at the server A.
>
> 3. Tx1 gets the local clock 20, uses 20 as CommitTime, then completes
>       the commit at the server A.
>       This means that Tx1 is a local transaction, not a distributed one.
>
> 4. Tx2 starts at the server B, i.e., the server B works as
>       the coordinator node for Tx2.
>
> 5. Tx2 gets the local clock 10 (i.e., it's delayed behind the server A
>       due to clock skew) and uses 10 as SnapshotTime at the server B.
>
> 6. Tx2 starts the remote transaction at the server A with SnapshotTime 10.
>
> 7. Tx2 doesn't need to wait due to clock skew because the imported
>       SnapshotTime 10 is smaller than the local clock at the server A.
>
> 8. Tx2 fails to see the records written by Tx1 at the server A because
>       Tx1's CommitTime 20 is larger than SnapshotTime 10.
>
> So Tx1 was successfully committed before Tx2 started. But, in the above example,
> the subsequent transaction Tx2 fails to see the committed results.
>
> A single PostgreSQL instance seems to guarantee linearizability of
> transactions, but Clock-SI doesn't in a distributed environment. Is my
> understanding right? Or am I missing something?
>
> If my understanding is right, shouldn't we address that issue when using
> Clock-SI? Or has the patch already addressed the issue?

As far as I can tell from the paper, the above scenario can happen, and I
could reproduce it with the patch. Moreover, a stale read can happen
even if Tx1 is initiated at server B (i.e., both transactions are
started at the same server in sequence). In this case, Tx1's commit
timestamp would be 20, taken from server A's local clock, whereas
Tx2's snapshot timestamp would be 10, the same as in the case above.
Therefore, even though both transactions are initiated at the same
server, linearizability is not provided.
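
To make the timestamps concrete, here is a minimal sketch of the scenario
as a toy model in Python. It is not code from the patch: the Server class,
run_scenario(), and the fixed integer clocks are all hypothetical names
and simplifications chosen for illustration. The only rules it models are
the ones in the example: the commit timestamp comes from the local clock
of the server where the write commits, the snapshot timestamp comes from
the coordinator's clock, a remote read waits only when the imported
snapshot is ahead of the local clock, and a row is visible only if its
commit timestamp is not larger than the snapshot timestamp.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Toy node with its own physical clock (hypothetical model, not patch code)."""
    name: str
    clock: int
    versions: list = field(default_factory=list)  # (commit_time, value) pairs

    def commit_local(self, value):
        # Steps 1-3: the commit timestamp is taken from this server's local clock.
        commit_time = self.clock
        self.versions.append((commit_time, value))
        return commit_time

    def read(self, snapshot_time):
        # Step 7: wait only if the imported snapshot is ahead of the local clock.
        waited = max(0, snapshot_time - self.clock)
        self.clock += waited  # model the wait by advancing the clock
        # Step 8: a version is visible only if commit_time <= snapshot_time.
        visible = [v for (ct, v) in self.versions if ct <= snapshot_time]
        return visible, waited


def run_scenario(clock_a, clock_b):
    """Tx1 commits at A, then Tx2 starts at B and reads from A."""
    a = Server("A", clock_a)
    b = Server("B", clock_b)
    commit_time = a.commit_local("row written by Tx1")
    snapshot_time = b.clock  # steps 4-5: Tx2 takes its snapshot from B's clock
    visible, waited = a.read(snapshot_time)  # step 6: remote read at A
    return commit_time, snapshot_time, visible, waited


if __name__ == "__main__":
    # B's clock lags A's: Tx2 does not wait and misses Tx1's row.
    print(run_scenario(clock_a=20, clock_b=10))
    # B's clock is ahead of A's: Tx2 waits, then sees Tx1's row.
    print(run_scenario(clock_a=20, clock_b=30))
```

With clocks 20 and 10, Tx2 does not wait and sees no rows although Tx1
has already committed. With B's clock ahead of A's, the read waits and
the row becomes visible, so in this model only the skew direction where
the coordinator lags behind produces the stale read. The variant where
Tx1 is initiated at server B gives the same numbers, because the commit
timestamp is still taken from server A's clock.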

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


