Обсуждение: commit timestamps and replication

Поиск
Список
Период
Сортировка

commit timestamps and replication

От
Andreas Brandl
Дата:
Hi,

we're discussing a system design and it boils down to a question
regarding commit timestamps (in the sense of [1],
track_commit_timestamp='on'):

We have a insert-only (append-only) table. Do commit timestamps on
this table constitute the same order in which records become visible
on a secondary (streaming replication)? Is there any reason why this
might not be the case?

To put this differently: If a client reads from a secondary and reads
the "latest" record by commit timestamp (commit time T). Is it safe to
assume that there won't ever be another record with a lower commit
timestamp < T, that only shows up on the secondary after that read?

I'm aware of concerns regarding physical time, time adjustments and so
on, so the question here assumes those things never happen.

Thank you!

Best regards,
Andreas

[1] https://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.5#Commit_timestamp_tracking


Re: commit timestamps and replication

От
Andres Freund
Дата:
Hi,

On 2018-09-14 21:01:54 +0200, Andreas Brandl wrote:
> we're discussing a system design and it boils down to a question
> regarding commit timestamps (in the sense of [1],
> track_commit_timestamp='on'):
> 
> We have a insert-only (append-only) table. Do commit timestamps on
> this table constitute the same order in which records become visible
> on a secondary (streaming replication)? Is there any reason why this
> might not be the case?
> 
> To put this differently: If a client reads from a secondary and reads
> the "latest" record by commit timestamp (commit time T). Is it safe to
> assume that there won't ever be another record with a lower commit
> timestamp < T, that only shows up on the secondary after that read?

No, that's not safe to assume.  The order of visibility on the standby
is determined by the point the WAL record is inserted into the log.  The
commit timestamp has to *previously* be determined to be part of the WAL
log:

        SetCurrentTransactionStopTimestamp();

        XactLogCommitRecord(xactStopTimestamp,
                            nchildren, children, nrels, rels,
                            nmsgs, invalMessages,
                            RelcacheInitFileInval, forceSyncCommit,
                            MyXactFlags,
                            InvalidTransactionId, NULL /* plain commit */ );

Those two things happen sequentially, *WITHOUT* a lock preventing
concurrent transactions to do the same.


> I'm aware of concerns regarding physical time, time adjustments and so
> on, so the question here assumes those things never happen.

Note that in addition to that you cannot be certain that concurrent
transactions *in the same postgres cluster* have perfectly coherent
timestamp - clock drift between CPU sockets does still sometimes occur,
and was extremely common.

Greetings,

Andres Freund