Proposal: Commit timestamp

Поиск
Список
Период
Сортировка
От Jan Wieck
Тема Proposal: Commit timestamp
Дата
Msg-id 45B93A35.8040704@Yahoo.com
обсуждение исходный текст
Ответы Re: Proposal: Commit timestamp  (Neil Conway <neilc@samurai.com>)
Re: Proposal: Commit timestamp  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Proposal: Commit timestamp  (Richard Troy <rtroy@ScienceTools.com>)
Re: Proposal: Commit timestamp  (Naz Gassiep <naz@mira.net>)
Re: Proposal: Commit timestamp  ("Simon Riggs" <simon@2ndquadrant.com>)
Re: Proposal: Commit timestamp  (Jim Nasby <decibel@decibel.org>)
Список pgsql-hackers
For a future multimaster replication system, I will need a couple of 
features in the PostgreSQL server itself. I will submit separate 
proposals per feature so that discussions can be kept focused on one 
feature per thread.

For conflict resolution purposes in an asynchronous multimaster system, 
the "last update" definition often comes into play. For this to work, 
the system must provide a monotonically increasing timestamp taken at 
the commit of a transaction. During replication, the replication process 
must be able to provide the remote nodes timestamp so that the 
replicated data will be "as of the time it was written on the remote 
node", and not the current local time of the replica, which is by 
definition of "asynchronous" later.

To provide this data, I would like to add another "log" directory, 
pg_tslog. The files in this directory will be similar to the clog, but 
contain arrays of timestamptz values. On commit, the current system time 
will be taken. As long as this time is lower or equal to the last taken 
time in this PostgreSQL instance, the value will be increased by one 
microsecond. The resulting time will be added to the commit WAL record 
and written into the pg_tslog file.

If a per database configurable tslog_priority is given, the timestamp 
will be truncated to milliseconds and the increment logic is done on 
milliseconds. The priority is added to the timestamp. This guarantees 
that no two timestamps for commits will ever be exactly identical, even 
across different servers.

The COMMIT syntax will get extended to
    COMMIT [TRANSACTION] [WITH TIMESTAMP <timestamptz>];

The extension is limited to superusers and will override the normally 
generated commit timestamp. This will be used to give the replicating 
transaction on the replica the exact same timestamp it got on the 
originating master node.

The pg_tslog segments will be purged like the clog segments, after all 
transactions belonging to them have been stamped frozen. A frozen xid by 
definition has a timestamp of epoch. To ensure a system using this 
timestamp feature has enough time to perform its work, a new GUC 
variable defining an interval will prevent vacuum from freezing xid's 
that are younger than that.

A function get_commit_timestamp(xid) returning timpstamptz will return 
the commit time of a transaction as recorded by this feature.


Comments, changes, additions?

Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: DROP FUNCTION failure: cache lookup failed for
Следующее
От: Jan Wieck
Дата:
Сообщение: Proposal: Change of pg_trigger.tg_enabled and adding pg_rewrite.ev_enabled