Re: Proposal: Commit timestamp

From Jan Wieck
Subject Re: Proposal: Commit timestamp
Date
Msg-id 45B994A3.8090804@Yahoo.com
In response to Re: Proposal: Commit timestamp  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Proposal: Commit timestamp  (Heikki Linnakangas <heikki@enterprisedb.com>)
List pgsql-hackers
On 1/25/2007 11:41 PM, Bruce Momjian wrote:
> Jan Wieck wrote:
>> On 1/25/2007 6:49 PM, Tom Lane wrote:
>> > Jan Wieck <JanWieck@Yahoo.com> writes:
>> >> To provide this data, I would like to add another "log" directory, 
>> >> pg_tslog. The files in this directory will be similar to the clog, but 
>> >> contain arrays of timestamptz values.
>> > 
>> > Why should everybody be made to pay this overhead?
>> 
>> It could be made an initdb time option. If you intend to use a product 
>> that requires this feature, you will be willing to pay that price.
> 
> That is going to cut your usage by like 80%.  There must be a better
> way.

I'd love to find one.

But it is a datum that needs to be collected at basically the moment 
the clog entry is made ... I don't think any external module can ever 
do that.

You know how long I've been in and out and back into replication again. 
The one thing that pops up again and again in all the scenarios is "what 
the heck was the commit order?". Now the pure commit order for a single 
node could certainly be recorded from a sequence, but that doesn't cover 
the multi-node environment I am after. That's why I want it to be a 
timestamp with a few fudged bits at the end. If you look at what I've 
described, you will notice that as long as all node priorities are 
unique, this timestamp will be a globally unique ID in a somewhat 
ascending order along a timeline. That is what replication people are 
looking for.
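
To make the "timestamp with a few fudged bits" idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the proposed implementation: the 8-bit width, the microsecond clock, and the names CommitStamper and next_stamp are all hypothetical. The low bits of the clock value are overwritten with the node priority, and a stamp that would not advance is bumped by one priority-sized step, so stamps stay unique across nodes with distinct priorities and ascend on each node.

```python
import time

# Assumption: 8 low bits are reserved for the node priority.
PRIORITY_BITS = 8
PRIORITY_MASK = (1 << PRIORITY_BITS) - 1


class CommitStamper:
    """Hypothetical per-node generator of commit stamps."""

    def __init__(self, node_priority, clock=None):
        if not 0 <= node_priority <= PRIORITY_MASK:
            raise ValueError("node priority does not fit in the reserved bits")
        self.node_priority = node_priority
        # Assumption: the clock returns integer microseconds since the epoch.
        self.clock = clock or (lambda: time.time_ns() // 1000)
        self.last = 0

    def next_stamp(self):
        # Clear the low bits of the clock value, then put the priority there.
        stamp = (self.clock() & ~PRIORITY_MASK) | self.node_priority
        if stamp <= self.last:
            # Clock did not advance (or went backwards): step forward by one
            # unit above the priority bits, which leaves the priority intact.
            stamp = self.last + (1 << PRIORITY_BITS)
        self.last = stamp
        return stamp


node_a = CommitStamper(node_priority=1)
node_b = CommitStamper(node_priority=2)
stamps = [node_a.next_stamp(), node_b.next_stamp(), node_a.next_stamp()]
```

Two nodes can never produce the same stamp because their low bits differ, and the order across nodes is only "somewhat ascending" because it depends on how well the clocks agree.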

Tom fears that the overhead is significant, which I do understand and, 
frankly, wonder about myself (actually, I don't even have a vague 
estimate). I really think we should make this thing an initdb option and 
decide later if it's on or off by default. We can probably even 
implement it in a way that lets one turn it on or off, so that a 
postmaster restart plus waiting out the desired freeze delay would do.

What I know for certain is that no async replication system can ever do 
without the commit timestamp information. Using the transaction start 
time or even the single statement's timeofday will only lead to 
inconsistencies all over the place (I haven't been absent from the 
mailing lists for the past couple of months hiding in my closet ... I've 
been experimenting and trying to get around all these issues - in my 
closet). Slony-I can survive without that information because everything 
happens on one node and we record snapshot information for later abuse. 
But look at the cost at which we are dealing with this rather trivial 
issue. All we need to know is the serializable commit order. And we have 
to issue queries that eventually might exceed address space limits?
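
A small sketch of why start time is not a substitute for commit time, again in Python and purely illustrative (the Txn record, the replay helper, and the timings are made up for the example). T1 starts first but writes the row and commits after T2, so the origin ends up with T1's value; replaying in start order leaves the other node with T2's value instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    name: str
    start: int      # hypothetical transaction start time
    commit: int     # hypothetical commit time
    new_value: str  # value this transaction writes to the same row


def replay(txns, key):
    """Apply the writes in the order given by key; return the final value."""
    value = None
    for txn in sorted(txns, key=key):
        value = txn.new_value
    return value


# T1 begins first but commits last, so on the origin T1's write wins.
txns = [
    Txn("T1", start=1, commit=10, new_value="from T1"),
    Txn("T2", start=2, commit=5, new_value="from T2"),
]

by_commit = replay(txns, key=lambda t: t.commit)  # matches the origin
by_start = replay(txns, key=lambda t: t.start)    # diverges from the origin
```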


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

