Re: Temporal extensions
От | Jim Nasby |
---|---|
Тема | Re: Temporal extensions |
Дата | |
Msg-id | 553EF53E.2070007@BlueTreble.com обсуждение исходный текст |
Ответ на | Re: Temporal extensions (Dave Jones <dave@waveform.org.uk>) |
Ответы |
Re: Temporal extensions
(Dave Jones <dave@waveform.org.uk>)
|
Список | pgsql-hackers |
On 4/27/15 6:08 PM, Dave Jones wrote: >> (Though, I dislike using timestamps to do change/history tracking, but >> >that's just me...) > I've been playing around with history tracking (in the context of BI, > typically with batch loaded reporting databases) for about 7-8 years now > and always found timestamps perfect for the purpose, but are you perhaps > referring to using it for audit purposes? If that's the case I'd agree > entirely - this is absolutely the wrong tool for such things (which is > something I need to put a bit more prominently in the docs - it's buried > in the design section at the moment). Most warehouses dumb things down to a day level, so it's probably OK there. What I specifically don't like is that using a timestamp to try and determine the order in which something happened is just fraught with gotchas. For starters, now() is locked in when you do a BEGIN, but maybe a newer transaction modifies a table before an older one does. Now the ordering is *backwards*. You have the same problem with using an XID. The only way I've thought of to make this guaranteed safe is to somehow serialize the logging with something like CREATE TABLE customer_history( customer_hid serial primary key -- hid == history_id , previous_customer_hid int referencescustomer_history , customer_id int NOT NULL references customer ... ); CREATE UNIQUE INDEX ... ON customer_history(previous_customer_hid) WHERE previous_customer_hid IS NOT NULL; CREATE UNIQUE INDEX ... ON customer_history(customer_hid) WHERE previous_customer_hid IS NULL; and then have a before trigger enforce NEW.previous_customer_hid := customer_history__get_latest(customer_id) where customer_history__get_latest() will 'walk the chain' starting with the first link customer_id = blah AND previous_customer_id = NULL Because of the indexes that will serialize inserts on a per-customer basis. You could still run into problems with a newer snapshot creating a history record before a transaction with an older snapshot does though. :( Though, if you included txid_current_snapshot() with each record you could probably detect when that happens. > Or did you mean ranges would be better? They certainly looked intriguing > when I started moving this stuff to postgres, and I'd like to re-visit > them in the near future as they offer capabilities I don't have with > timestamps (such as guaranteeing no overlapping ranges via exclusion > constraints) but my initial tests suggested some rather major > performance degradation so I put it on the back-burner at first. If you're going to keep both a start and end for each record you'd definitely want to do it with a range. If you're only keeping the change time then you can handle it differently. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Sawada MasahikoДата:
Сообщение: Re: Proposal: knowing detail of config files via SQL