Re: historical log of data records

From: Sanjay Minni
Subject: Re: historical log of data records
Date:
Msg-id: CAMpxBonpFbrnbXVV_9kEc3Wy4hYbRZoHTqgPyNA5C3a9JvsxRA@mail.gmail.com
In reply to: Re: historical log of data records (Alban Hertroys <haramrae@gmail.com>)
List: pgsql-general
Alban,

It's a simple financial transaction processing application. The application permits editing, updating and deleting of entered data, even multiple times, but an audit trail tracing the data through all versions back to the original must be preserved.
(As outlined, programmatically I could approach it either by keeping a parallel set of tables and copying each row being replaced into the parallel table set, or by keeping all record versions in a single table with a flag to indicate the final / current version.)
I am looking for whether there are better ways to do it.
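A minimal sketch of the first approach (parallel history table populated by a trigger), using a hypothetical `accounts` table; the table and column names are illustrative, not from the original thread:

```sql
-- Hypothetical current table and a parallel history table.
CREATE TABLE accounts (
    id         bigint PRIMARY KEY,
    balance    numeric NOT NULL,
    updated_at timestamptz NOT NULL DEFAULT now()
);

-- Same columns, plus the moment this version was superseded.
CREATE TABLE accounts_history (
    LIKE accounts,
    valid_until timestamptz NOT NULL DEFAULT now()
);

-- Copy the old row version into the history table on UPDATE or DELETE.
CREATE FUNCTION accounts_audit() RETURNS trigger AS $$
BEGIN
    INSERT INTO accounts_history SELECT OLD.*, now();
    RETURN COALESCE(NEW, OLD);  -- NEW is NULL on DELETE
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER accounts_audit_trg
    BEFORE UPDATE OR DELETE ON accounts
    FOR EACH ROW EXECUTE FUNCTION accounts_audit();
```

With this in place the application only ever touches `accounts`; every superseded version accumulates in `accounts_history` automatically. (`EXECUTE FUNCTION` requires PostgreSQL 11+; older versions use `EXECUTE PROCEDURE`.)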


with warm regards
Sanjay Minni
+91-9900-902902



On Tue, 16 Nov 2021 at 15:57, Alban Hertroys <haramrae@gmail.com> wrote:

> On 16 Nov 2021, at 10:20, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> On Tue, 2021-11-16 at 13:56 +0530, Sanjay Minni wrote:
>> I need to keep a copy of old data as the rows are changed.
>>
>> For a general RDBMS I could think of keeping all the data in the same table with a flag
>> to indicate older copies of updated /  deleted rows or keep a parallel table and copy
>> these rows into the parallel data under program / trigger control. Each has its pros and cons.
>>
>> In Postgres would i have to follow the same methods or are there any features / packages available ?
>
> Yes, I would use one of these methods.
>
> The only feature I can think of that may help is partitioning: if you have one partition
> for the current data and one for the deleted data, then updating the flag would
> automatically move the row between partitions, so you don't need a trigger.
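The partitioning idea above could be sketched as follows (hypothetical table and column names; moving rows between partitions via UPDATE requires PostgreSQL 11 or later):

```sql
-- Hypothetical example: list-partition on a current/deleted flag.
CREATE TABLE transactions (
    id      bigint  NOT NULL,
    amount  numeric NOT NULL,
    deleted boolean NOT NULL DEFAULT false
) PARTITION BY LIST (deleted);

CREATE TABLE transactions_current PARTITION OF transactions FOR VALUES IN (false);
CREATE TABLE transactions_deleted PARTITION OF transactions FOR VALUES IN (true);

-- "Deleting" a row is just flipping the flag; PostgreSQL moves the row
-- from transactions_current to transactions_deleted automatically.
UPDATE transactions SET deleted = true WHERE id = 42;
```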

Are you building (something like) a data-vault? If so, keep in mind that you will have a row for every update, not just a single deleted row. Enriching the data can be really useful in such cases.

For a data-vault at a previous employer, we determined how to treat new rows by comparing a (md5) hash of the new and old rows, adding the hash and a validity interval to the stored rows. Historic data went to a separate table for each respective current table.

The current tables “inherited” the PK’s from the tables on the source systems (this was a data-warehouse DB). Obviously that same PK can not be applied to the historic tables where there _will_ be duplicates, although they should be at non-overlapping validity intervals.
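The hash-comparison step described above might look something like this sketch (all names are illustrative; it assumes incoming rows have been loaded into a `staging` table):

```sql
-- Hypothetical current and history tables keyed by the source-system PK,
-- carrying a row hash and a validity interval.
CREATE TABLE customer_current (
    id         bigint PRIMARY KEY,          -- PK "inherited" from the source
    name       text NOT NULL,
    row_hash   text NOT NULL,               -- md5 of the payload columns
    valid_from timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE customer_history (
    id          bigint NOT NULL,            -- duplicates allowed here
    name        text NOT NULL,
    row_hash    text NOT NULL,
    valid_from  timestamptz NOT NULL,
    valid_until timestamptz NOT NULL        -- closes the validity interval
);

-- On load: rows whose hash differs from the stored one are new versions,
-- so the superseded version is closed out into the history table.
INSERT INTO customer_history
SELECT c.id, c.name, c.row_hash, c.valid_from, now()
FROM customer_current c
JOIN staging s ON s.id = c.id
WHERE md5(s.name) IS DISTINCT FROM c.row_hash;
```

`IS DISTINCT FROM` rather than `<>` keeps the comparison NULL-safe; a real load would hash all payload columns with a separator, not just one.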

Alternatively, since this is time-series data, it would probably be a good idea to store that in a way optimised for that. TimescaleDB comes to mind, or arrays as per Pavel’s suggestion at https://stackoverflow.com/questions/68440130/time-series-data-on-postgresql.

Regards,

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.
