Обсуждение: get a list of table modifications in a day?

Поиск
Список
Период
Сортировка

get a list of table modifications in a day?

От
Ottavio Campana
Дата:
I need to generate a diff (or something similar) of a table, day by day.
What is the best way to tack insert/update/delete operations? I have two
ideas, and I'd like to hear your opinion:

1) pg_dump each day and run diff
2) modify some triggers we use and store the information in another table

I am not aware of any functionality offered by postgresql. Does it exists?

If not, which solution would you prefer?


Вложения

Re: get a list of table modifications in a day?

От
hubert depesz lubaczewski
Дата:
On Thu, Sep 13, 2007 at 09:59:30AM +0200, Ottavio Campana wrote:
> 1) pg_dump each day and run diff

it will become increasingly painful as the table size increases.

> 2) modify some triggers we use and store the information in another table

this is the best choice. you can use table_log extension to avoid
writing your own triggers.

http://ads.wars-nicht.de/blog/archives/100-Log-Table-Changes-in-PostgreSQL-with-tablelog.html

depesz

--
quicksil1er: "postgres is excellent, but like any DB it requires a
highly paid DBA.  here's my CV!" :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)

Re: get a list of table modifications in a day?

От
"Asko Oja"
Дата:
Hi

PgQ can be used this purpose. Idea is to have triggers on table that push events into queue and then on that queue you can do whatever suits you best. As we don't want to keep these logs online PgQ is most conenient as it efficiently removes them as soon as they are handled.

PgQ - table_dispatcher.py
Has url encoded events as data source and writes them into table on target database.
Used to partiton data. For example change log's that need to kept online only shortly can be written to daily tables and then dropped as they become irrelevant.
Also allows to select which columns have to be written into target database
Creates target tables according to configuration file as needed

PgQ - cube_dispatcher.py
Has url encoded events as data source and writes them into partitoned tables in target database. Logutriga is used to create events.
Used to provide batches of data for business intelligence and data cubes.
Only one instance of each record is stored. For example if record is created and then updated twice only latest version of record stays in that days table.
Does not support deletes (not that it is hard to support just we have no need for it).

PgQ - queue_archiver.py
Writes queue contents into file. Used for backing up queue contents for safety.

regards,
Asko

On 9/13/07, Ottavio Campana <ottavio@campana.vi.it> wrote:
I need to generate a diff (or something similar) of a table, day by day.
What is the best way to tack insert/update/delete operations? I have two
ideas, and I'd like to hear your opinion:

1) pg_dump each day and run diff
2) modify some triggers we use and store the information in another table

I am not aware of any functionality offered by postgresql. Does it exists?

If not, which solution would you prefer?



Re: get a list of table modifications in a day?

От
Gregory Stark
Дата:
"Ottavio Campana" <ottavio@campana.vi.it> writes:

> I need to generate a diff (or something similar) of a table, day by day.
> What is the best way to tack insert/update/delete operations? I have two
> ideas, and I'd like to hear your opinion:
>
> 1) pg_dump each day and run diff

You can't use pg_dump directly as the rows are unordered. An update will
remove the old row in one place and put the new row possibly in a completely
different place. Some operations like CLUSTER or VACUUM FULL could move around
rows which doesn't matter to SQL but would show up in a diff.

You would have to COPY to a file a query which includes an ORDER BY.

> 2) modify some triggers we use and store the information in another table
>
> I am not aware of any functionality offered by postgresql. Does it exists?

Well alternatively you could do the same as 1) but do it in SQL. Something
like

CREATE TABLE copy_table AS (SELECT * FROM original_table);
... wait a day
SELECT * FROM copy_table EXCEPT SELECT * FROM original_table

It's not going to be fast though. Probably the triggers are the best option
really. They give you more information than a diff in that they tell you when
the change occurred, what user made the change, and if multiple changes to the
same record occurred you get a record of each.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: get a list of table modifications in a day?

От
Ottavio Campana
Дата:
hubert depesz lubaczewski ha scritto:
> On Thu, Sep 13, 2007 at 09:59:30AM +0200, Ottavio Campana wrote:
>> 1) pg_dump each day and run diff
>
> it will become increasingly painful as the table size increases.
>
>> 2) modify some triggers we use and store the information in another table
>
> this is the best choice. you can use table_log extension to avoid
> writing your own triggers.
>
> http://ads.wars-nicht.de/blog/archives/100-Log-Table-Changes-in-PostgreSQL-with-tablelog.html

since I already use triggers on that table, can I use table_log?

I mean, can I have two triggers for the same event on the same table?



Вложения

Re: get a list of table modifications in a day?

От
"A. Kretschmer"
Дата:
am  Thu, dem 13.09.2007, um 10:44:41 +0200 mailte Ottavio Campana folgendes:
> > http://ads.wars-nicht.de/blog/archives/100-Log-Table-Changes-in-PostgreSQL-with-tablelog.html
>
> since I already use triggers on that table, can I use table_log?
>
> I mean, can I have two triggers for the same event on the same table?

Yes.

Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net