Re: modification time & transaction synchronisation problem

Поиск
Список
Период
Сортировка
От Craig Ringer
Тема Re: modification time & transaction synchronisation problem
Дата
Msg-id 4BCC197B.9010202@postnewspapers.com.au
обсуждение исходный текст
Ответ на modification time & transaction synchronisation problem  (Ostrovsky Eugene <e79ene@yandex.ru>)
Ответы Re: modification time & transaction synchronisation problem  (Craig Ringer <craig@postnewspapers.com.au>)
Список pgsql-general
Ostrovsky Eugene wrote:
>  Hi.
> I need to export data from the database to external file. The difficulty
> is that only data modified or added since previous export should be
> written to the file.
> I consider adding "modification_time" timestamp field to all the tables
> that should be exported. Then I can set this field to now() within ON
> UPDATE OR INSERT trigger.
> During export I can select modified data with 'WHERE modification_time >
> last_export_time' clause.
>
> It seems to be the solution but...
> What if the concurrent (and not yet committed) transaction modified some
> data before export transaction begins? These modifications would not be
> visible to export transaction and modified data would not be included to
> export file. Also it won't be included to the next export because it's
> modification time is less than current export start time (the new value
> of last_export_time).
>
> Thus some data could be lost from export files sequence. And that is not
> good at all.

About the only solid solution I can think of right now is to LOCK TABLE
the table you want to dump. You can use a lockmode that permits SELECT,
but just blocks UPDATE/INSERT/DELETE from other threads. That way your
modification time approach works.

(I strongly suggest leaving the modification time field without an
index, so that HOT can do in-place replacement of the rows and you avoid
a whole lot of bloat).

There might be another possible approach that uses the system
"xmin/xmax" fields of each tuple. That'd permit your incremental dumps
to be done read-only, saving you a whole lot of expensive I/O and bloat.
I'm just not sure what I'm thinking of will work yet. I'll check back in
once I've had a play and written a test script or two.

( If it will, then surely there'd be "pg_dump --incremental-as-of" by
now ...)

--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

В списке pgsql-general по дате отправления:

Предыдущее
От: Andre Lopes
Дата:
Сообщение: How to call a file from a PlPgsql trigger?
Следующее
От: Pavel Stehule
Дата:
Сообщение: Re: How to call a file from a PlPgsql trigger?