Discussion: modification time & transaction synchronisation problem

modification time & transaction synchronisation problem

From:
Ostrovsky Eugene
Date:


Hi.
I need to export data from the database to an external file. The difficulty is
that only data modified or added since the previous export should be written to
the file.
I am considering adding a "modification_time" timestamp field to all the
tables that should be exported. Then I can set this field to now() within an ON
UPDATE OR INSERT trigger.
During export I can select modified data with a 'WHERE modification_time >
last_export_time' clause.
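
Roughly what I have in mind (untested sketch; "my_table" and the trigger and
column names are just examples):

-- Add the timestamp column and keep it up to date from a trigger.
ALTER TABLE my_table
    ADD COLUMN modification_time timestamptz NOT NULL DEFAULT now();

CREATE OR REPLACE FUNCTION set_modification_time() RETURNS trigger AS $$
BEGIN
    NEW.modification_time := now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER my_table_set_modification_time
    BEFORE INSERT OR UPDATE ON my_table
    FOR EACH ROW EXECUTE PROCEDURE set_modification_time();

-- During export (:'last_export_time' is a psql variable holding the
-- previous export's start time):
SELECT * FROM my_table
WHERE modification_time > :'last_export_time';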

It seems to be the solution but...
What if a concurrent (and not yet committed) transaction modified some data
before the export transaction begins? These modifications would not be visible
to the export transaction, and the modified data would not be included in the
export file. It also won't be included in the next export, because its
modification time is less than the current export start time (the new value of
last_export_time).

Thus some data could be lost from the sequence of export files. And that is not
good at all.

I would appreciate any suggestions on how to solve this problem, i.e. how can I
(within the export transaction) select all the data that was updated since the
last export?

Thanks.
Eugene.

Re: modification time & transaction synchronisation problem

From:
Craig Ringer
Date:
Ostrovsky Eugene wrote:
> Hi.
> I need to export data from the database to an external file. The
> difficulty is that only data modified or added since the previous export
> should be written to the file.
> I am considering adding a "modification_time" timestamp field to all the
> tables that should be exported. Then I can set this field to now() within
> an ON UPDATE OR INSERT trigger.
> During export I can select modified data with a 'WHERE modification_time >
> last_export_time' clause.
>
> It seems to be the solution but...
> What if a concurrent (and not yet committed) transaction modified some
> data before the export transaction begins? These modifications would not
> be visible to the export transaction, and the modified data would not be
> included in the export file. It also won't be included in the next export,
> because its modification time is less than the current export start time
> (the new value of last_export_time).
>
> Thus some data could be lost from the sequence of export files. And that
> is not good at all.

About the only solid solution I can think of right now is to LOCK TABLE
the table you want to dump. You can use a lock mode that permits SELECT,
but blocks UPDATE/INSERT/DELETE from other sessions. That way your
modification time approach works.
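
For example, something like this (untested; "my_table" is a placeholder, and
:'last_export_time' is a psql variable holding the previous export time):

BEGIN;
-- SHARE mode lets concurrent SELECTs through but blocks concurrent
-- INSERT/UPDATE/DELETE until this transaction commits.
LOCK TABLE my_table IN SHARE MODE;

SELECT * FROM my_table
WHERE modification_time > :'last_export_time';
COMMIT;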

(I strongly suggest leaving the modification time field without an
index, so that HOT can do in-place replacement of the rows and you avoid
a whole lot of bloat).

There might be another possible approach that uses the system
"xmin/xmax" fields of each tuple. That'd permit your incremental dumps
to be done read-only, saving you a whole lot of expensive I/O and bloat.
I'm just not sure what I'm thinking of will work yet. I'll check back in
once I've had a play and written a test script or two.

( If it will, then surely there'd be "pg_dump --incremental-as-of" by
now ...)

--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

Re: modification time & transaction synchronisation problem

From:
Craig Ringer
Date:
On 19/04/2010 4:51 PM, Craig Ringer wrote:

> There might be another possible approach that uses the system
> "xmin/xmax" fields of each tuple. That'd permit your incremental dumps
> to be done read-only, saving you a whole lot of expensive I/O and bloat.
> I'm just not sure what I'm thinking of will work yet.

Yeah. You can use a SERIALIZABLE transaction and
txid_current_snapshot(), which almost magically solves your problem. In
a single call it provides all the details you need about active and
committed transactions at the time of snapshot creation. It even gives
you the list of transaction IDs that were still in progress between the
snapshot's xmin and xmax, so old uncommitted transactions don't force
you to repeatedly dump data. That's all you need to know to do
intelligent incremental backup of a table.

I haven't written the actual test code, but what you should need to do
(according to my probably flawed understanding) is:

- Begin a read only SERIALIZABLE transaction
- Record txid_current_snapshot(), which you will need
   for the next run. We'll call the value of the last
   run's txid_current_snapshot() call 'txprev'.
- SELECT all rows that have:
   tablename.xmin > txid_snapshot_xmax(txprev) OR
   tablename.xmin IN (txid_snapshot_xip(txprev))

( The above doesn't consider deletion. Deletion is never a fun
   thing to handle in incremental backups/dumps. I'm not presently
   sure how it should be handled or if it *can* be handled without
   help from VACUUM and/or an ON DELETE trigger ).
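
In SQL terms, a rough, untested sketch of the dump query ("my_table" is a
placeholder; :'txprev' is a psql variable holding the previous run's
txid_current_snapshot() output; xid epoch/wraparound is ignored):

BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE, READ ONLY;

-- Save this value somewhere; it becomes txprev for the next run.
SELECT txid_current_snapshot();

-- Rows whose inserting transaction committed after the previous snapshot's
-- xmax, plus rows inserted by transactions that were still in progress when
-- that snapshot was taken.
SELECT *
FROM my_table
WHERE my_table.xmin::text::bigint >
          txid_snapshot_xmax(:'txprev'::txid_snapshot)
   OR my_table.xmin::text::bigint IN
          (SELECT txid_snapshot_xip(:'txprev'::txid_snapshot));

COMMIT;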

If I get a chance, I'll play with this and see if it works in practice.

See:

   http://www.postgresql.org/docs/current/interactive/functions-info.html
   Table 9-52. Transaction IDs and snapshots


http://www.postgresql.org/docs/current/interactive/storage-page-layout.html

   \df pg_catalog.tx*


--
Craig Ringer

Re: modification time & transaction synchronisation problem

From:
Craig Ringer
Date:
Craig Ringer wrote:
> On 19/04/2010 4:51 PM, Craig Ringer wrote:
>
>> There might be another possible approach that uses the system
>> "xmin/xmax" fields of each tuple. That'd permit your incremental dumps
>> to be done read-only, saving you a whole lot of expensive I/O and bloat.
>> I'm just not sure what I'm thinking of will work yet.
>
> Yeah. You can use a SERIALIZABLE transaction and
> txid_current_snapshot(), which almost magically solves your problem. In
> a single call it provides all the details you need about active and
> committed transactions at the time of snapshot creation. It even gives
> you the list of transaction IDs that were still in progress between the
> snapshot's xmin and xmax, so old uncommitted transactions don't force
> you to repeatedly dump data. That's all you need to know to do
> intelligent incremental backup of a table.
>
> I haven't written the actual test code, but what you should need to do
> (according to my probably flawed understanding) is:
>
> - Begin a read only SERIALIZABLE transaction

Actually, it looks like READ COMMITTED is preferable, as it gives you
information about any concurrently running transactions in
txid_current_snapshot(), and you can do the actual dump in one statement
anyway.

> - Record txid_current_snapshot(), which you will need
>   for the next run. We'll call the value of the last
>   run's txid_current_snapshot() call 'txprev'.
> - SELECT all rows that have:
>   tablename.xmin > txid_snapshot_xmax(txprev) OR
>   tablename.xmin IN (txid_snapshot_xip(txprev))

I've tested this approach and it appears to work fine, *but* only for
append-only tables.

I don't seem to be smart enough to figure out how to correctly handle
UPDATEs and DELETEs - I know the xmax field (the ID of the transaction
that deleted or obsoleted the row version) is key, but I can't figure out
a way to effectively use it to record deletions. I guess I don't
understand MVCC or at least
Pg's implementation of it even as well as I thought I did (which isn't
very).

Unless someone smarter steps in or I have the time to learn more of the
details about this, I probably can't offer a pre-formed solution to your
problem.

You can always use a trigger-maintained change history table to track
inserts/updates/deletes, and DELETE FROM ... RETURNING it. That's simple
and easy, but generates plenty of extra I/O to do your progressive
backup/copy.
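
Something along these lines (untested sketch; table and column names are
placeholders, and it assumes the source table has an integer "id" key):

CREATE TABLE my_table_changes (
    change_id  bigserial PRIMARY KEY,
    row_id     integer NOT NULL,
    op         char(1) NOT NULL,               -- 'I', 'U' or 'D'
    changed_at timestamptz NOT NULL DEFAULT now()
);

CREATE OR REPLACE FUNCTION my_table_log_change() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO my_table_changes (row_id, op) VALUES (OLD.id, 'D');
        RETURN OLD;
    ELSE
        INSERT INTO my_table_changes (row_id, op)
        VALUES (NEW.id, substr(TG_OP, 1, 1));
        RETURN NEW;
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_my_table_log_change
    AFTER INSERT OR UPDATE OR DELETE ON my_table
    FOR EACH ROW EXECUTE PROCEDURE my_table_log_change();

-- Each export run consumes (and clears) the accumulated change log:
DELETE FROM my_table_changes RETURNING row_id, op, changed_at;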

I'm going to stop talking to myself now.


--
Craig Ringer