I'm just wondering if there's a more efficient way of handling a certain periodic data migration.
We have a pair of tables with this structure:

    table_a__live
        column_1          INT
        column_2          INT
        record_timestamp  TIMESTAMP

    table_a__archive
        column_1          INT
        column_2          INT
        record_timestamp  TIMESTAMP
Periodically, we must migrate 'stale' entries from `table_a__live` to `table_a__archive`. The entries are copied over to the archive, then deleted.
Staleness is calculated based on age, so we need to use an INTERVAL. The "live" table can have anywhere from 100k to 20MM records.
The primary key on `table_a__live` is a composite of `column_1` and `column_2`.
To minimize scanning the table, we opted to flag rows for migration with a dedicated column:
ALTER TABLE table_a__live ADD is_migrate BOOLEAN DEFAULT NULL;
CREATE INDEX idx_table_a__live_migrate ON table_a__live(is_migrate) WHERE is_migrate IS NOT NULL;
Our migration is then based on that `is_migrate` column:
BEGIN;

UPDATE table_a__live
SET is_migrate = TRUE
WHERE record_timestamp < transaction_timestamp() AT TIME ZONE 'UTC' - INTERVAL '1 month';

INSERT INTO table_a__archive (column_1, column_2, record_timestamp)
SELECT column_1, column_2, record_timestamp
FROM table_a__live
WHERE is_migrate IS TRUE;

DELETE FROM table_a__live
WHERE is_migrate IS TRUE;

COMMIT;
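To confirm where the time is going, the plans for each step can be inspected with something like the following (shown here for the DELETE; the same applies to the UPDATE and INSERT):

```sql
-- Check whether the partial index idx_table_a__live_migrate is actually used,
-- and how many buffers each step touches.
EXPLAIN (ANALYZE, BUFFERS)
DELETE FROM table_a__live
WHERE is_migrate IS TRUE;
```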
The inserts and deletes are blazing fast, but the UPDATE is slow because Postgres rewrites every flagged row (MVCC means each update produces a new row version).

Can anyone suggest a better approach?
I considered copying everything to a temp table and then inserting/deleting based on that table, but there's a lot of disk I/O in that approach too.
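The closest I've come to avoiding the flag column altogether is a single data-modifying CTE (supported on 9.6), which moves rows in one statement. I haven't benchmarked it at the 20MM-row end, so treat it as a sketch; it would presumably also want an index on `record_timestamp` to avoid a full scan:

```sql
-- Move stale rows in one statement: the DELETE's RETURNING clause feeds the
-- INSERT, so no is_migrate UPDATE (and no extra row rewrite) is needed.
WITH moved AS (
    DELETE FROM table_a__live
    WHERE record_timestamp <
          transaction_timestamp() AT TIME ZONE 'UTC' - INTERVAL '1 month'
    RETURNING column_1, column_2, record_timestamp
)
INSERT INTO table_a__archive (column_1, column_2, record_timestamp)
SELECT column_1, column_2, record_timestamp
FROM moved;
```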
FWIW, we're on Postgres 9.6.1.