Making table reloading easier

From: Craig Ringer
Subject: Making table reloading easier
Msg-id: CAMsr+YHX7kXZdaOiSihrJ5FXGu-syXmPw6bbOXABt9pnbUPP8g@mail.gmail.com
Responses: Re: Making table reloading easier  (Corey Huinker <corey.huinker@gmail.com>)
List: pgsql-hackers

Hi all

A very common operation that users perform is reloading tables:
sometimes as part of an ETL process, sometimes as part of a dump and
reload, sometimes when loading data from external DBs, etc.

Right now users have to jump through a bunch of hoops to do this efficiently:

BEGIN;

TRUNCATE TABLE my_table;

-- Save the index definitions so they can be re-created after the load
SELECT pg_get_indexdef(indexrelid::regclass)
FROM pg_index WHERE indrelid = 'my_table'::regclass;

-- Drop 'em all
DROP INDEX ... ;

COPY my_table FROM 'file';

-- Re-create indexes
CREATE INDEX ...;

COMMIT;


This is pretty clunky.
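
For illustration, the manual steps above can be scripted with what exists today. This is only a rough sketch under some assumptions: my_table and the file path are placeholders, saved_indexes is a made-up temp table name, and indexes that back a constraint (primary key, unique, or the target of a foreign key) are skipped because DROP INDEX refuses to drop them. A server-side COPY also needs the appropriate file-read privileges.

```sql
BEGIN;

TRUNCATE TABLE my_table;

-- Remember the definitions of the plain indexes on my_table.
-- Indexes referenced by pg_constraint.conindid are left alone,
-- since they cannot be removed with DROP INDEX.
CREATE TEMP TABLE saved_indexes ON COMMIT DROP AS
SELECT i.indexrelid::regclass::text  AS index_name,
       pg_get_indexdef(i.indexrelid) AS index_def
FROM pg_index i
WHERE i.indrelid = 'my_table'::regclass
  AND NOT EXISTS (SELECT 1
                  FROM pg_constraint c
                  WHERE c.conindid = i.indexrelid);

-- Drop 'em all. The regclass-to-text cast already yields a quoted,
-- schema-qualified name where that is needed.
DO $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT index_name FROM saved_indexes LOOP
        EXECUTE 'DROP INDEX ' || r.index_name;
    END LOOP;
END
$$;

COPY my_table FROM '/path/to/file';

-- Re-create the indexes from the saved CREATE INDEX statements.
DO $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT index_def FROM saved_indexes LOOP
        EXECUTE r.index_def;
    END LOOP;
END
$$;

COMMIT;
```

It works, but it is a lot of ceremony for what is conceptually "load this table without maintaining indexes along the way", and the constraint-backed indexes still get maintained row by row during the COPY.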

We already have support for disabling indexes; it's just not exposed
to the user. So the simplest option would seem to be to expose it with
something like:
 ALTER TABLE my_table DISABLE INDEX ALL;

which would take an ACCESS EXCLUSIVE lock then set:
 indisready = 'f', indislive = 'f', indisvalid = 'f'

on each index, or the named index if the user specifies one particular index.
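
For reference, these flags are ordinary columns in pg_index and can already be inspected, even though nothing user-facing sets them in this way. A minimal read-only query, with my_table as a placeholder name:

```sql
-- Show the current state flags of every index on my_table.
SELECT indexrelid::regclass AS index_name,
       indisvalid,
       indisready,
       indislive
FROM pg_index
WHERE indrelid = 'my_table'::regclass;
```

On a healthy table all three columns should read 't' for every index.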

After loading the table, a REINDEX on the table would rebuild and
re-enable the indexes.

That changes the process to:

BEGIN;
TRUNCATE TABLE my_table;
ALTER TABLE my_table DISABLE INDEX ALL;
COPY ...;
REINDEX TABLE my_table;
COMMIT;

It'd be even better to also add a REINDEX flag to COPY, so that COPY
itself disables the indexes and rebuilds them once it finishes. But
that could be done separately.

Thoughts?

I'm not sure I can tackle this in the current dev cycle, but it looks
simple enough that I can't help wondering what obvious thing I'm
missing about why it hasn't been done yet.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


