Re: Massive table (500M rows) update nightmare

From: Carlo Stonebanks
Subject: Re: Massive table (500M rows) update nightmare
Date:
Msg-id: hi577c$1lc2$1@news.hub.org
In reply to: Re: Massive table (500M rows) update nightmare  (Leo Mannhart <leo.mannhart@beecom.ch>)
List: pgsql-performance
> If it is possible to lock this audit table exclusively (may be during
> off peak hours) I would look into
> - create new_audit_table as select col1, col2, col3 ... col9,
> 'new_col_value' from old_audit_table;
> - create all indexes
> - drop old_audit_table
> - rename new_audit_table to old_audit_table
>
> That is probably the fastest method you can use, even if you have to join
> the "new_col_value" from an extra helper table with the corresponding id.
> Remember, databases are born to join.
>

This has all been done before - the production team was crippled while they
waited for this, and the SECOND the table was available again they jumped on
it - even though it meant recreating only the bare minimum of indexes.
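For reference, a minimal sketch of the swap approach quoted above, assuming a
hypothetical helper table new_values(id, new_col_value) keyed by the audit
table's id (table, column, and index names here are placeholders):

    -- Build the replacement table in a single sequential pass; the join
    -- pulls in the new column value instead of updating 500M rows in place.
    CREATE TABLE new_audit_table AS
    SELECT a.col1, a.col2, a.col3, /* ... */ a.col9, v.new_col_value
    FROM old_audit_table a
    JOIN new_values v ON v.id = a.id;

    -- Recreate the indexes (one shown as an example), then swap the tables.
    CREATE INDEX new_audit_table_id_idx ON new_audit_table (id);
    DROP TABLE old_audit_table;
    ALTER TABLE new_audit_table RENAME TO old_audit_table;

The point of the technique is that a sequential rebuild plus a rename is far
cheaper than half a billion in-place updates - but as noted, the table is
unavailable for the whole rebuild.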

> You could also try to just update the whole table in one go, it is
> probably faster than you expect.

Possibly, but with such a large table you have no idea of the progress, and
you cannot interrupt it without rolling back everything. Worse, you have
applications stalling and users wanting to know what is going on: are the OS
and the DB/MVCC thrashing while it does internal maintenance? Have you
reached some sort of deadlock condition that you can't see, because the
server status is not helpful with so many uncommitted pending updates?

And of course, there is the file bloat: under MVCC, updating every row leaves
a dead copy of each one behind.
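If the in-place update is attempted anyway, batching it by key range at least
makes progress observable and limits what an interruption throws away to the
current batch. A sketch, assuming a hypothetical integer primary key id and a
new column new_col:

    -- Each statement runs in its own transaction (autocommit), so every
    -- finished range is committed and progress can be tracked by range.
    UPDATE old_audit_table SET new_col = 'new_col_value'
     WHERE id >= 0 AND id < 1000000;

    UPDATE old_audit_table SET new_col = 'new_col_value'
     WHERE id >= 1000000 AND id < 2000000;

    -- ...and so on, driven by a script that records the last range done.

This does not eliminate the bloat - each batch still leaves dead row versions
behind - but vacuuming between batches lets later batches reuse that space
instead of growing the files further.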

