Re: Massive table bloat

Поиск
Список
Период
Сортировка
От Steve Crawford
Тема Re: Massive table bloat
Дата
Msg-id 50C78B06.9000606@pinpointresearch.com
обсуждение исходный текст
Ответ на Massive table bloat  (Michael Sawyers <msawyers@iii.com>)
Ответы Re: Massive table bloat  (Michael Sawyers <msawyers@iii.com>)
Список pgsql-admin
On 12/11/2012 08:11 AM, Michael Sawyers wrote:
> Our dba quit last week leaving me with an interesting problem.
>
> We have a table currently using 33gb worth of space for only 152mb worth of
> data because of bad processes or autovacuum not being aggressive enough. I
> was able to confirm the size difference by doing a create table as select
> along with a second test of restoring the table from the dump file to a dev
> machine.
>
>
>
> There is a very large list of foreign key relationships that I'm not
> including for the sake of brevity.
>
> The database version is 8.4.1
>
> The old DBA had said that vacuum full would take days to complete, and we
> don't have that much of a window.  So I was considering using the  to force
> a full table rewrite.  In testing on a dev machine it only took about five
> minutes.
>
> I do not have as much hands on experience with postgres so I wanted to get
> thoughts on what is considered the proper way to deal with this kind of
> situation.
>
> Any comments would be welcome.
It's true that vacuum-full has historically been very slow as it
rewrites data in-place but you can use CLUSTER which is *WAY* faster
since it rewrites the data to a new file which it then drops back in
place. CLUSTER also gives you new clean indexes. In fact the current
versions of PostgreSQL basically use the CLUSTER process to handle
VACUUM FULL. (I just don't recall which version applied that change.)

How long did your test copy/dump/restore steps take? This will give you
a *rough* idea of the time that CLUSTER will take.

Really the only downside of CLUSTER is the requirement that you have
enough available space to write the new copy of the table/indexes.

Another option if you can't tolerate downtime is pg_reorg
(http://pgfoundry.org/projects/reorg/) or its fork pg_repack
(https://github.com/reorg/pg_repack) which essentially do the copy to a
new table in the background then briefly lock the tables to apply any
recent changes from the master before dropping the copy into place.

Cheers,
Steve



В списке pgsql-admin по дате отправления:

Предыдущее
От: Craig James
Дата:
Сообщение: Re: Massive table bloat
Следующее
От: Michael Sawyers
Дата:
Сообщение: Re: Massive table bloat