On Wed, Oct 13, 2010 at 4:38 AM, Neil Whelchel <neil.whelchel@gmail.com> wrote:
> There seems to be allot of discussion about VACUUM FULL, and its problems. The
> overall buzz seems to be that VACUUM FULL is a bad idea (I could be wrong
> here). It has been some time since I have read the changelogs, but I seem to
> remember that there have been some major changes to VACUUM FULL recently.
> Maybe this needs to be re-visited in the documentation.
In 9.0, VACUUM FULL does something similar to what CLUSTER does. This
is a much better idea than what it did in 8.4 and prior.
> crash:~# time psql -U test test -c "VACUUM FULL log;"
> VACUUM
>
> real 4m49.055s
> user 0m0.000s
> sys 0m0.000s
>
> crash:~# time psql -U test test -c "SELECT count(*) FROM log;"
> count
> ----------
> 10050886
> (1 row)
>
> real 0m9.665s
> user 0m0.000s
> sys 0m0.004s
>
> A huge improvement from the minute and a half before the VACUUM FULL.
This is a very surprising result that I would like to understand
better. Let's assume that your UPDATE statement bloated the table by
2x (you could use pg_relation_size to find out exactly; the details
probably depend on fillfactor which you might want to lower if you're
going to do lots of updates). That ought to mean that count(*) has to
grovel through twice as much data, so instead of taking 9 seconds it
ought to take 18 seconds. Where the heck is the other 1:12 going?
This might sort of make sense if the original table was laid out
sequentially on disk and the updated table was not, but how and why
would that happen?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company