Thread: performance question on VACUUM FULL (Postgres 8.4.2)


performance question on VACUUM FULL (Postgres 8.4.2)

From: PG User 2010
Date:
Hello,

We are running into some performance issues with running VACUUM FULL on the pg_largeobject table in Postgres (8.4.2 under Linux), and I'm wondering if anybody here might be able to suggest anything to help address the issue.  Specifically, when running VACUUM FULL on the pg_largeobject table, it appears that one of our CPUs is pegged at 100% (we have 8 on this particular box), and the I/O load on the machine is VERY light (10-20 I/O operations per second--nowhere near what our array is capable of).  Our pg_largeobject table is about 200 gigabytes, and I suspect that about 30-40% of the table are dead rows (after having run vacuumlo and deleting large numbers of large objects).  We've tuned vacuum_cost_delay to 0.
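Before reaching for VACUUM FULL, it can help to sanity-check the dead-row estimate and the session settings. A minimal sketch (the statistics-collector numbers are estimates, not exact counts, and assume stats collection is enabled):

```sql
-- Rough dead-row estimate for pg_largeobject (it is a system table,
-- so it shows up in pg_stat_sys_tables rather than pg_stat_user_tables):
SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_sys_tables
WHERE relname = 'pg_largeobject';

-- Confirm the cost-based vacuum delay really is disabled for this session:
SHOW vacuum_cost_delay;
```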

I have read that doing a CLUSTER might be faster and less intrusive, but trying that on the pg_largeobject table yields this: ERROR:  "pg_largeobject" is a system catalog

One other thing: is it possible to run VACUUM FULL for a while, interrupt it, then run it again later and have it pick up from where it left off?  If so, then we could just break up the VACUUM FULL into more manageable chunks and tackle it a few hours at a time when our users won't care.  I thought I read that some of the FSM changes in 8.4 would make this possible, but I'm not sure if that applies here.

If anybody has any info here, it would be greatly appreciated.   Thanks!

Sam

Re: performance question on VACUUM FULL (Postgres 8.4.2)

From: Jeff Davis
Date:
On Tue, 2010-01-19 at 12:19 -0800, PG User 2010 wrote:
> Hello,
>
> We are running into some performance issues with running VACUUM FULL
> on the pg_largeobject table in Postgres (8.4.2 under Linux), and I'm
> wondering if anybody here might be able to suggest anything to help
> address the issue.

Are you running VACUUM (without FULL) regularly? And if so, is that
insufficient?

> Our pg_largeobject table is about 200 gigabytes, and I suspect that
> about 30-40% of the table are dead rows (after having run vacuumlo and
> deleting large numbers of large objects).

You can always expect some degree of bloat. Can you give an exact number
before and after the VACUUM FULL? Or is this a one-shot attempt that
never finished?

If large objects are being added/removed regularly, it might be better
just to wait (and do regular VACUUMs), and the table will naturally
compact after the rows at the end are removed.
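The routine Jeff describes might look something like the sketch below: remove orphaned large objects first, then let a plain (non-FULL) VACUUM mark the space reusable so the table stops growing and can eventually shrink from the end. `mydb` is a placeholder database name; `vacuumlo` is the contrib utility mentioned earlier in the thread.

```shell
# Delete large objects no longer referenced by any table (verbose mode):
vacuumlo -v mydb

# Plain VACUUM: reclaims dead rows for reuse without the exclusive
# lock and rewrite cost of VACUUM FULL.
psql -d mydb -c 'VACUUM VERBOSE pg_largeobject;'
```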

Regards,
    Jeff Davis


Re: performance question on VACUUM FULL (Postgres 8.4.2)

From: PG User 2010
Date:
Hi Jeff,

> Are you running VACUUM (without FULL) regularly? And if so, is that
> insufficient?

Unfortunately, we have not run vacuumlo as often as we would like, and that has caused a lot of garbage blobs to get generated by our application.

> You can always expect some degree of bloat. Can you give an exact number
> before and after the VACUUM FULL? Or is this a one-shot attempt that
> never finished?
>
> If large objects are being added/removed regularly, it might be better
> just to wait (and do regular VACUUMs), and the table will naturally
> compact after the rows at the end are removed.

Our vacuum full is still running after several days, so I'm unsure when it will finish (it would be nice to be able to get a rough idea of % complete for vacuum full, but I don't know of any way to do that).  I estimate that there are probably several million dead blobs taking up ~ 80 gigabytes of space.

I believe that once we are running vacuumlo regularly, then normal vacuums will work fine and we won't have much of a wasted space issue.  However, right now we have LOTS of dead space and that is causing operational issues (primarily slower + larger backups, maybe some other slight performance issues).
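There is no real progress indicator for VACUUM FULL in 8.4, but one crude proxy is to watch the table's on-disk size periodically. A hedged sketch; note that the old-style VACUUM FULL only truncates the file toward the end of its run, so this mostly confirms whether any space has been released yet rather than giving a true percentage:

```sql
-- Total on-disk size of pg_largeobject, including its index and TOAST:
SELECT pg_size_pretty(pg_total_relation_size('pg_largeobject'));
```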

So, here are my questions (maybe I should post these to the -general or -admin mailing lists?):

1) is there any easy way to tune the vacuum process so that it is not CPU bound while doing so little I/O?  Why would VACUUM FULL be CPU bound anyway?

2) is it possible to interrupt VACUUM FULL, then re-start it later on and have it pick up where it was working before?

3) are there any alternatives, such as CLUSTER (which doesn't seem to be allowed since pg_largeobject is a system table), that would work?

Thanks so much!

Sam