Обсуждение: Very busy 24x7 databases and VACUUM

Поиск
Список
Период
Сортировка

Very busy 24x7 databases and VACUUM

От
"David F. Skoll"
Дата:
Hi,

Does anyone run a very busy PostgreSQL datatabase, with lots of read
and write operations that run 24x7?  (We're talking on the
neighbourhood of 40 to 60 queries/second, with probably 5% to 10% of
them being INSERT or UPDATE.)

Some of our clients run such a DB, and the nightly VACUUM slows things
down tremendously while it is running.  I see that in 8.0, you can
reduce the VACUUM's I/O impact, but from reading the code, it also
looks like that means the VACUUM will hold locks for longer, which is
probably bad news.

Doing VACUUM more often than nightly is not a good idea; the tables
tend to be pretty large and it looks like VACUUM has to scan all the
tuples each time.  (The nightly VACUUM is already taking several hours
in some cases.)

How do we handle this situation?  Are there any plans for some kind
of "incremental" vacuum that recovers a few pages here and there in the
background?  Is such a thing even possible?

If we defer some write operations until after the VACUUM has finished,
will that speed up the VACUUM?  There are some things we can save up until
after VACUUM is finished.

Regards,

David.

Re: Very busy 24x7 databases and VACUUM

От
Christopher Browne
Дата:
In the last exciting episode, dfs@roaringpenguin.com ("David F. Skoll") wrote:
> Does anyone run a very busy PostgreSQL datatabase, with lots of read
> and write operations that run 24x7?  (We're talking on the
> neighbourhood of 40 to 60 queries/second, with probably 5% to 10% of
> them being INSERT or UPDATE.)

Yup...  [Hand goes up...]

> Some of our clients run such a DB, and the nightly VACUUM slows
> things down tremendously while it is running.  I see that in 8.0,
> you can reduce the VACUUM's I/O impact, but from reading the code,
> it also looks like that means the VACUUM will hold locks for longer,
> which is probably bad news.

Yes, there's a trade-off there.  The "lazier" vacuum will indeed hold
its locks longer.

> Doing VACUUM more often than nightly is not a good idea; the tables
> tend to be pretty large and it looks like VACUUM has to scan all the
> tuples each time.  (The nightly VACUUM is already taking several
> hours in some cases.)

Yes, vacuum does need to scan all the tuples.  There's no shortcut at
this point.

> How do we handle this situation?  Are there any plans for some kind
> of "incremental" vacuum that recovers a few pages here and there in
> the background?  Is such a thing even possible?

There has been some talk of a "VACUUM CACHE" idea, where the idea
would be to walk through the shared buffer cache and vacuum just those
pages.  Recently updated pages ought to be in the cache, so we might
expect this to be reasonably fruitful, as well as being rather quick.

If that were to work out well, I would think it potentially fruitful
to have a perhaps longer list of "pages of interest" whereby
UPDATE/DELETE operations might throw pages that they touch into a
queue for later re-examination.  In a table that contains both "active
regions" and large, seldom-updated "inactive regions," it would be
nice to have a way to focus on the "active" bits.

> If we defer some write operations until after the VACUUM has
> finished, will that speed up the VACUUM?  There are some things we
> can save up until after VACUUM is finished.

One thing that would be somewhat helpful would be to be sure that each
of the tables that you are vacuuming is handled in a separate
transaction.

Supposing it's six tables that each take 1/2h to vacuum, if you do
each in a separate transaction, some locks may get established over
and over, but the "lease" will be dropped and renewed each half hour,
which ought to be helpful.  Any MVCC-relevant logic will just have
1/2h periods for which tuples are held onto instead of there being a
big 3h "lock" put on them.  That's not an explicit lock, but rather an
inability to purge entries out...
--
(format nil "~S@~S" "cbbrowne" "gmail.com")
http://www.ntlug.org/~cbbrowne/rdbms.html
If two people love each other, there can be no happy end to it.
-- Hemingway

Re: Very busy 24x7 databases and VACUUM

От
Andrew Sullivan
Дата:
On Sun, Dec 05, 2004 at 11:18:47PM -0500, Christopher Browne wrote:
> One thing that would be somewhat helpful would be to be sure that each
> of the tables that you are vacuuming is handled in a separate
> transaction.

What Chris said.  In particular, I've found that pg_autovacuum has
the ability really to kill performance if it starts at the wrong
moment.  You need something much more hand-crafted at the moment.
I'll bet, however, that vacuuming some tables more often than once a
day will still be a net win for you.

A
--
Andrew Sullivan  | ajs@crankycanuck.ca
This work was visionary and imaginative, and goes to show that visionary
and imaginative work need not end up well.
        --Dennis Ritchie