Обсуждение: is autovacuum recommended?

Поиск
Список
Период
Сортировка

is autovacuum recommended?

От
Willy-Bas Loos
Дата:
Hi,

Whenever i start a big action, like inserting millions of recs or doing a large update, the autovacuum fires on top of that.
It has some adverse effects on performance when i need it most. More than once a postgres service crashed on me because of it.
Sure, it had too little memory, but it wouldn't have happened if it wasn't for the autovacuum.

Should ik keep autovacuum on, or is it better to run a script like:
vacuumdb -az (daily except one day)
vacuumdb -azf (once a week)

My database is growing, so really freeing up space isn't such an issue. What I want is optimal performance.
Of course autovacuum is more generic, but for me i think the daily scheme will be sufficient.
A connection told me it would be better to enable the autovacuum, because it does more than the above script. Can anyone verify that?

Cheers,

WBL
--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw

Re: is autovacuum recommended?

От
Alvaro Herrera
Дата:
Willy-Bas Loos escribió:
> Hi,
>
> Whenever i start a big action, like inserting millions of recs or doing a
> large update, the autovacuum fires on top of that.
> It has some adverse effects on performance when i need it most. More than
> once a postgres service crashed on me because of it.
> Sure, it had too little memory, but it wouldn't have happened if it wasn't
> for the autovacuum.

1. the server should never crash.  If it does you have another problem
that perhaps is triggered by the high load.  Investigate that.

2. what PG version are you running?  What operating system?

3. You can turn autovacuum off for particular tables.  This would allow
you to have it process most tables, and manually vacuum only the
specific tables that are causing you grief.

> Should ik keep autovacuum on, or is it better to run a script like:
> vacuumdb -az (daily except one day)
> vacuumdb -azf (once a week)

4. VACUUM FULL is (almost) never needed unless your server is not
properly tuned.  If you find a need for it, it's a symptom that you need
to tweak something somewhere.  Need more details about the problem to
give you more advice.

2a. Upgrading to 8.4 may give you several benefits in this area.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: is autovacuum recommended?

От
Andres Freund
Дата:
Hi,

On Thursday 09 July 2009 19:25:15 Willy-Bas Loos wrote:
> Whenever i start a big action, like inserting millions of recs or doing a
> large update, the autovacuum fires on top of that.
You can configure autovacuum to use less resources.
http://www.postgresql.org/docs/current/static/runtime-config-autovacuum.html

> It has some adverse effects on performance when i need it most. More than
> once a postgres service crashed on me because of it.
> Sure, it had too little memory, but it wouldn't have happened if it wasn't
> for the autovacuum.
That sounds like misconfiguration. PG got OOM killed?

It is also possible that your max_fsm_pages were to small, so autovacuum hat
to start all over.

> My database is growing, so really freeing up space isn't such an issue.
> What I want is optimal performance.
Well, not doing vacuum regulary can lead to significant bloat which makes
lookups (and thus updates/deletes) way much slower.

> Of course autovacuum is more generic, but for me i think the daily scheme
> will be sufficient.
That may be. Hard to say without knowing more though. How fast are your tables
changing? Inserts only or also updates/deletes?

> A connection told me it would be better to enable the autovacuum, because
> it does more than the above script. Can anyone verify that?
I dont think it really does more.


> vacuumdb -azf (once a week)
Using -f is normally not a good idea. It takes an exclusive lock on the tables
- so you cannot do anything in those tables.

Which version of PG are you using?

Another possibility would be to upgrade to 8.4 - keeping track of free space
works quite different there.

Andres

Re: is autovacuum recommended?

От
Bill Moran
Дата:
In response to Willy-Bas Loos <willybas@gmail.com>:
>
> Whenever i start a big action, like inserting millions of recs or doing a
> large update, the autovacuum fires on top of that.
> It has some adverse effects on performance when i need it most. More than
> once a postgres service crashed on me because of it.
> Sure, it had too little memory, but it wouldn't have happened if it wasn't
> for the autovacuum.

If autovacuum is interfering with performance, you have two basic choices:
disable autovac or add hardware (assuming your system is already optimally
tuned).

> Should I keep autovacuum on, or is it better to run a script like:
> vacuumdb -az (daily except one day)
> vacuumdb -azf (once a week)

VACUUM FULL is seldom a good choice for a scheduled activity.  That being
said, there are some corner use cases where a regular VACUUM FULL is a
good idea.

Also, it's likely that a daily vacuum won't be enough.  If that were the
case, then autovac wouldn't be kicking off as often as it does.  Once you
start vacuuming multiple times per day, you're back to interfering with
performance again.  However, if you're doing it on a controlled schedule,
it's possible that you can schedule it in such a way that it's less
intrusive.

Overall, however, you're probably going to have to solve the problem by
adding hardware, or better tuning your system.

> My database is growing, so really freeing up space isn't such an issue. What
> I want is optimal performance.

If you _only_ do inserts, then vacuum isn't required at all.  If you do _ANY_
UPDATE or DELETE operations, then you still need vacuum or your filesystem
will fill up with data that's no longer used.  If autovacuum is kicking off,
then you need to do vacuums, as autovac isn't random, it uses statistics on
the usage of tables to determine if it's needed.

You can also manually vacuum as part of your application.  For example,
vacuum a table manually after a bunch of UPDATEs.  If you do this
intelligently, you can leave autovacuum enabled, and it will only pick
vacuum tables that aren't done manually.

> Of course autovacuum is more generic, but for me i think the daily scheme
> will be sufficient.

Unlikely.  Unless your database sees very few updates, daily vacuum probably
isn't often enough.  However, your usage pattern will dictate that.

> A connection told me it would be better to enable the autovacuum, because it
> does more than the above script. Can anyone verify that?

Autovacuum is smarter -- it won't vacuum tables that don't need it, whereas
the above script vacuums everything, even if it doesn't need it.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

Re: is autovacuum recommended?

От
Willy-Bas Loos
Дата:
Hi,

Thanks for your answers!
I'm using 8.1 and 8.2 on windows2003 servers, and it's true that i could probably configure them much better.
We've recently moved to brand new dedicated database servers with pg8.3 on debian in 2 projects and it has been much easier to configure these correctly. There I don't encounter the probems that i described.

The thing is that the whole concept of autovacuum is not feeling right.
Per design, the vacuum is likely to kick off when i am doing something big.
And when i am doing something big, a vacuum is the last thing i'd wish for. I'd wish for a vacuum when the database is doing nothing at all, but the autovacuum will NEVER kick off in such a moment.

That's why i feel better scheduling the vacuum at times at which i know things will be generally quiet.

To be honest, i am a bit surprised that all 3 reactions recommend using autovacuum, even if it means i have to buy a new server for this purpouse.
I was thinking that autovacuum was just a mechanism to ensure that postgres works well out of the box, but that it would be recommended to schedule your own vacuum tailored to your specific needs.
I agree though, that it is a tough tailoring job and that the autovacuum must be doing a better job than i am. It just fires at the wrong time.

Just a thought (to think positively..): wouldn't it be possible to let the autovacuum wait until the load goes down, or until the end of the transaction that triggered the autovacuum?

Cheers,

WBL

Re: is autovacuum recommended?

От
Scott Marlowe
Дата:
On Fri, Jul 10, 2009 at 2:47 PM, Willy-Bas Loos<willybas@gmail.com> wrote:
> Hi,
>
> Thanks for your answers!
> I'm using 8.1 and 8.2 on windows2003 servers, and it's true that i could
> probably configure them much better.

Note that support for 8.1 on windows is gone, as it is no longer
considered supportable due to design / engineering issues.  Upgrading
them all to 8.3 or 8.4 is probably a good idea since they are better
at autovacuum and such than 8.2 and before.

> We've recently moved to brand new dedicated database servers with pg8.3 on
> debian in 2 projects and it has been much easier to configure these
> correctly. There I don't encounter the probems that i described.

Smart move.  The amount of effort needed to learn debian or any other
linux distro is usually less than the amount of ongoing effort to keep
a production pg server happy on windows.  Also, 64 bit pgsql on 64 bit
unix/linux is capable of better scaling and handling more memory.

> The thing is that the whole concept of autovacuum is not feeling right.
> Per design, the vacuum is likely to kick off when i am doing something big.

That assumes that autovacuum always runs in some mode that must
interfere with db operation.  If you set the
autovacuum_vacuum_cost_delay, autovacuum_vacuum_cost_limit,
autovacuum_vacuum_scale_factor, autovacuum_vacuum_threshold parameters
properly, as well as the free space map settings large enough to hold
all your dead tuples, then autovacuum should not cause a lot of
issues, unless your machine is already IO bound.  And if it's already
IO bound and dragging butt, then the problem isn't autovacuum, but a
machine without enough IO bandwidth to do its job well.

> And when i am doing something big, a vacuum is the last thing i'd wish for.

I don't even notice when it kicks in on my servers.

> I'd wish for a vacuum when the database is doing nothing at all, but the
> autovacuum will NEVER kick off in such a moment.

Again, if the delay and such are set right, then autovac will use so
little IO as to be unnoticeable.

> That's why i feel better scheduling the vacuum at times at which i know
> things will be generally quiet.

For some very large tables on slow IO machines, it makes sense to
remove them from the purview of autovac, I agree. Take a look at the
pg_autovacuum table. it's pretty easy to see how it works.

> To be honest, i am a bit surprised that all 3 reactions recommend using
> autovacuum, even if it means i have to buy a new server for this purpouse.

Well, autovacuum is more primitive in the versions you're running, and
more likely to get in the way.  8.3, and to a greater extent 8.4,
remove a lot of these issues.

> I was thinking that autovacuum was just a mechanism to ensure that postgres
> works well out of the box, but that it would be recommended to schedule your
> own vacuum tailored to your specific needs.
> I agree though, that it is a tough tailoring job and that the autovacuum
> must be doing a better job than i am. It just fires at the wrong time.

And that's kind of the point, that it's better to have a db that runs
a little slow than one that explodes in sheets of flame.

> Just a thought (to think positively..): wouldn't it be possible to let the
> autovacuum wait until the load goes down, or until the end of the
> transaction that triggered the autovacuum?

You could set up a cron job that updated the pg_autovacuum table at
certain times to accomplish this.  I'm not sure how easy it would be
to program autovac to do the same thing.  You could certainly set the
cost delay higher than normal (like 20 or 40 ms) for some tables so
that autovac didn't get in the way, but then you run the risk of it
never keeping up, and on 8.1 with only one thread to autovac, that
could be bad.  Definitely consider upgrading pg versions on your
windows machines.