Обсуждение: Autovacuum and stats_row_level

Поиск
Список
Период
Сортировка

Autovacuum and stats_row_level

От
David Wall
Дата:
Noted that to use autovacuum we need to turn on stats_row_level (along
with stats_start_collector that is on by default).  Since
stats_row_level is off by default, I wonder what sort of overhead is
incurred since it sounds like it could add up if it's storing additional
stats information on every row update, though it's not clear to me just
what it means to have row level stats.

We currently use cron to run vacuum and analyze daily in the early
morning.  It's  never been an issue, but the databases are getting
bigger and we wonder if autovacuum is a good addition to our operations.

Are the default values for autovacuum generally good enough for most
deployments?  The various parameters are a bit complicated to
understand, so tweaking them is something we're naturally concerned
about doing.

Will autovacuum running allow regular vacuum and analyze commands to run
faster?  Can it replace them entirely, or do we still need to run them
from time to time?

Can autovacuum be configured to run in a backup server that is in
"recovery" mode handling pg_standby WAL file updates?

Thanks,
David

Re: Autovacuum and stats_row_level

От
Chander Ganesan
Дата:
David Wall wrote:
> Noted that to use autovacuum we need to turn on stats_row_level (along
> with stats_start_collector that is on by default).  Since
> stats_row_level is off by default, I wonder what sort of overhead is
> incurred since it sounds like it could add up if it's storing
> additional stats information on every row update, though it's not
> clear to me just what it means to have row level stats.
There is overhead. I think it is somewhat related to your query load
However, I do know that there is some rate limiting logic in place to
prevent the backends from sending too many messages to the stats collector.
>
> We currently use cron to run vacuum and analyze daily in the early
> morning.  It's  never been an issue, but the databases are getting
> bigger and we wonder if autovacuum is a good addition to our operations.
AVD doesn't pay attention to the load on your server, and I assume you
cron that stuff when things are at a lull.  In that case, you might be
worse off.    Also, if your load is such that AVD wouldn't kick off
anyways during the day, cron would give you better results than AVD
(though you could tune AVD to balance this out)

The benefit is that AVD, properly adjusted, will keep the number of dead
tuples from getting "too bad".  Suppose you had a system that did lots
of updates and deletes throughout the day, with load increasing daily.
A nightly vacuum might allow your tables to continue to grow (albeit
more slowly, since more dead tuples would accumulate before a nightly
vacuum), whereas a well tuned AVD would prevent that (by performing
periodic vacuums throughout the day).
>
> Are the default values for autovacuum generally good enough for most
> deployments?  The various parameters are a bit complicated to
> understand, so tweaking them is something we're naturally concerned
> about doing.
To really understand that one would have to know about your specific
deployment, and how you use your tables.  The reason AVD is so
customizable is because different environments have different needs.  I
would leave them at the defaults to start with, but keep in mind that
they might not be aggressive enough - or perhaps too aggressive - for
your particular needs.  Once you have some stats, you can look at the
statistics and figure out an optimum value for how often AVD should
process a table (based on load, statistics, table sizes, etc.)
>
> Will autovacuum running allow regular vacuum and analyze commands to
> run faster?  Can it replace them entirely, or do we still need to run
> them from time to time?
You shouldn't need to run them if you have AVD enabled.  It won't speed
them up...it will just periodically run them when the number of changed
rows is significant, or dead tuples is significant...thus allowing you
to not have to lose sleep over vacuum operations occurring.
>
> Can autovacuum be configured to run in a backup server that is in
> "recovery" mode handling pg_standby WAL file updates?
>
No.  Though vacuums on the master would in effect also occur on the
standby node.  So as long as you do vacuums on the master you'll be okay.

--
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC  27560
Phone: 877-258-8987/919-463-0999
http://www.otg-nc.com
Ask me about our Expert PostgreSQL & PostGIS Training



--
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC  27560
Phone: 877-258-8987/919-463-0999
http://www.otg-nc.com


Re: Autovacuum and stats_row_level

От
Tom Lane
Дата:
Chander Ganesan <chander@otg-nc.com> writes:
> David Wall wrote:
>> Noted that to use autovacuum we need to turn on stats_row_level (along
>> with stats_start_collector that is on by default).  Since
>> stats_row_level is off by default, I wonder what sort of overhead is
>> incurred since it sounds like it could add up if it's storing
>> additional stats information on every row update, though it's not
>> clear to me just what it means to have row level stats.

> There is overhead. I think it is somewhat related to your query load
> However, I do know that there is some rate limiting logic in place to
> prevent the backends from sending too many messages to the stats collector.

The rate-limit logic is new for 8.3, but it's always been true that at
most one message is sent per query (and in fact one per transaction,
in recent releases).

            regards, tom lane