Re: Vacuums on large busy databases

From: Jeff Davis
Subject: Re: Vacuums on large busy databases
Msg-id: 1158280502.29889.183.camel@dogma.v10.wvs
In response to: Re: Vacuums on large busy databases (Francisco Reyes <lists@stringsutils.com>)
List: pgsql-performance
On Thu, 2006-09-14 at 19:30 -0400, Francisco Reyes wrote:
> Will have to talk to the developers. In particular, for every insert there
> are updates. I know they have at least one table that gets updated to have
> summarized totals.
>

If the table being updated is small, you have no problems at all. VACUUM
that table frequently, and the big tables rarely. If the big tables are
only INSERTs and SELECTs, the only reason to VACUUM is to avoid the xid
wraparound. See:

<http://www.postgresql.org/docs/8.1/static/maintenance.html>
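
For example, you can keep an eye on how far each database is from
wraparound; 8.1 tracks this per-database in pg_database:

SELECT datname, age(datfrozenxid)
FROM pg_database
ORDER BY age(datfrozenxid) DESC;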

See which tables need VACUUM, and how often. Use the statistics to see
if VACUUMing will gain you anything before you do it.
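
For example, assuming you have row-level stats enabled
(stats_row_level = on), something like this shows which tables are
accumulating the most dead tuples:

SELECT relname, n_tup_ins, n_tup_upd, n_tup_del
FROM pg_stat_user_tables
ORDER BY n_tup_upd + n_tup_del DESC;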

> One of the reasons I was doing the vacuumdb of the entire DB was to get the
> number of shared-buffers. Now that I have an idea of how much I need I will
> likely do something along the lines of what you suggest. One full for
> everything at night, and during the day perhaps do the tables that get
> updated more. I also set more aggressive values on autovacuum so that
> should help some too.

Why VACUUM FULL? That is generally not needed. Re-evaluate whether
you're actually gaining anything with all these VACUUMs.
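
If you want to see whether a plain VACUUM is keeping up on a given
table, VACUUM VERBOSE reports how many dead row versions it found and
removed (the table name here is just an example):

VACUUM VERBOSE summary_totals;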

> > You can run ANALYZE more frequently on all the
> > tables, because it does not have to read the entire table and doesn't
> > interfere with the rest of the operations.
>
> On a related question. Right now I have my autovacuums set as:
> autovacuum_vacuum_threshold = 50000
> autovacuum_analyze_threshold = 100000
> autovacuum_vacuum_scale_factor = 0.05
> autovacuum_analyze_scale_factor = 0.1
>
> Based on what you described above then I could set my analyze values to the
> same as the vacuum to have something like
> autovacuum_vacuum_threshold = 50000
> autovacuum_analyze_threshold = 50000
> autovacuum_vacuum_scale_factor = 0.05
> autovacuum_analyze_scale_factor = 0.05
>
> For DBs with hundreds of GBs, would it be better to set
> autovacuum_analyze_scale_factor even to 0.01? The permanent DB is over 200GB
> and growing... the 100GB ones are staging. By the time we have finished
> migrating all the data from the old system it will be at least 300GB. 0.01
> is still 3GB... pretty sizable.

Just test how long an ANALYZE takes, and compare that to how quickly
your statistics get out of date. As long as postgres is choosing correct
plans, you are ANALYZE-ing often enough.

ANALYZE takes statistical samples to avoid reading the whole table, so
it's really not a major influence on performance in my experience.
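
For example, in psql you can time an ANALYZE directly and then
sanity-check a typical plan afterward (the table and query here are
just placeholders):

\timing
ANALYZE mytable;
EXPLAIN SELECT * FROM mytable WHERE some_col = 42;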

> Do the thresholds take precedence over the scale factors? Is it basically
> that if either one of them gets hit the action will take place?

u = number of tuples UPDATE-ed or DELETE-ed (i.e. dead tuples)
r = the (estimated) number of total live tuples in the relation

In a loop, autovacuum checks to see if u >
(r*autovacuum_vacuum_scale_factor + autovacuum_vacuum_threshold), and if
so, it runs VACUUM. If not, it sleeps. It works the same way for
ANALYZE.
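
For example, with the values you posted (threshold = 50000,
scale_factor = 0.05):

  10,000,000-row table: VACUUM after 0.05 * 10,000,000 + 50,000 = 550,000 dead tuples
       1,000-row table: VACUUM after 0.05 * 1,000      + 50,000 =  50,050 dead tuples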

So, in a large table, the scale_factor is the dominant term. In a small
table, the threshold is the dominant term. But both are taken into
account.

Regards,
    Jeff Davis

