Re: pg_autovacuum

From: Josh Berkus
Subject: Re: pg_autovacuum
Msg-id: 200403222132.04732.josh@agliodbs.com
In reply to: Re: [DEFAULT] Daily digest v1.4346 (20 messages)  ("Matthew T. O'Connor" <matthew@zeut.net>)
Responses: Re: pg_autovacuum  ("Matthew T. O'Connor" <matthew@zeut.net>)
List: pgsql-hackers

Matt,

> So interesting, most users request the per-table settings, guess there 
> is sufficient demand for both.

The reason for us is that multi-database installations generally have 
significantly different purposes for each database; for example, a 
"reporting" database on the same server may need no vacuuming at all.

> You might be missing the point, the advantage of using pg_autovacuum is 
> that it wouldn't waste cycles doing vacuums on tables that don't need 
> it.  If we have persistent data (saving state information periodically) 
> then this is a very easy feature to add.

OK, I can see that.

> What I'm thinking is that the VACUUM command could be modified to write 
> down some data from the stats system at vacuum time.  Once the VACUUM 
> command writes this down for itself then pg_autovacuum just uses that 
> number to make its decision.  Again, we are trying to reduce as much as 
> possible superfluous vacuums.  If an admin vacuums his whole cluster 
> every Sunday night that may prevent lots of vacuums occurring during 
> business hours that affect processing.

That would be nice, yes.  However, my experience is that mixing manual and 
autovacuums is bound to lead to endless support requests, because conflicts 
*will* arise.  So in some ways you'd be working to please those who can't be 
pleased.

> Backend integration should solve the 1st issue.  Parallel vacuums is 
> something that could be worked on at some point.  Would it make sense 
> to incorporate this with tablespaces?  The vacuum daemon would only 
> issue one vacuum command per tablespace, but could issue as many 
> parallel vacuums as you have independent tablespaces.

Hmmm ... that's an interesting idea.  I'd been thinking more about vacuums 
of small tables, where a high-end server under low load could vacuum several 
tables in parallel, one per CPU.  However, working through tablespaces would 
make a lot of sense.
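
A rough sketch of the tablespace idea, under stated assumptions: tables are
grouped by tablespace, each tablespace gets one worker, and a worker vacuums
its tables one at a time, so no tablespace ever sees more than one vacuum at
once. The names group_by_tablespace, run_parallel_vacuums, and the
vacuum_table callback are hypothetical and exist only for this illustration;
nothing here is taken from pg_autovacuum itself.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_tablespace(tables):
    """Group (table_name, tablespace) pairs into {tablespace: [table_name, ...]}."""
    groups = defaultdict(list)
    for table_name, tablespace in tables:
        groups[tablespace].append(table_name)
    return dict(groups)


def vacuum_tablespace(table_names, vacuum_table):
    """Vacuum the tables of one tablespace strictly one after another."""
    for name in table_names:
        vacuum_table(name)  # hypothetical callback that issues the VACUUM
    return list(table_names)


def run_parallel_vacuums(tables, vacuum_table):
    """Run one sequential worker per tablespace, with the workers in parallel."""
    groups = group_by_tablespace(tables)
    if not groups:
        return {}
    with ThreadPoolExecutor(max_workers=len(groups)) as pool:
        futures = {
            tablespace: pool.submit(vacuum_tablespace, names, vacuum_table)
            for tablespace, names in groups.items()
        }
        # Collect the tables each worker processed, in processing order.
        return {tablespace: fut.result() for tablespace, fut in futures.items()}
```

In a real daemon the callback would open a connection and issue VACUUM on the
named table; here it is left to the caller so the scheduling logic stays
separate from database access.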

> I think the timeout issue would need to be a part of vacuum proper, and I'm 
> not sure about the "backing up" issue.

Well, we've discussed timeout for vacuum.

Thing is, autovacuum changes the equation somewhat.  Imagine that the 
transaction rate on your tables accelerates so that autovacuum with a 0.3 
scale setting is triggered every 23 minutes.  But say that it takes 29 
minutes to vacuum through all of your tables ... or even 49 minutes if you 
have "slow vacuum" turned on!

You would get into a cycle where vacuum is running continuously.  This is a 
very bad situation, and the admin should be warned about it via the logs.
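
To make the arithmetic concrete, here is a minimal sketch of the kind of check
that could produce such a log warning. It compares how often a vacuum is
triggered with how long a full pass takes; a ratio of 1.0 or more means the
next vacuum is due before the current one finishes. The function names and
the message text are hypothetical. The threshold helper assumes the usual
"base threshold plus scale factor times tuple count" rule, which is my reading
of how pg_autovacuum decides and is not confirmed by this thread.

```python
def vacuum_threshold(reltuples, base_threshold, scale_factor):
    """Changed-tuple count that triggers a vacuum (assumed: base + scale * tuples)."""
    return base_threshold + scale_factor * reltuples


def saturation_warning(trigger_interval_min, vacuum_duration_min):
    """Return a warning string if vacuums would run back to back, else None."""
    if trigger_interval_min <= 0:
        raise ValueError("trigger interval must be positive")
    busy_fraction = vacuum_duration_min / trigger_interval_min
    if busy_fraction >= 1.0:
        return (
            "WARNING: vacuum takes %d min but is triggered every %d min; "
            "vacuum will run continuously"
            % (vacuum_duration_min, trigger_interval_min)
        )
    return None


# The figures from the example above: triggered every 23 minutes,
# 29 minutes per pass, or 49 minutes with "slow vacuum" enabled.
for duration in (29, 49):
    message = saturation_warning(23, duration)
    if message:
        print(message)
```

Both cases from the example trip the warning, since 29/23 and 49/23 are each
greater than 1.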

Hmmm ... thinking about that, are we changing the defaults for threshold and 
scale?  You and I have discussed this, yes?

> The reason it's similar is that once pg_autovacuum data is persistent, 
> it would be trivial to implement this feature, and the data that any 
> tool would need to make these decisions is the same as what 
> pg_autovacuum is already tracking.

Well, if it's easy to do, then go for it.  I can see how some would find it 
useful.  Once it's sufficiently bulletproof, it could replace the standard 
VACUUM (whole db).

> I think the patch was submitted to either the hackers or patches list.  
> If you can't find it, I'll look around and see if I still have a copy.  
> The person who submitted said it was simple, but was working for him in 
> production.

Thanks for the forward.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


