Re: Autovacuum Improvements

From: Christopher Browne
Subject: Re: Autovacuum Improvements
Msg-id: 871wm2utvr.fsf@wolfe.cbbrowne.com
In reply to: Autovacuum Improvements (was: Second attempt, roll your own autovacuum) (Matthew O'Connor <matthew@zeut.net>)
List: pgsql-general
A long time ago, in a galaxy far, far away, nagy@ecircle-ag.com (Csaba Nagy) wrote:
> On Mon, 2007-01-08 at 22:29, Chris Browne wrote:
> [snip]
>> Based on the three policies I've seen, it could make sense to assign
>> worker policies:
>>
>> 1. You have a worker that moves its way through the queue in some sort of
>>    sequential order, based on when the table is added to the queue, to
>>    guarantee that all tables get processed, eventually.
>>
>> 2. You have workers that always pull the "cheapest" tables in the
>>    queue, perhaps with some sort of upper threshold that they won't go
>>    past.
>>
>> 3. You have workers that alternate between eating from the two ends of the
>>    queue.
>>
>> Only one queue is needed, and there's only one size parameter
>> involved.
>> Having multiple workers of type #2 seems to me to solve the problem
>> you're concerned about.
>
> This sounds better, but define "cheapest" in #2... I actually want to
> continuously vacuum tables which are small, heavily recycled
> (insert/update/delete), and which would bloat quickly. So how do you
> define the cost function for having these tables the "cheapest" ?

Cost would be based on the number of pages in the table.  The smallest
tables are obviously the cheapest to vacuum.

That's separate from the policy for adding tables to the queue; THAT
would sensibly be based on the number of dead tuples; the current
policy of autovacuum seems not unreasonable...
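
To make the split concrete, here is a rough sketch in Python, not
anything that exists in autovacuum today. The Table class, the function
names, and the threshold numbers are all made up for illustration; the
only part borrowed from the current policy is the shape of the queueing
test (dead tuples compared against a base threshold plus a fraction of
the table's tuples).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Table:
    name: str
    pages: int        # table size in pages
    tuples: int       # estimated number of tuples
    dead_tuples: int  # estimated number of dead tuples


def should_queue(t: Table, base_threshold: int = 50,
                 scale_factor: float = 0.2) -> bool:
    # Queueing policy: driven by dead tuples, in the same shape as the
    # current autovacuum test. The default values here are illustrative.
    return t.dead_tuples > base_threshold + scale_factor * t.tuples


def vacuum_cost(t: Table) -> int:
    # Processing cost: simply the number of pages in the table.
    return t.pages


def pick_oldest(queue: List[Table]) -> Optional[Table]:
    # Policy #1: take tables in the order they were queued.
    return queue.pop(0) if queue else None


def pick_cheapest(queue: List[Table],
                  max_pages: Optional[int] = None) -> Optional[Table]:
    # Policy #2: take the smallest table, optionally refusing anything
    # above an upper size threshold.
    if not queue:
        return None
    best = min(queue, key=vacuum_cost)
    if max_pages is not None and vacuum_cost(best) > max_pages:
        return None
    queue.remove(best)
    return best
```

With this, a small, heavily recycled table gets queued as soon as its
dead tuples cross the threshold, and a #2 worker reaches it ahead of any
large table that happens to be queued before it.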

> And how will you define the worker thread count policy ? Always 1
> worker per category, or you can define the number of threads in the
> 3 categories ? Or you still have in mind time window policies with
> allowed number of threads per worker category ? (those numbers could
> be 0 to disable a worker category).

It would make a lot of sense to have time ranges that would indicate
when different values were wanted.  Good question...
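
One possible shape for that, purely as a sketch: a list of time windows,
each carrying a worker count per policy, with 0 disabling a category.
The SCHEDULE layout, the policy names, and the counts below are
hypothetical and do not correspond to any existing configuration
setting.

```python
from datetime import time

# (window start, window end, workers per policy); all values hypothetical.
SCHEDULE = [
    (time(8, 0), time(20, 0),
     {"sequential": 1, "cheapest": 2, "alternating": 0}),
    (time(20, 0), time(8, 0),
     {"sequential": 2, "cheapest": 1, "alternating": 1}),
]

# Used when no window matches the current time.
DEFAULT_COUNTS = {"sequential": 1, "cheapest": 1, "alternating": 1}


def in_window(now: time, start: time, end: time) -> bool:
    # Half-open interval [start, end); a window may wrap past midnight.
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def worker_counts(now: time) -> dict:
    # Return the per-policy worker counts for the first matching window.
    for start, end, counts in SCHEDULE:
        if in_window(now, start, end):
            return counts
    return DEFAULT_COUNTS
```

Here worker_counts(time(12, 0)) gives two "cheapest" workers during the
day, and the overnight window shifts capacity toward the sequential
worker.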

> Other thing, how will the vacuum queue be populated ? Or the "queue"
> here means nothing, all workers will always go through all tables to
> pick one based on their own criteria ? My concern here is that the
> current way of checking 1 DB per minute is not going to work with
> category #2 tables, they really have to be vacuumed continuously
> sometimes.

I think it makes considerable sense to have a queue table for this.

Having one of the threads look for new entries makes considerable
sense.

Offering the Gentle DBA the ability to add in entries based on their
special knowledge would also seem sensible.
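
A minimal sketch of what such a queue table might look like follows.
SQLite stands in here only so the example runs on its own; the
vacuum_queue table, its columns, and the helper functions are all
hypothetical, and a real version would live inside PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE vacuum_queue (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,  -- queue order
        relname  TEXT NOT NULL UNIQUE,               -- one entry per table
        pages    INTEGER NOT NULL,                   -- cost of vacuuming
        added_by TEXT NOT NULL                       -- 'scanner' or 'dba'
    )
""")


def enqueue(relname, pages, added_by):
    # Used both by the thread that looks for new entries and by a DBA
    # adding a table by hand; a table already queued is left alone.
    conn.execute(
        "INSERT OR IGNORE INTO vacuum_queue (relname, pages, added_by) "
        "VALUES (?, ?, ?)",
        (relname, pages, added_by),
    )


def dequeue_cheapest():
    # A "cheapest first" worker: smallest table, ties broken by age.
    row = conn.execute(
        "SELECT id, relname FROM vacuum_queue ORDER BY pages, id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("DELETE FROM vacuum_queue WHERE id = ?", (row[0],))
    return row[1]
```

The scanning thread and the DBA go through the same enqueue() path, so
the workers never need to know where an entry came from.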
--
(format nil "~S@~S" "cbbrowne" "gmail.com")
http://linuxdatabases.info/info/slony.html
Keeping instructions  and operands  in  different memories  saves  .20
(.09) microseconds.
