Re: [GENERAL] Autovacuum Improvements

Поиск
Список
Период
Сортировка
От Steve Atkins
Тема Re: [GENERAL] Autovacuum Improvements
Дата
Msg-id 46EC7F08-54DE-49AD-A546-CB2824E8092D@blighty.com
обсуждение исходный текст
Ответ на Re: [GENERAL] Autovacuum Improvements  (Richard Huxton <dev@archonet.com>)
Список pgsql-hackers
On Jan 22, 2007, at 11:16 AM, Richard Huxton wrote:

> Bruce Momjian wrote:
>> Yep, agreed on the random I/O issue.  The larger question is if
>> you have
>> a huge table, do you care to reclaim 3% of the table size, rather
>> than
>> just vacuum it when it gets to 10% dirty?  I realize the vacuum is
>> going
>> to take a lot of time, but vacuuming to relaim 3% three times
>> seems like
>> it is going to be more expensive than just vacuuming the 10%
>> once.  And
>> vacuuming to reclaim 1% ten times seems even more expensive.  The
>> partial vacuum idea is starting to look like a loser to me again.
>
> Buying a house with a 25-year mortgage is much more expensive than
> just paying cash too, but you don't always have a choice.
>
> Surely the key benefit of the partial vacuuming thing is that you
> can at least do something useful with a large table if a full
> vacuum takes 24 hours and you only have 4 hours of idle I/O.
>
> It's also occurred to me that all the discussion of scheduling way
> back when isn't directly addressing the issue. What most people
> want (I'm guessing) is to vacuum *when the user-workload allows*
> and the time-tabling is just a sysadmin first-approximation at that.

Yup. I'd really like for my app to be able to say "Hmm. No
interactive users at the moment, no critical background tasks. Now
would be a really good time for the DB to do some maintenance." but
also to be able to interrupt the maintenance process if some new
users or other system load show up.

> With partial vacuuming possible, we can arrange things with just
> three thresholds and two measurements:
>   Measurement 1 = system workload
>   Measurement 2 = a per-table "requires vacuuming" value
>   Threshold 1 = workload at which we do more vacuuming
>   Threshold 2 = workload at which we do less vacuuming
>   Threshold 3 = point at which a table is considered worth vacuuming.
> Once every 10 seconds, the manager compares the current workload to
> the thresholds and starts a new vacuum, kills one or does nothing.
> New vacuum processes keep getting started as long as there is
> workload spare and tables that need vacuuming.
>
> Now the trick of course is how you measure system workload in a
> meaningful manner.

I'd settle for a "start maintenance", "stop maintenance" API.
Anything else (for instance the heuristics you suggest above) would
definitely be gravy.

It's not going to be simple to do, though, I don't think.

Cheers,
   Steve


В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Jim C. Nasby"
Дата:
Сообщение: Re: autovacuum process handling
Следующее
От:
Дата:
Сообщение: Re: 10 weeks to feature freeze (Pending Work)