Re: Deadline-Based Vacuum Delay

From: Galy Lee
Subject: Re: Deadline-Based Vacuum Delay
Date:
Msg-id: 459E1E49.7030203@oss.ntt.co.jp
In reply to: Re: Deadline-Based Vacuum Delay  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Deadline-Based Vacuum Delay  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
Tom Lane wrote:
> I think the context for this is that you have an agreed-on maintenance
> window, say extending from 2AM to 6AM local time, and you want to get
> all your vacuuming done in that window without undue spikes in the
> system load (because you do still have live users then, just not as many
> as during prime time).  If there were a decent way to estimate the
> amount of work to be done then it'd be possible to spread the work
> fairly evenly across the window.  What I do not see is where you get
> that estimate from --- especially since you probably have more than one
> table to vacuum in your window.
 

It is true that there is no decent way to estimate the amount of work
to be done. But the purpose here is not to “spread the vacuum over
exactly 6 hours”; it is to “finish the vacuum within 6 hours, and
flatten the spikes as much as possible”. So an estimate of the maximum
amount of work is enough to keep the vacuum within the window, and it is
fine if the vacuum finishes earlier than scheduled. Also, we don’t need
to estimate how long the vacuum will take; at each delay point we only
need to compare how much of the time window has elapsed with how much of
the work is done, and then adjust the delay so that the two advance at
the same pace.
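
A rough sketch of that comparison at a delay point, in Python for brevity. This is only an illustration of the idea, not proposed patch code; the function name, the additive step, and the clamp limits are all assumptions of mine.

```python
def adjust_delay(current_delay_ms, work_done, work_total,
                 elapsed_s, window_s, step_ms=5.0, max_delay_ms=100.0):
    """Return a new per-delay-point sleep time in milliseconds.

    Hypothetical helper: compares the fraction of (estimated maximum) work
    completed with the fraction of the time window elapsed.
    """
    work_progress = work_done / work_total
    time_progress = elapsed_s / window_s

    if work_progress < time_progress:
        # Behind schedule: sleep less so the vacuum speeds up.
        new_delay = current_delay_ms - step_ms
    elif work_progress > time_progress:
        # Ahead of schedule: sleep more to flatten the I/O spikes.
        new_delay = current_delay_ms + step_ms
    else:
        new_delay = current_delay_ms

    # Clamp to [0, max_delay_ms]; both limits are assumed values.
    return max(0.0, min(max_delay_ms, new_delay))
```

Because work_total is a maximum estimate, the real work usually turns out smaller, so the vacuum tends to finish early rather than late, which is acceptable here.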

The maximum amount of vacuum work can be estimated from the size of the
heap, the size of the indexes, and the number of dead tuples. For
example, lazy vacuum does the following work:
1. scan heap
2. vacuum index
3. vacuum heap
4. truncate heap
Although 2 and 4 are quite unpredictable, an upper bound on the total
amount of work for 1, 2, 3, and 4 can still be estimated.
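
One possible way to put numbers on the four phases, again only as a sketch. The unit (pages touched), the per-phase bounds, and the tuples_per_pass parameter (how many dead tuples fit in memory before an index pass is forced) are my assumptions for illustration, not measured figures.

```python
import math

def max_vacuum_work_pages(heap_pages, index_pages, dead_tuples, tuples_per_pass):
    """Hypothetical upper bound on lazy vacuum work, counted in pages."""
    # Phase 1: scan heap, every heap page is read once.
    scan_heap = heap_pages

    # Phase 2: vacuum index, assumed to be one full index scan per batch
    # of dead tuples that fits in memory.
    index_passes = math.ceil(dead_tuples / tuples_per_pass) if dead_tuples > 0 else 0
    vacuum_index = index_pages * index_passes

    # Phase 3: vacuum heap, at most one page per dead tuple and never
    # more than the whole heap.
    vacuum_heap = min(heap_pages, dead_tuples)

    # Phase 4: truncate heap, pessimistically bounded by the whole heap.
    truncate_heap = heap_pages

    return scan_heap + vacuum_index + vacuum_heap + truncate_heap
```

For several tables in one window, the same bound could simply be summed over the tables.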
> The other problem is that "vacuum only during a maintenance window"
> doesn't seem all that compelling a policy anyway. We see a lot of
> examples of tables that need to be vacuumed much more often than once
> a day.  So I'd rather put effort into making sure that vacuum can be run
> in the background even under high load, instead of designing around a
> maintenance-window assumption.
 

This feature does not necessarily assume a maintenance window. For
example, if a table needs to be vacuumed every 3 hours to sweep out the
garbage, then instead of painstakingly tuning the cost delay GUCs to fit
the vacuum into 3 hours, we can make the vacuum finish within that time
frame with a “VACUUM IN time” feature.

If we can find a good way to tune the cost delay GUCs so that vacuum
keeps up with the rate of garbage generation in a system with frequent
updates, then we won’t need this feature. For example, the interval
between two vacuums can be estimated by tracking the rate of dead tuple
generation, but how do you tune the vacuum so that its run time fits
into that interval? It does not seem easy to tune the vacuum delay time
correctly.
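
To illustrate the first half of that (estimating the interval from the dead tuple rate), here is a small sketch. The sample format and the dead_tuple_threshold parameter are hypothetical; the result would be the deadline given to a deadline-based vacuum.

```python
def estimate_vacuum_interval_s(samples, dead_tuple_threshold):
    """Estimate seconds between vacuums from dead-tuple counter samples.

    samples: list of (timestamp_s, cumulative_dead_tuples) pairs, oldest
    first. dead_tuple_threshold: assumed number of dead tuples we tolerate
    before the next vacuum must have finished.
    """
    (t0, n0), (t1, n1) = samples[0], samples[-1]
    if t1 <= t0:
        return None  # not enough history to compute a rate
    rate = (n1 - n0) / (t1 - t0)  # dead tuples generated per second
    if rate <= 0:
        return None  # no garbage is being generated
    return dead_tuple_threshold / rate
```

With 600 dead tuples generated per minute and a threshold of 108000, this gives 10800 seconds, i.e. the 3-hour case mentioned above.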

Best Regards
-- 
Galy Lee <lee.galy _at_ oss.ntt.co.jp>
NTT Open Source Software Center



