Discussion: Adjust autovacuum naptime automatically
Hi hackers,

There is a comment in autovacuum.c:

| XXX todo: implement sleep scale factor that existed in contrib code.

The attached patch implements it. In the contrib code, the sleep scale factor was used only to lengthen the naptime, but I changed the behavior so that it can also shorten it.

Under an update-heavy workload, the default naptime (60 seconds) is too long to keep the number of dead tuples low. With my patch, the naptime is adjusted to around 3 seconds in the case of pgbench (scale=10, 80 tps) with the other autovacuum parameters at their defaults.

I have two things that I want to discuss with you:

- Can we use the process exit code to let the autovacuum daemon communicate with the postmaster? I used it to report whether there were any vacuum jobs.
- I removed the autovacuum_naptime GUC variable, because the naptime is now adjusted automatically. Is that appropriate?

Comments welcome.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
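The message does not spell out how the patch picks the new naptime, so the following is only a minimal sketch of one possible rule, not the patch's algorithm. It assumes the nap is halved after a cycle that found vacuum work and doubled after an idle one, then clamped to a range. The function name `adjust_naptime`, the bounds, and the factor are all hypothetical.

```python
def adjust_naptime(current, found_work, lo=1.0, hi=60.0, factor=2.0):
    """Return the next naptime in seconds.

    Hypothetical rule (an assumption, not taken from the patch):
    divide the nap by `factor` when the last cycle found tables to
    vacuum, multiply it by `factor` otherwise, and clamp to [lo, hi].
    """
    nxt = current / factor if found_work else current * factor
    return max(lo, min(hi, nxt))


# Four busy cycles followed by one idle cycle, starting from 60 seconds.
naptime = 60.0
for found_work in [True, True, True, True, False]:
    naptime = adjust_naptime(naptime, found_work)
    print(naptime)
```

With a rule of this shape, a steady update load drives the nap toward the lower bound, and an idle system drifts back toward the upper bound.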
ITAGAKI Takahiro wrote:

> In the case of a heavily update workload, the default naptime (60 seconds)
> is too long to keep the number of dead tuples low. With my patch, the naptime
> will be adjusted around 3 seconds at the case of pgbench (scale=10, 80 tps)
> with default other autovacuum parameters.

Interesting. To be frank, I don't know what the sleep scale factor was supposed to do.

> I have something that I want to discuss with you:
> - Can we use the process-exitcode to make autovacuum daemon to communicate
> with postmaster? I used it to notify there are any vacuum jobs or not.

I can only tell you that we do this in Mammoth Replicator and it works for us. Whether it is a very good idea, I don't know. I didn't find any other means to communicate information from dying processes to the postmaster.

> - I removed autovacuum_naptime guc variable, because it is adjusted
> automatically now. Is it appropriate?

I think we should provide the user with a way to stop the naptime from changing at all. Eventually we will have the promised "maintenance windows" feature, which will mean the user will not have to worry about the naptime at all, but in the meantime I think we should keep it.

-- 
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
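The exit-code idea can be illustrated outside PostgreSQL with an ordinary parent and child process. This is a sketch under stated assumptions: the constants `EXIT_NO_WORK` and `EXIT_DID_WORK` and both function names are made up for illustration, and the child here is a throwaway Python process rather than an autovacuum worker.

```python
import subprocess
import sys

EXIT_NO_WORK = 0   # hypothetical code: nothing needed vacuuming
EXIT_DID_WORK = 3  # hypothetical code: at least one table was vacuumed


def run_worker(has_work):
    """Spawn a short-lived child process and return its exit status."""
    code = EXIT_DID_WORK if has_work else EXIT_NO_WORK
    child = subprocess.run(
        [sys.executable, "-c", f"import sys; sys.exit({code})"]
    )
    return child.returncode


def worker_found_work(status):
    """Decode the child's exit status on the parent side."""
    if status == EXIT_DID_WORK:
        return True
    if status == EXIT_NO_WORK:
        return False
    # Any other status is treated as a failure, not as a message.
    raise RuntimeError(f"worker exited abnormally with status {status}")


print(worker_found_work(run_worker(True)))
print(worker_found_work(run_worker(False)))
```

An exit status carries only one small integer, so a scheme like this has to reserve values that cannot be confused with an ordinary failure exit.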
Alvaro Herrera wrote:

> ITAGAKI Takahiro wrote:
>
>> In the case of a heavily update workload, the default naptime (60 seconds)
>> is too long to keep the number of dead tuples low. With my patch, the naptime
>> will be adjusted around 3 seconds at the case of pgbench (scale=10, 80 tps)
>> with default other autovacuum parameters.

What is this based on? That is, based on what information is it deciding to reduce the naptime?

> Interesting. To be frank I don't know what the sleep scale factor was
> supposed to do.

I'm not sure whether the sleep scale factor is a good idea at this point, but what I was thinking back in the day, when I originally wrote the contrib autovacuum, is that I didn't want the system to get bogged down constantly vacuuming. So, if it had just spent a long time working on one database, it would sleep for a long time. Given that we can now specify the vacuum cost delay settings for autovacuum, disable it for individual tables, and everything else, I'm not sure we need this anymore, at least not as it was originally designed. It sounds like Itagaki is doing things a little differently with his patch, but I'm not sure I understand it.

>> - I removed autovacuum_naptime guc variable, because it is adjusted
>> automatically now. Is it appropriate?
>
> I think we should provide the user with a way to stop the naptime from
> changing at all. Eventually we will have the promised "maintenance
> windows" feature which will mean the user will not have to worry at all
> about the naptime, but in the meantime I think we should keep it.

I'm not sure that's true. I believe we will want the naptime GUC option even after we have the maintenance window. I think we might ignore the naptime during the maintenance window, but even after we have the maintenance window, we will still vacuum during the day as required.

My vision of the maintenance window has always been very simple, that is, during the maintenance window the thresholds get reduced by some factor (probably a GUC variable) so during the day it might take 10000 updates on a table to cause a vacuum but during the naptime it might be 10% of that, 1000. Is this in line with what others were thinking?
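The threshold arithmetic in the last paragraph can be sketched as follows. PostgreSQL documents the vacuum threshold as the base threshold plus the scale factor times the number of tuples. The `window_factor` parameter stands in for the proposed GUC and is hypothetical, as are the sample numbers, which were chosen only to reproduce the 10000 and 1000 figures from the message.

```python
def vacuum_threshold(base, scale_factor, reltuples, window_factor=1.0):
    """Dead-tuple count that triggers a vacuum.

    The base + scale_factor * reltuples part follows the documented
    autovacuum formula. window_factor is a hypothetical multiplier
    applied only while a maintenance window is open.
    """
    return (base + scale_factor * reltuples) * window_factor


def needs_vacuum(dead_tuples, threshold):
    """Vacuum once the dead tuples exceed the threshold."""
    return dead_tuples > threshold


# Assumed sample values: base 1000, scale factor 0.2, 45000 tuples.
daytime = vacuum_threshold(1000, 0.2, 45000)
in_window = vacuum_threshold(1000, 0.2, 45000, window_factor=0.1)
print(f"{daytime:.0f} {in_window:.0f}")

# A table with 2500 dead tuples waits during the day but is
# vacuumed once the window opens.
print(needs_vacuum(2500, daytime), needs_vacuum(2500, in_window))
```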
Autovacuum maintenance window (was Re: Adjust autovacuum naptime automatically)
From: Alvaro Herrera
Matthew T. O'Connor wrote:

> My vision of the maintenance window has always been very simple, that
> is, during the maintenance window the thresholds get reduced by some
> factor (probably a GUC variable) so during the day it might take 10000
> updates on a table to cause a vacuum but during the naptime it might be
> 10% of that, 1000. Is this in-line with what others were thinking?

My vision is a little more complex than that. You define groups of tables, and separately you define time intervals. For each combination of group and interval you can configure certain parameters, such as a multiplier for the autovacuum thresholds and factors, and also the "enable" bit. So you can disable vacuum for some intervals, and refine the equation factors for others.

This is all configured in tables, not in GUC variables, so you have more flexibility in choosing settings for different groups of tables (say, you really want the small-but-high-update tables to still be vacuumed even during peak periods, but you don't want that big fat table to be vacuumed at all during the same period).

I had intended to work on this during the code sprint, but got distracted. I intend to do it for 8.3 instead.

-- 
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
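The group-by-interval configuration described above can be sketched as a lookup keyed on (group, interval). Everything concrete here is an assumption made for illustration: the table names, group names, interval boundaries, and the `Setting` fields are hypothetical, and the message does not specify what the real catalog tables would look like.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Setting:
    enabled: bool = True     # the "enable" bit
    multiplier: float = 1.0  # scales the autovacuum thresholds


# Hypothetical configuration; the proposal stores this in tables.
TABLE_GROUP = {"sessions": "hot_small", "audit_log": "big_fat"}
INTERVALS = {"night": (0, 6), "peak": (9, 18)}  # [start, end) hours
SETTINGS = {
    ("big_fat", "peak"): Setting(enabled=False),
    ("big_fat", "night"): Setting(multiplier=0.5),
    ("hot_small", "peak"): Setting(multiplier=1.0),
}


def current_interval(hour):
    """Return the name of the interval containing `hour`, if any."""
    for name, (start, end) in INTERVALS.items():
        if start <= hour < end:
            return name
    return None


def effective_setting(table, hour):
    """Look up the (group, interval) entry, else fall back to defaults."""
    group = TABLE_GROUP.get(table)
    return SETTINGS.get((group, current_interval(hour)), Setting())


print(effective_setting("audit_log", 10))  # big table during peak hours
print(effective_setting("sessions", 10))   # small hot table during peak hours
```

Falling back to a default `Setting` keeps tables and hours that nobody configured behaving as they do without any windows defined.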
Re: [HACKERS] Autovacuum maintenance window (was Re: Adjust autovacuum
From: "Matthew T. O'Connor"
Alvaro Herrera wrote:

> My vision is a little more complex than that. You define group of
> tables, and separately you define time intervals. For each combination
> of group and interval you can configure certain parameters, like a
> multiplier for the autovacuum thresholds and factors; and also the
> "enable" bit. So you can disable vacuum for some intervals, and refine
> the equation factors for some others. This is all configured in tables,
> not in GUC, so you have more flexibility in choosing stuff for different
> groups of tables (say, you really want the small-but-high-update tables
> to be still vacuumed even during peak periods, but you don't want that
> big fat table to be vacuumed at all during the same period).

That sounds good. I worry a bit that it's going to get overly complex. I suppose if we create the concept of a default window that all new tables are automatically added to when created, then out of the box we can create one default 24-hour maintenance window that would effectively give us the same functionality we have now.

Could we also use these groups for concurrent vacuums? That is, autovacuum would loop through each group of tables independently, thus allowing multiple simultaneous vacuums on different tables and giving us a solution to the constantly updated table problem.
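Both suggestions, a default group for unassigned tables and one independent loop per group, can be sketched with a thread pool. This is an illustration only: `vacuum_group` is a placeholder that records which tables it would visit, the group and table names are hypothetical, and real autovacuum workers are separate processes rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor

DEFAULT_GROUP = "default"  # hypothetical name for the catch-all group


def group_tables(tables, assignments):
    """Bucket tables by group; unassigned tables go to the default group."""
    groups = {}
    for table in tables:
        groups.setdefault(assignments.get(table, DEFAULT_GROUP), []).append(table)
    return groups


def vacuum_group(name, tables):
    """Placeholder for the per-group loop; a real worker would vacuum here."""
    return [(name, table) for table in tables]


def vacuum_all(groups):
    """Run one loop per group concurrently and collect the results."""
    with ThreadPoolExecutor(max_workers=len(groups) or 1) as pool:
        futures = {
            name: pool.submit(vacuum_group, name, tables)
            for name, tables in groups.items()
        }
        return {name: future.result() for name, future in futures.items()}


groups = group_tables(["queue", "orders", "history"], {"queue": "hot"})
print(vacuum_all(groups))
```

Because each group has its own loop, a constantly updated table in its own group is not held up while a long vacuum runs on a large table in another group.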