Discussion: Remove autovacuum GUC?
Hello,

After all these years, we are still regularly running into people who say, "performance was bad so we disabled autovacuum". I am not talking about once in a while, it is often. I would like us to consider removing the autovacuum option. Here are a few reasons:

1. It does not hurt anyone
2. It removes a foot gun
3. Autovacuum is *not* optional, we shouldn't let it be
4. People could still disable it at the table level for those tables that do fall into the small window of, no maintenance is o.k.
5. People would still have the ability to decrease the max_workers to 1 (although I could argue about that too).

Sincerely,
JD

--
Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
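For reference, the table-level switch in point 4 is the autovacuum_enabled storage parameter. Below is a minimal Python sketch that only builds the statement text. The helper names and the table staging_import are hypothetical, and note that PostgreSQL still runs anti-wraparound vacuums on a table configured this way.

```python
def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def disable_autovacuum_sql(schema: str, table: str) -> str:
    """Build an ALTER TABLE statement that turns autovacuum off for one table."""
    return (
        f"ALTER TABLE {quote_ident(schema)}.{quote_ident(table)} "
        "SET (autovacuum_enabled = false);"
    )


# Hypothetical table name, used only for illustration.
print(disable_autovacuum_sql("public", "staging_import"))
```

Generating the statement per table keeps the global setting untouched, which is the trade-off point 4 argues for.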
On Wed, Oct 19, 2016 at 9:24 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
> After all these years, we are still regularly running into people who say,
> "performance was bad so we disabled autovacuum". I am not talking about once
> in a while, it is often. I would like us to consider removing the autovacuum
> option. Here are a few reasons:
>
> 1. It does not hurt anyone
> 2. It removes a foot gun
> 3. Autovacuum is *not* optional, we shouldn't let it be
> 4. People could still disable it at the table level for those tables that do
> fall into the small window of, no maintenance is o.k.
> 5. People would still have the ability to decrease the max_workers to 1
> (although I could argue about that too).

Setting autovacuum=off is at least useful for testing purposes and I've used it that way. On the other hand, I haven't seen a customer disable this unintentionally in years. Generally, the customers I've worked with have found subtler ways of hosing themselves with autovacuum. One of my personal favorites is autovacuum_naptime='1 d' -- for the record, that did indeed work out very poorly.

I think that this is the kind of problem that can only properly be solved by education. If somebody thinks that they want to turn off autovacuum, and you keep them from turning it off, they just get frustrated. Sometimes, they then find a back-door way of getting what they want, like setting up a script to kill it whenever it starts, or changing the other thresholds so that it barely ever runs. But whether they resort to such measures or not, there is no real chance that they will be happy with PostgreSQL. And why should they be? It doesn't let them configure the settings that they want to configure. When some other program doesn't let me do what I want, I decide it's stupid. Pretty much the same thing here. The only way you actually get out from under this problem is by teaching people the right way to think about the settings they're busy misconfiguring.
I'd actually rather go the other way with this and add a new autovacuum setting, autovacuum=really_off, that doesn't let autovacuum run even for wraparound. For example, let's say I've just recovered a badly damaged cluster using pg_resetxlog. I want to start it up and try to recover my data. I do *not* want VACUUM to decide to start removing things that I'm trying to recover. But there's no way to guarantee that today. So, you can't start up the cluster, look around, and then shut it down with the intent to change the next transaction ID if it's not right. Your data will be disappearing underneath you. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 10/20/2016 07:12 AM, Robert Haas wrote:
> On Wed, Oct 19, 2016 at 9:24 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
>
> Setting autovacuum=off is at least useful for testing purposes and
> I've used it that way. On the other hand, I haven't seen a customer
> disable this unintentionally in years. Generally, the customers I've
> worked with have found subtler ways of hosing themselves with
> autovacuum. One of my personal favorites is autovacuum_naptime='1 d'
> -- for the record, that did indeed work out very poorly.

Yes, I have seen that as well and you are right, it ends poorly.

> I think that this is the kind of problem that can only properly be solved
> by education. If somebody thinks that they want to turn off
> autovacuum, and you keep them from turning it off, they just get
> frustrated. Sometimes, they then find a back-door way of getting what

I think I am coming at this from a different perspective than the -hackers. Let me put this another way.

The right answer isn't the answer founded in the reality for many if not most of our users. What do I mean by that?

I mean that the right answer for -hackers isn't necessarily the right answer for users. Testing? Users don't test. They deploy. Education? If most people read the docs, CMD and a host of other companies would be out of business.

I am not saying I have the right solution but I am saying I think we need a *different* solution. Something that limits a *USERS* choice to turn off autovacuum. If -hackers need testing or enterprise developers need testing, let's account for that but for the user that says this:

My machine/instance bogs down every time autovacuum runs, oh I can turn it off....

Let's fix *that* problem.

Sincerely,
JD
On Thu, Oct 20, 2016 at 11:53 AM, Joshua D. Drake <jd@commandprompt.com> wrote:
> The right answer isn't the answer founded in the reality for many if not
> most of our users.

I think that's high-handed nonsense. Sure, there are some unsophisticated users who do incredibly stupid things and pay the price for it, but there are many users who are very sophisticated and make good decisions and don't want or need the system itself to act as a nanny. When we constrain the range of possible choices because somebody might do the wrong thing, sophisticated users are hurt because they can no longer make meaningful choices, and stupid users still get in trouble because that's what being stupid does. The only way to fix that is to help people be less stupid. You can't tell adult users running enterprise-grade software "I'm sorry, Dave, I can't do that". Or at least it can cause a few problems.

> I mean that the right answer for -hackers isn't necessarily the right answer
> for users. Testing? Users don't test. They deploy. Education? If most people
> read the docs, CMD and a host of other companies would be out of business.

I've run into these kinds of situations, but I know for a fact that there are quite a few EnterpriseDB customers who test extremely thoroughly, read the documentation in depth, and really understand the system at a very deep level. I can't say whether the majority of our customers fall into that category, but we certainly spend a lot of time working with the ones who do.

> I am not saying I have the right solution but I am saying I think we need a
> *different* solution. Something that limits a *USERS* choice to turn off
> autovacuum. If -hackers need testing or enterprise developers need testing,
> let's account for that but for the user that says this:
>
> My machine/instance bogs down every time autovacuum runs, oh I can turn it
> off....

And I've seen customers solve real production problems by doing exactly that, and scheduling vacuums manually.
Have I seen people cause bigger problems than the ones they were trying to solve? Yes. Have I recommended something a little less aggressive than completely shutting autovacuum off as perhaps being a better solution? Of course. But I'm not going to sit here and tell somebody "well, you know, what you are doing is working whereas the old thing was not working, but trust me, the way that didn't work was way better...".

> Let's fix *that* problem.

I will say again that I do not think that problem has a technical solution. It is a problem of knowledge, not technology.

-- Robert Haas
On Thu, Oct 20, 2016 at 9:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Oct 20, 2016 at 11:53 AM, Joshua D. Drake <jd@commandprompt.com> wrote:
>> The right answer isn't the answer founded in the reality for many if not
>> most of our users.
>
> I think that's high-handed nonsense. Sure, there are some
> unsophisticated users who do incredibly stupid things and pay the
> price for it, but there are many users who are very sophisticated and
> make good decisions and don't want or need the system itself to act as
> a nanny. When we constrain the range of possible choices because
> somebody might do the wrong thing, sophisticated users are hurt
> because they can no longer make meaningful choices, and stupid users
> still get in trouble because that's what being stupid does. The only
> way to fix that is to help people be less stupid. You can't tell
> adult users running enterprise-grade software "I'm sorry, Dave, I
> can't do that". Or at least it can cause a few problems.

+1

I really don't like this paternalistic mindset.

-- Peter Geoghegan
On 10/20/2016 09:24 AM, Robert Haas wrote:
> On Thu, Oct 20, 2016 at 11:53 AM, Joshua D. Drake <jd@commandprompt.com> wrote:
>> The right answer isn't the answer founded in the reality for many if not
>> most of our users.
>
> I think that's high-handed nonsense. Sure, there are some
> unsophisticated users who do incredibly stupid things and pay the
> price for it, but there are many users who are very sophisticated and
> make good decisions and don't want or need the system itself to act as
> a nanny. When we constrain the range of possible choices because

That argument suggests we shouldn't have autovacuum :P

> somebody might do the wrong thing, sophisticated users are hurt
> because they can no longer make meaningful choices, and stupid users
> still get in trouble because that's what being stupid does. The only
> way to fix that is to help people be less stupid. You can't tell
> adult users running enterprise-grade software "I'm sorry, Dave, I
> can't do that". Or at least it can cause a few problems.

As mentioned in another reply, I am not suggesting we can't turn off autovacuum as a whole. Heck, I even suggested being able to turn it off per database (versus just per table). I am suggesting that we get rid of a foot gun for unsophisticated users (and thus the majority of our users).

>> I mean that the right answer for -hackers isn't necessarily the right answer
>> for users. Testing? Users don't test. They deploy. Education? If most people
>> read the docs, CMD and a host of other companies would be out of business.
>
> I've run into these kinds of situations, but I know for a fact that
> there are quite a few EnterpriseDB customers who test extremely
> thoroughly, read the documentation in depth, and really understand the
> system at a very deep level.

Those aren't exactly the users we are talking about are we? I also run into those users all the time.
>
> And I've seen customers solve real production problems by doing
> exactly that, and scheduling vacuums manually. Have I seen people

1 != 10

> cause bigger problems than the ones they were trying to solve? Yes.
> Have I recommended something a little less aggressive than completely
> shutting autovacuum off as perhaps being a better solution? Of
> course. But I'm not going to sit here and tell somebody "well, you
> know, what you are doing is working whereas the old thing was not
> working, but trust me, the way that didn't work was way better...".

I find it interesting that we are willing to do that every time we add a feature but once we have that feature it is like pulling teeth to show the people that implemented those features that some people don't think it was better :P

Seriously though. I am only speaking from experience from 20 years of customers. CMD also has customers just like yours but we also run into lots and lots of people that still do really silly things like we have already discussed.

Sincerely,
Joshua D. Drake
Hello,

What about a simpler solution to all of this? Let's just remove it from postgresql.conf. Out of sight. If someone needs to test they can, but an uneducated user won't immediately know what to do about that "autovacuum process", and when they look it up the documentation is exceedingly blunt about why to *not* turn it off.

Sincerely,
JD
On Thu, Oct 20, 2016 at 12:32 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
> That argument suggests we shouldn't have autovacuum :P

It certainly does not. That, too, would be removing a useful option. In fact, it would be removing the most useful option that is the right choice for most users in 99% of cases.

> As mentioned in another reply, I am not suggesting we can't turn off
> autovacuum as a whole. Heck, I even suggested being able to turn it off per
> database (versus just per table). I am suggesting that we get rid of a foot
> gun for unsophisticated users (and thus the majority of our users).

It has to be possible to shut it off in postgresql.conf, before starting the server. Anything per-database wouldn't have that characteristic.

>> I've run into these kinds of situations, but I know for a fact that
>> there are quite a few EnterpriseDB customers who test extremely
>> thoroughly, read the documentation in depth, and really understand the
>> system at a very deep level.
>
> Those aren't exactly the users we are talking about are we? I also run into
> those users all the time.

Well, we can't very well ship the autovacuum option only to the smart customers and remove it for the dumb ones, can we? The option either exists or it doesn't.

> 1 != 10

I concede the truth of that statement, but not whatever it is you intend to imply thereby.

> I find it interesting that we are willing to do that every time we add a
> feature but once we have that feature it is like pulling teeth to show the
> people that implemented those features that some people don't think it was
> better :P

Well, we don't allow much dumb stuff to get added in the first place, so there isn't much to take out. I'm not trying to argue that we're perfect here. There are certainly changes I'd like to make that other people oppose and, well, I think they are wrong. And I know I'm wrong about some things, too: I just don't know which things, or I'd change my mind about just those.

> Seriously though.
> I am only speaking from experience from 20 years of
> customers. CMD also has customers just like yours but we also run into lots
> and lots of people that still do really silly things like we have already
> discussed.

I think it would be better to confine this thread to the specific issue of whether removing the autovacuum GUC is a good idea rather than turning it into a referendum on whether you are an experienced PostgreSQL professional.

-- Robert Haas
On 21 Oct. 2016 12:57 am, "Joshua D. Drake" <jd@commandprompt.com> wrote:
>
> Hello,
>
> What about a simpler solution to all of this. Let's just remove it from postgresql.conf. Out of sight. If someone needs to test they can but an uneducated user won't immediately know what to do about that "autovacuum process" and when they look it up the documentation is exceedingly blunt about why to *not* turn it off.

Then they'll just do what I've seen at multiple sites: create a cron job that kills it as soon as it starts. Then their DB performance issues go away ... for a while. By the time they're forced to confront it their DB is immensely bloated and barely staggering along, or has reached wraparound shutdown. So we get the fun job of trying to fix it using special freeze tools etc because they broke the normal ones...

We still have fsync=off available. If you want a user foot gun to crusade against, start there. Even that's useful and legitimate though I wish it were called enable_crash_safety = off. It's legit to use it in testing, in data ingestion where you'll fsync afterward, in cloud deployments where you rely on replication and the whole instance gets nuked if it crashes anyway.

There are similarly legit reasons to turn autovac off but the consequences are less bad.

Personally what I think is needed here is to make monitoring and bloat visibility not completely suck. So we can warn users if tables haven't been vac'd in ages and have recent churn. And so they can easily SELECT a view to get bloat estimates with an estimate of how much drift there could've been since last vacuum.

Users turn off vacuum because they cannot see that it is doing anything except wasting I/O and cpu. So:

* A TL;DR in the docs saying what vac does and why not to turn it off. In particular warning that turning autovac off will make a slow DB get slower even though it seems to help at first.

* A comment in the conf file with the same TL;DR. Comments are free, let's use a few lines.

* Warn on startup when autovac is off?

Personally I wouldn't mind encouraging most users to prefer table or db level autovac controls. Though we really need to make them more visible. If that improved I wouldn't really mind removing the global autovac option from the conf file though I'd prefer to just give it a decent comment.
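The "warn when autovac is off" idea can also be approximated outside the server. The following is a rough Python sketch, not anything PostgreSQL ships: it scans postgresql.conf text for autovacuum = off and for a very long autovacuum_naptime. The function names and the one-hour limit are assumptions made for illustration, and the parser is deliberately simplified (it does not handle a '#' inside a quoted value, include files, or abbreviated boolean spellings).

```python
import re

# Time units accepted in postgresql.conf, converted to seconds.
UNIT_SECONDS = {"us": 1e-6, "ms": 1e-3, "s": 1, "min": 60, "h": 3600, "d": 86400}
OFF_VALUES = {"off", "false", "no", "0"}
NAPTIME_WARN_SECONDS = 3600  # arbitrary illustrative limit, not a PostgreSQL default

SETTING_RE = re.compile(r"^\s*([A-Za-z_][\w.]*)\s*=?\s*(.*?)\s*$")


def parse_conf(text):
    """Return {name: value} for uncommented 'name = value' lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0]  # drop comments (simplified)
        if not line.strip():
            continue
        match = SETTING_RE.match(line)
        if match:
            name, value = match.groups()
            settings[name.lower()] = value.strip("'")
    return settings


def to_seconds(value, default_unit="s"):
    """Convert a value such as '1 d' or '90' to seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*([A-Za-z]*)", value.strip())
    if not match or (match.group(2) or default_unit) not in UNIT_SECONDS:
        raise ValueError(f"unrecognised time value: {value!r}")
    number, unit = match.groups()
    return float(number) * UNIT_SECONDS[unit or default_unit]


def autovacuum_warnings(text):
    """List human-readable warnings about risky autovacuum settings."""
    settings = parse_conf(text)
    warnings = []
    if settings.get("autovacuum", "on").lower() in OFF_VALUES:
        warnings.append("autovacuum is disabled globally")
    naptime = settings.get("autovacuum_naptime")
    if naptime is not None and to_seconds(naptime) > NAPTIME_WARN_SECONDS:
        warnings.append(f"autovacuum_naptime = {naptime} is unusually long")
    return warnings


sample = "autovacuum = off  # it was slow\nautovacuum_naptime = '1 d'\n"
for message in autovacuum_warnings(sample):
    print("WARNING:", message)
```

A check like this could run from a deployment script or monitoring agent, which sidesteps the concern that users who disable autovacuum may not read the server log.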
On 10/20/16 11:50 PM, Craig Ringer wrote:
> Personally what I think is needed here is to make monitoring and bloat
> visibility not completely suck. So we can warn users if tables haven't
> been vac'd in ages and have recent churn. And so they can easily SELECT
> a view to get bloat estimates with an estimate of how much drift there
> could've been since last vacuum.

+10. I've seen people spend a bunch of time screwing around with the 2 major "bloat queries", both of which have some issues. And there is *no* way to actually quantify whether autovac is keeping up with things or not.

> Users turn off vacuum because they cannot see that it is doing anything
> except wasting I/O and cpu. So:
>
> * A TL;DR in the docs saying what vac does and why not to turn it off.
> In particular warning that turning autovac off will make a slow DB get
> slower even though it seems to help at first.

IMHO we should also suggest that users who have periods of lower activity run a manual vacuum during them. That reduces the odds of autovac ruining your day unexpectedly, as well as allowing it to focus on high-velocity tables that need more vacuuming and not on huge tables that just happened to surpass their threshold during a site busy period.

> * A comment in the conf file with the same TL;DR. Comments are free,
> let's use a few lines.
>
> * Warn on startup when autovac is off?

Well, I suspect that someone who thinks autovac=off is a good idea probably doesn't monitor their logs either, but it wouldn't hurt.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461
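Short of a real bloat view, one crude proxy for "is autovac keeping up" is to compare each table's dead-tuple count with the documented trigger point, autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples (defaults 50 and 0.2). The Python sketch below assumes the rows were already fetched from pg_stat_user_tables joined with pg_class into plain dicts; the function names, the seven-day staleness limit, and the sample data are illustrative only. It ignores per-table overrides and is not a bloat estimate.

```python
from datetime import datetime, timedelta


def vacuum_trigger_point(reltuples, base_threshold=50, scale_factor=0.2):
    """Dead-tuple count at which autovacuum would schedule a VACUUM."""
    return base_threshold + scale_factor * reltuples


def tables_falling_behind(rows, now, max_age=timedelta(days=7)):
    """Return names of tables past the trigger point with no recent vacuum."""
    flagged = []
    for row in rows:
        # Consider both manual and automatic vacuums; either may be None.
        vacuums = [t for t in (row["last_vacuum"], row["last_autovacuum"]) if t]
        stale = not vacuums or now - max(vacuums) > max_age
        if row["n_dead_tup"] > vacuum_trigger_point(row["reltuples"]) and stale:
            flagged.append(row["relname"])
    return flagged


# Made-up sample rows, shaped like pg_stat_user_tables plus pg_class.reltuples.
sample_rows = [
    {"relname": "orders", "reltuples": 1000, "n_dead_tup": 900,
     "last_vacuum": None, "last_autovacuum": datetime(2016, 9, 1)},
    {"relname": "users", "reltuples": 1000, "n_dead_tup": 100,
     "last_vacuum": None, "last_autovacuum": None},
    {"relname": "events", "reltuples": 1000, "n_dead_tup": 900,
     "last_vacuum": None, "last_autovacuum": datetime(2016, 10, 20)},
]

print(tables_falling_behind(sample_rows, now=datetime(2016, 10, 21)))
```

Requiring both conditions (past the trigger point and not vacuumed recently) avoids flagging a busy table that autovacuum visited a moment ago.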