Discussion: Remove autovacuum GUC?

Remove autovacuum GUC?

From
"Joshua D. Drake"
Date:
Hello,

After all these years, we are still regularly running into people who 
say, "performance was bad so we disabled autovacuum". I am not talking 
about once in a while, it is often. I would like us to consider removing 
the autovacuum option. Here are a few reasons:

1. It does not hurt anyone
2. It removes a foot gun
3. Autovacuum is *not* optional, we shouldn't let it be
4. People could still disable it at the table level for those tables 
that do fall into the small window of, no maintenance is o.k.
5. People would still have the ability to decrease the max_workers to 1 
(although I could argue about that too).

Sincerely,

JD

-- 
Command Prompt, Inc.                  http://the.postgres.company/                        +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.



Re: Remove autovacuum GUC?

From
Robert Haas
Date:
On Wed, Oct 19, 2016 at 9:24 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
> After all these years, we are still regularly running into people who say,
> "performance was bad so we disabled autovacuum". I am not talking about once
> in a while, it is often. I would like us to consider removing the autovacuum
> option. Here are a few reasons:
>
> 1. It does not hurt anyone
> 2. It removes a foot gun
> 3. Autovacuum is *not* optional, we shouldn't let it be
> 4. People could still disable it at the table level for those tables that do
> fall into the small window of, no maintenance is o.k.
> 5. People would still have the ability to decrease the max_workers to 1
> (although I could argue about that too).

Setting autovacuum=off is at least useful for testing purposes and
I've used it that way.  On the other hand, I haven't seen a customer
disable this unintentionally in years.  Generally, the customers I've
worked with have found subtler ways of hosing themselves with
autovacuum.  One of my personal favorites is autovacuum_naptime='1 d'
-- for the record, that did indeed work out very poorly.
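The kind of misconfiguration described above can be caught mechanically. The following is a minimal sketch, not an existing PostgreSQL tool: it scans the text of a postgresql.conf for a globally disabled autovacuum or an unusually long autovacuum_naptime. The function names and the 600-second limit are assumptions made for illustration, and the parser only handles simple `name = value` lines.

```python
import re

# Multipliers to seconds for the time units PostgreSQL accepts in settings.
_UNIT_SECONDS = {"us": 1e-6, "ms": 1e-3, "s": 1, "min": 60, "h": 3600, "d": 86400}


def parse_seconds(value, default_unit="s"):
    """Convert a setting such as '1 d', 1min or 60 to seconds."""
    text = value.strip().strip("'\"").lower()
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-z]*)\s*", text)
    if match is None:
        raise ValueError(f"cannot parse time value: {value!r}")
    number, unit = match.groups()
    unit = unit or default_unit
    if unit not in _UNIT_SECONDS:
        raise ValueError(f"unknown time unit in {value!r}")
    return float(number) * _UNIT_SECONDS[unit]


def read_settings(conf_text):
    """Return {name: value} for uncommented 'name = value' lines.

    Simplification: a '#' inside a quoted value is treated as a comment.
    """
    settings = {}
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()
        name, _, value = line.partition("=")
        if name.strip() and value.strip():
            settings[name.strip().lower()] = value.strip()
    return settings


def autovacuum_warnings(conf_text, max_naptime_s=600):
    """List human-readable warnings; max_naptime_s is an arbitrary limit."""
    settings = read_settings(conf_text)
    warnings = []
    enabled = settings.get("autovacuum", "on").strip("'\"").lower()
    if enabled in {"off", "false", "no", "0"}:
        warnings.append("autovacuum is disabled globally")
    naptime = settings.get("autovacuum_naptime")
    if naptime is not None and parse_seconds(naptime) > max_naptime_s:
        warnings.append(f"autovacuum_naptime = {naptime} exceeds {max_naptime_s} s")
    return warnings
```

Calling `autovacuum_warnings(open(path).read())` on a configuration file returns an empty list when neither condition applies. A check like this only reports; it does not stop anyone from choosing those settings.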

I think that this is the kind of problem that can only properly be solved
by education.  If somebody thinks that they want to turn off
autovacuum, and you keep them from turning it off, they just get
frustrated.  Sometimes, they then find a back-door way of getting what
they want, like setting up a script to kill it whenever it starts, or
changing the other thresholds so that it barely ever runs.  But
whether they resort to such measures or not, there is no real chance
that they will be happy with PostgreSQL.  And why should they be?  It
doesn't let them configure the settings that they want to configure.
When some other program doesn't let me do what I want, I decide it's
stupid.  Pretty much the same thing here.  The only way you actually
get out from under this problem is by teaching people the right way to
think about the settings they're busy misconfiguring.

I'd actually rather go the other way with this and add a new
autovacuum setting, autovacuum=really_off, that doesn't let autovacuum
run even for wraparound.  For example, let's say I've just recovered a
badly damaged cluster using pg_resetxlog.  I want to start it up and
try to recover my data.  I do *not* want VACUUM to decide to start
removing things that I'm trying to recover.  But there's no way to
guarantee that today.  So, you can't start up the cluster, look
around, and then shut it down with the intent to change the next
transaction ID if it's not right.  Your data will be disappearing
underneath you.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Remove autovacuum GUC?

From
"Joshua D. Drake"
Date:
On 10/20/2016 07:12 AM, Robert Haas wrote:
> On Wed, Oct 19, 2016 at 9:24 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
>
> Setting autovacuum=off is at least useful for testing purposes and
> I've used it that way.  On the other hand, I haven't seen a customer
> disable this unintentionally in years.  Generally, the customers I've
> worked with have found subtler ways of hosing themselves with
> autovacuum.  One of my personal favorites is autovacuum_naptime='1 d'
> -- for the record, that did indeed work out very poorly.

Yes, I have seen that as well and you are right, it ends poorly.

>
> I think that this is the kind of problem that can only properly be solved
> by education.  If somebody thinks that they want to turn off
> autovacuum, and you keep them from turning it off, they just get
> frustrated.  Sometimes, they then find a back-door way of getting what

I think I am coming at this from a different perspective than the 
-hackers. Let me put this another way.

The right answer isn't the answer founded in the reality for many if not 
most of our users.

What do I mean by that?

I mean that the right answer for -hackers isn't necessarily the right 
answer for users. Testing? Users don't test. They deploy. Education? If 
most people read the docs, CMD and a host of other companies would be 
out of business.

I am not saying I have the right solution but I am saying I think we 
need a *different* solution. Something that limits a *USERS* choice to 
turn off autovacuum. If -hackers need testing or enterprise developers 
need testing, let's account for that but for the user that says this:

My machine/instance bogs down every time autovacuum runs, oh I can turn 
it off....

Let's fix *that* problem.

Sincerely,

JD






Re: Remove autovacuum GUC?

From
Robert Haas
Date:
On Thu, Oct 20, 2016 at 11:53 AM, Joshua D. Drake <jd@commandprompt.com> wrote:
> The right answer isn't the answer founded in the reality for many if not
> most of our users.

I think that's high-handed nonsense.  Sure, there are some
unsophisticated users who do incredibly stupid things and pay the
price for it, but there are many users who are very sophisticated and
make good decisions and don't want or need the system itself to act as
a nanny.  When we constrain the range of possible choices because
somebody might do the wrong thing, sophisticated users are hurt
because they can no longer make meaningful choices, and stupid users
still get in trouble because that's what being stupid does.  The only
way to fix that is to help people be less stupid.  You can't tell
adult users running enterprise-grade software "I'm sorry, Dave, I
can't do that".  Or at least it can cause a few problems.

> I mean that the right answer for -hackers isn't necessarily the right answer
> for users. Testing? Users don't test. They deploy. Education? If most people
> read the docs, CMD and a host of other companies would be out of business.

I've run into these kinds of situations, but I know for a fact that
there are quite a few EnterpriseDB customers who test extremely
thoroughly, read the documentation in depth, and really understand the
system at a very deep level.   I can't say whether the majority of our
customers fall into that category, but we certainly spend a lot of
time working with the ones who do.

> I am not saying I have the right solution but I am saying I think we need a
> *different* solution. Something that limits a *USERS* choice to turn off
> autovacuum. If -hackers need testing or enterprise developers need testing,
> let's account for that but for the user that says this:
>
> My machine/instance bogs down every time autovacuum runs, oh I can turn it
> off....

And I've seen customers solve real production problems by doing
exactly that, and scheduling vacuums manually.  Have I seen people
cause bigger problems than the ones they were trying to solve?  Yes.
Have I recommended something a little less aggressive than completely
shutting autovacuum off as perhaps being a better solution?  Of
course.  But I'm not going to sit here and tell somebody "well, you
know, what you are doing is working whereas the old thing was not
working, but trust me, the way that didn't work was way better...".

> Let's fix *that* problem.

I will say again that I do not think that problem has a technical
solution.  It is a problem of knowledge, not technology.

-- 
Robert Haas



Re: Remove autovacuum GUC?

From
Peter Geoghegan
Date:
On Thu, Oct 20, 2016 at 9:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Oct 20, 2016 at 11:53 AM, Joshua D. Drake <jd@commandprompt.com> wrote:
>> The right answer isn't the answer founded in the reality for many if not
>> most of our users.
>
> I think that's high-handed nonsense.  Sure, there are some
> unsophisticated users who do incredibly stupid things and pay the
> price for it, but there are many users who are very sophisticated and
> make good decisions and don't want or need the system itself to act as
> a nanny.  When we constrain the range of possible choices because
> somebody might do the wrong thing, sophisticated users are hurt
> because they can no longer make meaningful choices, and stupid users
> still get in trouble because that's what being stupid does.  The only
> way to fix that is to help people be less stupid.  You can't tell
> adult users running enterprise-grade software "I'm sorry, Dave, I
> can't do that".  Or at least it can cause a few problems.

+1

I really don't like this paternalistic mindset.

-- 
Peter Geoghegan



Re: Remove autovacuum GUC?

From
"Joshua D. Drake"
Date:
On 10/20/2016 09:24 AM, Robert Haas wrote:
> On Thu, Oct 20, 2016 at 11:53 AM, Joshua D. Drake <jd@commandprompt.com> wrote:
>> The right answer isn't the answer founded in the reality for many if not
>> most of our users.
>
> I think that's high-handed nonsense.  Sure, there are some
> unsophisticated users who do incredibly stupid things and pay the
> price for it, but there are many users who are very sophisticated and
> make good decisions and don't want or need the system itself to act as
> a nanny.  When we constrain the range of possible choices because

That argument suggests we shouldn't have autovacuum :P

> somebody might do the wrong thing, sophisticated users are hurt
> because they can no longer make meaningful choices, and stupid users
> still get in trouble because that's what being stupid does.  The only
> way to fix that is to help people be less stupid.  You can't tell
> adult users running enterprise-grade software "I'm sorry, Dave, I
> can't do that".  Or at least it can cause a few problems.

As mentioned in another reply, I am not suggesting we can't turn off 
autovacuum as a whole. Heck, I even suggested being able to turn it off 
per database (versus just per table). I am suggesting that we get rid of 
a foot gun for unsophisticated users (and thus the majority of our users).

>
>> I mean that the right answer for -hackers isn't necessarily the right answer
>> for users. Testing? Users don't test. They deploy. Education? If most people
>> read the docs, CMD and a host of other companies would be out of business.
>
> I've run into these kinds of situations, but I know for a fact that
> there are quite a few EnterpriseDB customers who test extremely
> thoroughly, read the documentation in depth, and really understand the
> system at a very deep level.

Those aren't exactly the users we are talking about are we? I also run 
into those users all the time.

>
> And I've seen customers solve real production problems by doing
> exactly that, and scheduling vacuums manually.  Have I seen people

1 != 10

> cause bigger problems than the ones they were trying to solve?  Yes.
> Have I recommended something a little less aggressive than completely
> shutting autovacuum off as perhaps being a better solution?  Of
> course.  But I'm not going to sit here and tell somebody "well, you
> know, what you are doing is working whereas the old thing was not
> working, but trust me, the way that didn't work was way better...".
>

I find it interesting that we are willing to do that every time we add a 
feature but once we have that feature it is like pulling teeth to show 
the people that implemented those features that some people don't think 
it was better :P

Seriously though. I am only speaking from experience from 20 years of 
customers. CMD also has customers just like yours but we also run into 
lots and lots of people that still do really silly things like we have 
already discussed.

Sincerely,

Joshua D. Drake





Re: Remove autovacuum GUC?

From
"Joshua D. Drake"
Date:
Hello,

What about a simpler solution to all of this? Let's just remove it from 
postgresql.conf. Out of sight. If someone needs to test they can, but an 
uneducated user won't immediately know what to do about that "autovacuum 
process" and when they look it up the documentation is exceedingly blunt 
about why to *not* turn it off.

Sincerely,

JD



Re: Remove autovacuum GUC?

From
Robert Haas
Date:
On Thu, Oct 20, 2016 at 12:32 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
> That argument suggests we shouldn't have autovacuum :P

It certainly does not.  That, too, would be removing a useful option.
In fact, it would be removing the most useful option that is the right
choice for most users in 99% of cases.

> As mentioned in another reply, I am not suggesting we can't turn off
> autovacuum as a whole. Heck, I even suggested being able to turn it off per
> database (versus just per table). I am suggesting that we get rid of a foot
> gun for unsophisticated users (and thus the majority of our users).

It has to be possible to shut it off in postgresql.conf, before
starting the server.  Anything per-database wouldn't have that
characteristic.

>> I've run into these kinds of situations, but I know for a fact that
>> there are quite a few EnterpriseDB customers who test extremely
>> thoroughly, read the documentation in depth, and really understand the
>> system at a very deep level.
>
> Those aren't exactly the users we are talking about are we? I also run into
> those users all the time.

Well, we can't very well ship the autovacuum option only to the smart
customers and remove it for the dumb ones, can we?  The option either
exists or it doesn't.

> 1 != 10

I concede the truth of that statement, but not whatever it is you
intend to imply thereby.

> I find it interesting that we are willing to do that every time we add a
> feature but once we have that feature it is like pulling teeth to show the
> people that implemented those features that some people don't think it was
> better :P

Well, we don't allow much dumb stuff to get added in the first place,
so there isn't much to take out.

I'm not trying to argue that we're perfect here.  There are certainly
changes I'd like to make that other people oppose and, well, I think
they are wrong.  And I know I'm wrong about some things, too: I just
don't know which things, or I'd change my mind about just those.

> Seriously though. I am only speaking from experience from 20 years of
> customers. CMD also has customers just like yours but we also run into lots
> and lots of people that still do really silly things like we have already
> discussed.

I think it would be better to confine this thread to the specific
issue of whether removing the autovacuum GUC is a good idea rather
than turning it into a referendum on whether you are an experienced
PostgreSQL professional.

-- 
Robert Haas



Re: Remove autovacuum GUC?

From
Craig Ringer
Date:
On 21 Oct. 2016 12:57 am, "Joshua D. Drake" <jd@commandprompt.com> wrote:
>
> Hello,
>
> What about a simpler solution to all of this. Let's just remove it from
> postgresql.conf. Out of sight. If someone needs to test they can but an
> uneducated user won't immediately know what to do about that "autovacuum
> process" and when they look it up the documentation is exceedingly blunt
> about why to *not* turn it off.

Then they'll just do what I've seen at multiple sites: create a cron job
that kills it as soon as it starts. Then their DB performance issues go
away ... for a while. By the time they're forced to confront it their DB
is immensely bloated and barely staggering along, or has reached
wraparound shutdown. So we get the fun job of trying to fix it using
special freeze tools etc because they broke the normal ones...

We still have fsync=off available. If you want a user foot gun to crusade
against, start there. Even that's useful and legitimate though I wish it
were called enable_crash_safety=off. It's legit to use it in testing, in
data ingestion where you'll fsync afterward, in cloud deployments where
you rely on replication and the whole instance gets nuked if it crashes
anyway.

There are similarly legit reasons to turn autovac off but the
consequences are less bad.

Personally what I think is needed here is to make monitoring and bloat
visibility not completely suck. So we can warn users if tables haven't
been vac'd in ages and have recent churn. And so they can easily SELECT
a view to get bloat estimates with an estimate of how much drift there
could've been since last vacuum.

Users turn off vacuum because they cannot see that it is doing anything
except wasting I/O and cpu. So:

* A TL;DR in the docs saying what vac does and why not to turn it off.
  In particular warning that turning autovac off will make a slow DB get
  slower even though it seems to help at first.

* A comment in the conf file with the same TL;DR. Comments are free,
  let's use a few lines.

* Warn on startup when autovac is off?

Personally I wouldn't mind encouraging most users to prefer table or db
level autovac controls. Though we really need to make them more visible.
If that improved I wouldn't really mind removing the global autovac
option from the conf file though I'd prefer to just give it a decent
comment.



Re: Remove autovacuum GUC?

From
Jim Nasby
Date:
On 10/20/16 11:50 PM, Craig Ringer wrote:
> Personally what I think is needed here is to make monitoring and bloat
> visibility not completely suck. So we can warn users if tables haven't
> been vac'd in ages and have recent churn. And so they can easily SELECT
> a view to get bloat estimates with an estimate of how much drift there
> could've been since last vacuum.

+10. I've seen people spend a bunch of time screwing around with the 2 
major "bloat queries", both of which have some issues. And there is *no* 
way to actually quantify whether autovac is keeping up with things or not.
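As a rough illustration of what such a check could look like, here is a sketch that compares each table's dead-tuple count with the point at which autovacuum would normally trigger (autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * row count, with the stock defaults of 50 and 0.2). It is a heuristic, not a real measure of bloat: the input is assumed to come from `pg_stat_user_tables` (`relname`, `n_live_tup`, `n_dead_tup`), `n_live_tup` stands in for the planner's row estimate, and the `TableStats` type, function names and the 2.0 ratio limit are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    relname: str
    n_live_tup: int  # used here as an approximation of the table's row count
    n_dead_tup: int


def vacuum_trigger_point(n_live_tup, base_threshold=50, scale_factor=0.2):
    """Dead-tuple count at which autovacuum would consider the table."""
    return base_threshold + scale_factor * n_live_tup


def backlog_ratio(stats):
    """Dead tuples relative to the trigger point; above 1.0 means overdue."""
    return stats.n_dead_tup / vacuum_trigger_point(stats.n_live_tup)


def tables_falling_behind(all_stats, ratio_limit=2.0):
    """Return (relname, ratio) pairs at or above ratio_limit, worst first."""
    flagged = [
        (s.relname, round(backlog_ratio(s), 2))
        for s in all_stats
        if backlog_ratio(s) >= ratio_limit
    ]
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

A table that stays well above a ratio of 1.0 across several samples suggests autovacuum is not getting to it; a single sample says little, since the counters are estimates and per-table settings can change the trigger point.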

> Users turn off vacuum because they cannot see that it is doing anything
> except wasting I/O and cpu. So:
>
> * A TL;DR in the docs saying what vac does and why not to turn it off.
> In particular warning that turning autovac off will make a slow DB get
> slower even though it seems to help at first.

IMHO we should also suggest that users who have periods of lower 
activity run a manual vacuum during them. That reduces the odds of autovac 
ruining your day unexpectedly, as well as allowing it to focus on 
high-velocity tables that need more vacuuming and not on huge tables 
that just happened to surpass their threshold during a busy period.
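One way to act on that suggestion is a small job that runs during the quiet period and vacuums tables that are already partway to their autovacuum trigger point. The sketch below only builds the statements; how they are executed (psql, a driver, cron) is left open. The 01:00 to 05:00 window, the 0.5 fraction and all function names are assumptions for illustration, and the trigger formula uses the stock defaults (threshold 50, scale factor 0.2).

```python
from datetime import time


def in_quiet_window(now, start=time(1, 0), end=time(5, 0)):
    """True if 'now' (a datetime.time) falls inside the low-activity window."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end  # window wraps past midnight


def quote_ident(name):
    """Double-quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def offpeak_vacuum_statements(tables, fraction=0.5,
                              base_threshold=50, scale_factor=0.2):
    """Build VACUUM statements for tables nearing their trigger point.

    tables: iterable of (schema, table, n_live_tup, n_dead_tup) tuples.
    """
    statements = []
    for schema, table, live, dead in tables:
        trigger = base_threshold + scale_factor * live
        if dead >= fraction * trigger:
            statements.append(
                f"VACUUM (ANALYZE) {quote_ident(schema)}.{quote_ident(table)};"
            )
    return statements
```

Vacuuming a large table at half its trigger point during the night makes it less likely that autovacuum picks that same table up in the middle of the next busy period.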

> * A comment in the conf file with the same TL;DR. Comments are free,
> let's use a few lines.
>
> * Warn on startup when autovac is off?

Well, I suspect that someone who thinks autovac=off is a good idea 
probably doesn't monitor their logs either, but it wouldn't hurt.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461