Discussion: autovacuum maintenance_work_mem


autovacuum maintenance_work_mem

From: Alvaro Herrera
Date:
Magnus was just talking to me about having a better way of controlling
memory usage on autovacuum.  Instead of each worker using up to
maintenance_work_mem, which ends up as a disaster when DBA A sets to a
large value and DBA B raises autovacuum_max_workers, we could simply
have an "autovacuum_maintenance_memory" setting (name TBD), that defines
the maximum amount of memory that autovacuum is going to use regardless
of the number of workers.

So for the initial implementation, we could just have each worker set
its local maintenance_work_mem to autovacuum_maintenance_memory / max_workers.
That way there's never excessive memory usage.
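A rough sketch of that arithmetic (illustrative Python only; autovacuum_maintenance_memory is the proposed name, not an existing parameter):

```python
# Sketch of the proposed static split: each autovacuum worker gets an
# equal, fixed share of the total budget, so aggregate usage is bounded
# no matter how many workers are running.  Illustration only.
def worker_mem_kb(autovacuum_maintenance_memory_kb, autovacuum_max_workers):
    """Per-worker local maintenance_work_mem under the proposed scheme."""
    return autovacuum_maintenance_memory_kb // autovacuum_max_workers

# Example: a 1 GB total budget split among 3 workers.
print(worker_mem_kb(1024 * 1024, 3))  # 349525 (kB, roughly 341 MB each)
```

The downside, as noted above, is that a lone worker is limited to its 1/max_workers share even when the rest of the budget is idle.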

This implementation is not ideal, because most of the time they wouldn't
use that much memory, and so vacuums could be slower.  But I think it's
better than what we currently have.

Thoughts?

(A future implementation could improve things by using something like
the balancing code we have for cost_delay.  But I don't want to go there
now.)

-- 
Álvaro Herrera <alvherre@alvh.no-ip.org>


Re: autovacuum maintenance_work_mem

From: Itagaki Takahiro
Date:
On Wed, Nov 17, 2010 at 01:12, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> So for the initial implementation, we could just have each worker set
> its local maintenance_work_mem to autovacuum_maintenance_memory / max_workers.
> That way there's never excessive memory usage.

It sounds reasonable, but is there the same issue for normal connections?
We can limit max connections per user, but there is no quota for the total
memory consumed by a user. It might not be an autovacuum-specific issue.

-- 
Itagaki Takahiro


Re: autovacuum maintenance_work_mem

From: Tom Lane
Date:
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:
> On Wed, Nov 17, 2010 at 01:12, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>> So for the initial implementation, we could just have each worker set
>> its local maintenance_work_mem to autovacuum_maintenance_memory / max_workers.
>> That way there's never excessive memory usage.

> It sounds reasonable, but is there the same issue for normal connections?
> We can limit max connections per user, but there is no quota for the total
> memory consumed by a user. It might not be an autovacuum-specific issue.

I agree with Itagaki-san: this isn't really autovacuum's fault.

Another objection to the above approach is that anytime you have fewer
than max_workers AV workers, you're not using the memory well.  And not
using the memory well has a *direct* cost in terms of increased effort,
ie, extra indexscans.  So this isn't something to mess with lightly.

I can see the possible value of decoupling autovacuum's setting from
foreground operations, though.  What about creating
autovacuum_maintenance_mem but defining it as being the
maintenance_work_mem setting that each AV worker can use?  If people
can't figure out that the total possible hit is maintenance_work_mem
times max_workers, their license to use a text editor should be revoked.
        regards, tom lane


Re: autovacuum maintenance_work_mem

From: Heikki Linnakangas
Date:
On 16.11.2010 18:12, Alvaro Herrera wrote:
> Thoughts?

Sounds reasonable, but you know what would be even better? Use less
memory in vacuum, so that it doesn't become an issue to begin with.
There was some discussion on that back in 2007
(http://archives.postgresql.org/pgsql-hackers/2007-02/msg01814.php).
That seems like low-hanging fruit; it should be simple to switch to a
more compact representation. I believe you could easily more than halve
the memory consumption in typical scenarios.
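For illustration, a back-of-the-envelope comparison of a flat array of 6-byte tuple pointers against a per-page bitmap (a sketch of the kind of compact representation meant here; the constants and layout are assumptions, not PostgreSQL code):

```python
# Compare two ways of remembering dead tuples during vacuum.  Numbers are
# rough assumptions for 8 kB heap pages; this is not PostgreSQL's code.
ITEMPOINTER_BYTES = 6        # block number (4 bytes) + offset number (2 bytes)
MAX_OFFSETS_PER_PAGE = 291   # assumed upper bound on heap tuples per page

def array_bytes(dead_per_page, npages):
    """Flat array: one 6-byte tuple pointer per dead tuple."""
    return dead_per_page * npages * ITEMPOINTER_BYTES

def bitmap_bytes(npages):
    """Per-page entry: 4-byte block number plus a fixed offset bitmap."""
    return npages * (4 + (MAX_OFFSETS_PER_PAGE + 7) // 8)

# With 20 dead tuples per page across 100,000 pages, the array needs
# ~12 MB while the bitmap needs ~4.1 MB -- well under half.
print(array_bytes(20, 100_000), bitmap_bytes(100_000))  # 12000000 4100000
```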

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com


Re: autovacuum maintenance_work_mem

From: Robert Haas
Date:
On Tue, Nov 16, 2010 at 11:12 AM, Alvaro Herrera
<alvherre@alvh.no-ip.org> wrote:
> Magnus was just talking to me about having a better way of controlling
> memory usage on autovacuum.  Instead of each worker using up to
> maintenance_work_mem, which ends up as a disaster when DBA A sets to a
> large value and DBA B raises autovacuum_max_workers, we could simply
> have an "autovacuum_maintenance_memory" setting (name TBD), that defines
> the maximum amount of memory that autovacuum is going to use regardless
> of the number of workers.
>
> So for the initial implementation, we could just have each worker set
> its local maintenance_work_mem to autovacuum_maintenance_memory / max_workers.
> That way there's never excessive memory usage.
>
> This implementation is not ideal, because most of the time they wouldn't
> use that much memory, and so vacuums could be slower.  But I think it's
> better than what we currently have.
>
> Thoughts?

I'm a little skeptical about creating more memory tunables.  DBAs who
are used to previous versions of PG will find that their vacuum is now
really slow, because they adjusted maintenance_work_mem but not this
new parameter.  If we could divide up the vacuum memory intelligently
between the workers in some way, that would be a win.  But just
creating a different variable that controls the same thing in
different units doesn't seem to add much.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: autovacuum maintenance_work_mem

From: Josh Berkus
Date:
On 11/16/10 9:27 AM, Robert Haas wrote:
> I'm a little skeptical about creating more memory tunables.  DBAs who
> are used to previous versions of PG will find that their vacuum is now
> really slow, because they adjusted maintenance_work_mem but not this

Also, generally people who are using autovacuum don't do much manual
vacuuming, and when they do, it's easy enough to do a SET before you
issue the VACUUM statement.

So, -1 for yet another GUC.

> new parameter.  If we could divide up the vacuum memory intelligently
> between the workers in some way, that would be a win.  But just
> creating a different variable that controls the same thing in
> different units doesn't seem to add much.

Actually, that's not unreasonable.  The difficulty with allocating
work_mem out of a pool involves concurrency, but use of maint_work_mem
is very low-concurrency; it wouldn't be that challenging to have the
autovac workers pull from a pool of preset size instead of each being
allocated the full maint_work_mem.  And that would help with over/under
allocation of memory.
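A minimal sketch of such a shared pool (hypothetical Python illustration; PostgreSQL has no such pool, and the class and numbers are invented for this example):

```python
# Hypothetical fixed-size pool that autovacuum workers draw from at vacuum
# start, instead of each assuming a full maintenance_work_mem.
class VacuumMemPool:
    def __init__(self, total_kb):
        self.total_kb = total_kb
        self.used_kb = 0

    def acquire(self, want_kb):
        """Grant as much of the request as the pool can still cover."""
        grant = min(want_kb, self.total_kb - self.used_kb)
        self.used_kb += grant
        return grant

    def release(self, kb):
        """Return a worker's grant when its vacuum finishes."""
        self.used_kb -= kb

pool = VacuumMemPool(1024 * 1024)   # 1 GB shared budget
w1 = pool.acquire(768 * 1024)       # first worker gets its full request
w2 = pool.acquire(768 * 1024)       # second worker only gets what's left
print(w1, w2)  # 786432 262144
```

With only one worker active, nothing is wasted; the hard part, as discussed below, is what happens when a later worker arrives after an earlier one has already claimed most of the pool.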

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


Re: autovacuum maintenance_work_mem

From: Robert Haas
Date:
On Tue, Nov 16, 2010 at 1:36 PM, Josh Berkus <josh@agliodbs.com> wrote:
> On 11/16/10 9:27 AM, Robert Haas wrote:
>> I'm a little skeptical about creating more memory tunables.  DBAs who
>> are used to previous versions of PG will find that their vacuum is now
>> really slow, because they adjusted maintenance_work_mem but not this
>
> Also, generally people who are using autovacuum don't do much manual
> vacuuming, and when they do, it's easy enough to do a SET before you
> issue the VACUUM statement.
>
> So, -1 for yet another GUC.
>
>> new parameter.  If we could divide up the vacuum memory intelligently
>> between the workers in some way, that would be a win.  But just
>> creating a different variable that controls the same thing in
>> different units doesn't seem to add much.
>
> Actually, that's not unreasonable.  The difficulty with allocating
> work_mem out of a pool involves concurrency, but use of maint_work_mem
> is very low-concurrency; it wouldn't be that challenging to have the
> autovac workers pull from a pool of preset size instead of each being
> allocated the full maint_work_mem.  And that would help with over/under
> allocation of memory.

I think the difficulty is figuring out how to get the existing
workers to give us some memory when a new one comes along.  You want
the first worker to potentially use ALL the memory... until worker #2
arrives.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: autovacuum maintenance_work_mem

From: "Joshua D. Drake"
Date:
On Tue, 2010-11-16 at 10:36 -0800, Josh Berkus wrote:
> On 11/16/10 9:27 AM, Robert Haas wrote:
> > I'm a little skeptical about creating more memory tunables.  DBAs who
> > are used to previous versions of PG will find that their vacuum is now
> > really slow, because they adjusted maintenance_work_mem but not this
> 
> Also, generally people who are using autovacuum don't do much manual
> vacuuming, and when they do, it's easy enough to do a SET before you
> issue the VACUUM statement.
> 
> So, -1 for yet another GUC.

Agreed. If we are going to do anything, it should be pushed to a
per-object level (at least per table) with ALTER TABLE. We don't need
yet another global variable.

JD

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt



Re: autovacuum maintenance_work_mem

From: Josh Berkus
Date:
> I think the difficulty is figuring out how to get the existing
> workers to give us some memory when a new one comes along.  You want
> the first worker to potentially use ALL the memory... until worker #2
> arrives.

Yeah, doing this would mean that you couldn't give worker #1 all the
memory, because on most OSes it can't release the memory even if it
wants to.

The unintelligent way to do this is to do it by halves, which is what I
think Oracle 8 did for sort memory, and what the old time-sharing
systems used to do. That is, the first worker can use up to 50%. The
second worker gets up to 25%, or whatever is left over from the first
worker + 25%, whichever is greater, up to 50%. The third worker gets a
similar share, and so on up to max_workers. This kind of system actually
works pretty well, *except* that it causes underallocation in the case
(the most common one, I might add) where only one worker is needed.
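One simplified reading of that policy, sketched in Python (illustrative only; this takes "by halves" as "half of what remains, capped at half the pool"):

```python
# Sketch of the "by halves" policy: each arriving worker may claim up to
# half of whatever remains in the pool, never more than 50% of the total.
def grant_by_halves(pool_total, already_granted):
    remaining = pool_total - already_granted
    return min(remaining // 2, pool_total // 2)

# With a 1000 MB pool, three workers arriving in sequence get 500, 250,
# and 125 MB: the total stays bounded, but a lone worker only ever sees
# half the pool -- the underallocation problem described above.
grants, used = [], 0
for _ in range(3):
    g = grant_by_halves(1000, used)
    grants.append(g)
    used += g
print(grants)  # [500, 250, 125]
```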

The intelligent way would be for Postgres to track the maximum
concurrent workers which have been needed in the past, and allocate
(again, by halves) based on that number.

Either of these systems would require some new registers in shared
memory to track such things.

Relevant to this is the question: *when* does vacuum do its memory
allocation?  Is memory allocation reasonably front-loaded, or does
vacuum keep grabbing more RAM until it's done?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


Re: autovacuum maintenance_work_mem

From: Alvaro Herrera
Date:
Excerpts from Josh Berkus's message of Tue Nov 16 15:52:14 -0300 2010:
> 
> > I think the difficulty is figuring out how to get the existing
> > workers to give us some memory when a new one comes along.  You want
> > the first worker to potentially use ALL the memory... until worker #2
> > arrives.
> 
> Yeah, doing this would mean that you couldn't give worker #1 all the
> memory, because on most OSes it can't release the memory even if it
> wants to.

Hmm, good point.

> Relevant to this is the question: *when* does vacuum do its memory
> allocation?  Is memory allocation reasonably front-loaded, or does
> vacuum keep grabbing more RAM until it's done?

All at start.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: autovacuum maintenance_work_mem

From: Josh Berkus
Date:
>> Relevant to this is the question: *when* does vacuum do its memory
>> allocation?  Is memory allocation reasonably front-loaded, or does
>> vacuum keep grabbing more RAM until it's done?
> 
> All at start.

That means that "allocation by halves" would work fine.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


Re: autovacuum maintenance_work_mem

From: Bruce Momjian
Date:
Josh Berkus wrote:
> 
> > I think the difficulty is figuring out how to get the existing
> > workers to give us some memory when a new one comes along.  You want
> > the first worker to potentially use ALL the memory... until worker #2
> > arrives.
> 
> Yeah, doing this would mean that you couldn't give worker #1 all the
> memory, because on most OSes it can't release the memory even if it
> wants to.

FYI, what normally happens in this case is that the memory is pushed to
swap by the kernel and never paged back in from swap.

-- 
Bruce Momjian  <bruce@momjian.us>        http://momjian.us
EnterpriseDB                             http://enterprisedb.com

+ It's impossible for everything to be true. +