Обсуждение: Re: [PATCHES] smartvacuum() instead of autovacuum

Поиск
Список
Период
Сортировка

Re: [PATCHES] smartvacuum() instead of autovacuum

От
Peter Eisentraut
Дата:
Hitoshi Harada wrote:
> I am trying to implement smartvacuum(), which do vacuum only tables
> having many dead rows, instead of autovacuum.

How is this different from what autovacuum does?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: [PATCHES] smartvacuum() instead of autovacuum

От
"Hitoshi Harada"
Дата:
Hi, Peter, 

> How is this different from what autovacuum does?

My application needs to do vacuum by itself, while
autovacuum does it as daemon.
The database is updated so frequently that 
normal vacuum costs too much and tables to be updated are
not so many as the whole database is vacuumed.
I want to use autovacuum except the feature of daemon, 
but want to control when to vacuum and which table to vacuum.
So, nothing is different between autovacuum and smartvacuum(),
but former is daemon and later is user function.

c.f. I ran autovacuum before, and my batch script did vacuum while 
autovacuum did one as well on the other session at the same time. 
I found the vacuum analyze conflicts each other sometime... 
so I want to control vacuum my self.
http://archives.postgresql.org/pgsql-bugs/2002-12/msg00198.php
http://archives.postgresql.org/pgsql-general/2004-05/msg00015.php

Regards, 


Hitoshi Harada

> -----Original Message-----
> From: Peter Eisentraut [mailto:peter_e@gmx.net]
> Sent: Sunday, October 22, 2006 10:08 PM
> To: Hitoshi Harada
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [PATCHES] smartvacuum() instead of autovacuum
> 
> Hitoshi Harada wrote:
> > I am trying to implement smartvacuum(), which do vacuum only tables
> > having many dead rows, instead of autovacuum.
> 
> How is this different from what autovacuum does?
> 
> --
> Peter Eisentraut
> http://developer.postgresql.org/~petere/



Re: [PATCHES] smartvacuum() instead of autovacuum

От
Tom Lane
Дата:
"Hitoshi Harada" <hitoshi_harada@forcia.com> writes:
>> How is this different from what autovacuum does?

> My application needs to do vacuum by itself, while
> autovacuum does it as daemon.
> The database is updated so frequently that 
> normal vacuum costs too much and tables to be updated are
> not so many as the whole database is vacuumed.
> I want to use autovacuum except the feature of daemon, 
> but want to control when to vacuum and which table to vacuum.
> So, nothing is different between autovacuum and smartvacuum(),
> but former is daemon and later is user function.

This seems completely unconvincing.  What are you going to do that
couldn't be done by autovacuum?
        regards, tom lane


Re: [PATCHES] smartvacuum() instead of autovacuum

От
"Hitoshi Harada"
Дата:
Ok, 

But my point is, autovacuum may corrupt with vacuum analyze command
on another session. My intention of smartvacuum() is based on this.
Any solution for this??

Regards, 


Hitoshi Harada

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Tom Lane
> Sent: Monday, October 23, 2006 11:10 AM
> To: Hitoshi Harada
> Cc: 'Peter Eisentraut'; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum
> 
> "Hitoshi Harada" <hitoshi_harada@forcia.com> writes:
> >> How is this different from what autovacuum does?
> 
> > My application needs to do vacuum by itself, while
> > autovacuum does it as daemon.
> > The database is updated so frequently that
> > normal vacuum costs too much and tables to be updated are
> > not so many as the whole database is vacuumed.
> > I want to use autovacuum except the feature of daemon,
> > but want to control when to vacuum and which table to vacuum.
> > So, nothing is different between autovacuum and smartvacuum(),
> > but former is daemon and later is user function.
> 
> This seems completely unconvincing.  What are you going to do that
> couldn't be done by autovacuum?
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings



Re: [PATCHES] smartvacuum() instead of autovacuum

От
"Jim C. Nasby"
Дата:
If the decision to vacuum based on autovacuum criteria is good enough
for you then I think you should just focus on getting autovac to do what
you want it to do. Perhaps you just need to decrease the sleep time to a
few seconds, so that autovac will quickly detect when something needs to
be vacuumed.

The only case I can think of where autovac might not work as well as
smartvacuum would be if you had a lot of databases in the cluster, since
autovacuum will only vacuum one database at a time.

On Mon, Oct 23, 2006 at 11:18:39AM +0900, Hitoshi Harada wrote:
> Ok, 
> 
> But my point is, autovacuum may corrupt with vacuum analyze command
> on another session. My intention of smartvacuum() is based on this.
> Any solution for this??
> 
> Regards, 
> 
> 
> Hitoshi Harada
> 
> > -----Original Message-----
> > From: pgsql-hackers-owner@postgresql.org
> > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Tom Lane
> > Sent: Monday, October 23, 2006 11:10 AM
> > To: Hitoshi Harada
> > Cc: 'Peter Eisentraut'; pgsql-hackers@postgresql.org
> > Subject: Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum
> > 
> > "Hitoshi Harada" <hitoshi_harada@forcia.com> writes:
> > >> How is this different from what autovacuum does?
> > 
> > > My application needs to do vacuum by itself, while
> > > autovacuum does it as daemon.
> > > The database is updated so frequently that
> > > normal vacuum costs too much and tables to be updated are
> > > not so many as the whole database is vacuumed.
> > > I want to use autovacuum except the feature of daemon,
> > > but want to control when to vacuum and which table to vacuum.
> > > So, nothing is different between autovacuum and smartvacuum(),
> > > but former is daemon and later is user function.
> > 
> > This seems completely unconvincing.  What are you going to do that
> > couldn't be done by autovacuum?
> > 
> >             regards, tom lane
> > 
> > ---------------------------(end of broadcast)---------------------------
> > TIP 5: don't forget to increase your free space map settings
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
> 

-- 
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)


Re: [PATCHES] smartvacuum() instead of autovacuum

От
Tom Lane
Дата:
"Jim C. Nasby" <jim@nasby.net> writes:
> The only case I can think of where autovac might not work as well as
> smartvacuum would be if you had a lot of databases in the cluster, since
> autovacuum will only vacuum one database at a time.

It's conceivable that it'd make sense to allow multiple autovac
processes running in parallel.  (The infrastructure part of this is easy
enough, the hard part is keeping them from all deciding to vacuum the
same table.)

One reason we have not done that already is the thought that multiple
vacuum processes would suck too much I/O to be reasonable.  Now you
could dial back their resource demands with the cost-delay settings,
but it's not clear that ten autovacs running at one-tenth speed are
better than one autovac using all the cycles you can spare.  Usually
I think it's best if a vacuum transaction finishes as fast as it can.

In any case, these exact same concerns would apply to manual vacuums
or a combination of manual and auto vacuum.
        regards, tom lane


Re: [PATCHES] smartvacuum() instead of autovacuum

От
Matthew O'Connor
Дата:
Tom Lane wrote:
> "Jim C. Nasby" <jim@nasby.net> writes:
>> The only case I can think of where autovac might not work as well as
>> smartvacuum would be if you had a lot of databases in the cluster, since
>> autovacuum will only vacuum one database at a time.
> 
> It's conceivable that it'd make sense to allow multiple autovac
> processes running in parallel.  (The infrastructure part of this is easy
> enough, the hard part is keeping them from all deciding to vacuum the
> same table.)
> 
> One reason we have not done that already is the thought that multiple
> vacuum processes would suck too much I/O to be reasonable.  Now you
> could dial back their resource demands with the cost-delay settings,
> but it's not clear that ten autovacs running at one-tenth speed are
> better than one autovac using all the cycles you can spare.  Usually
> I think it's best if a vacuum transaction finishes as fast as it can.

I think this is one of the reasons table specific delay settings were 
designed in from the beginning.  I think the main use cases for multiple 
vacuums at once are:
1) Vacuum per table space assuming each table space is on a different 
drive with it's own I/O.
2) the frequently updated table that can't wait to be vacuumed while a 
large table is being vacuumed.  In this case if you set a system default 
delay setting and set a more aggressive table specific delay setting for 
your hot spot tables then multiple vacuums become a clear win.  This is 
an important case that I hope we handle soon.  At this point it's one of 
the main failings of the current autovacuum system.


Re: [PATCHES] smartvacuum() instead of autovacuum

От
"Jim C. Nasby"
Дата:
On Mon, Oct 23, 2006 at 03:08:03PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jim@nasby.net> writes:
> > The only case I can think of where autovac might not work as well as
> > smartvacuum would be if you had a lot of databases in the cluster, since
> > autovacuum will only vacuum one database at a time.
> 
> It's conceivable that it'd make sense to allow multiple autovac
> processes running in parallel.  (The infrastructure part of this is easy
> enough, the hard part is keeping them from all deciding to vacuum the
> same table.)
It might be worth creating a generic framework that prevents multiple
vacuums from hitting a table at once, autovac or not.

> One reason we have not done that already is the thought that multiple
> vacuum processes would suck too much I/O to be reasonable.  Now you
> could dial back their resource demands with the cost-delay settings,
> but it's not clear that ten autovacs running at one-tenth speed are
> better than one autovac using all the cycles you can spare.  Usually
> I think it's best if a vacuum transaction finishes as fast as it can.
There's other things that would benefit from having some idea on what IO
resources are available. For example, having a separate bgwriter (or
reader) for each set of physical volumes. So a means of grouping
tablespaces wouldn't hurt.

> In any case, these exact same concerns would apply to manual vacuums
> or a combination of manual and auto vacuum.

Well, the advantage to manual vacuums is that you can tune things to
utilize multiple arrays...
-- 
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)


Re: [PATCHES] smartvacuum() instead of autovacuum

От
Alvaro Herrera
Дата:
Jim C. Nasby wrote:
> On Mon, Oct 23, 2006 at 03:08:03PM -0400, Tom Lane wrote:
> > "Jim C. Nasby" <jim@nasby.net> writes:
> > > The only case I can think of where autovac might not work as well as
> > > smartvacuum would be if you had a lot of databases in the cluster, since
> > > autovacuum will only vacuum one database at a time.
> > 
> > It's conceivable that it'd make sense to allow multiple autovac
> > processes running in parallel.  (The infrastructure part of this is easy
> > enough, the hard part is keeping them from all deciding to vacuum the
> > same table.)
>  
> It might be worth creating a generic framework that prevents multiple
> vacuums from hitting a table at once, autovac or not.

That one is easy, because vacuum gets a lock on the affected table that
conflicts with itself.  The problem is that the second vacuum would
actually wait for the first to finish.

A naive idea is to use ConditionalLockAcquire, and if it fails just skip
the table.

> > One reason we have not done that already is the thought that multiple
> > vacuum processes would suck too much I/O to be reasonable.  Now you
> > could dial back their resource demands with the cost-delay settings,
> > but it's not clear that ten autovacs running at one-tenth speed are
> > better than one autovac using all the cycles you can spare.  Usually
> > I think it's best if a vacuum transaction finishes as fast as it can.

In the scenario where one table is huge and another is very small, it
can certainly be useful to vacuum the small table several times while
the huge one has only been vacuumed once.  For that you definitively
need the ability to run parallel vacuums.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: [PATCHES] smartvacuum() instead of autovacuum

От
"Hitoshi Harada"
Дата:
> If the decision to vacuum based on autovacuum criteria is good enough
> for you then I think you should just focus on getting autovac to do what
> you want it to do. Perhaps you just need to decrease the sleep time to a
> few seconds, so that autovac will quickly detect when something needs to
> be vacuumed.

Thanks, I'll do it.
My database is updated frequently all the day and 
runs big building process a day.
Almost all the day autovac is ok but
in the big building process autovac annoys it, 
so I wished there might be the way to order autovac to do its process.


Hitoshi Harada


> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Jim C. Nasby
> Sent: Tuesday, October 24, 2006 3:36 AM
> To: Hitoshi Harada
> Cc: 'Tom Lane'; 'Peter Eisentraut'; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum
> 
> If the decision to vacuum based on autovacuum criteria is good enough
> for you then I think you should just focus on getting autovac to do what
> you want it to do. Perhaps you just need to decrease the sleep time to a
> few seconds, so that autovac will quickly detect when something needs to
> be vacuumed.
> 
> The only case I can think of where autovac might not work as well as
> smartvacuum would be if you had a lot of databases in the cluster, since
> autovacuum will only vacuum one database at a time.
> 
> On Mon, Oct 23, 2006 at 11:18:39AM +0900, Hitoshi Harada wrote:
> > Ok,
> >
> > But my point is, autovacuum may corrupt with vacuum analyze command
> > on another session. My intention of smartvacuum() is based on this.
> > Any solution for this??
> >
> > Regards,
> >
> >
> > Hitoshi Harada
> >
> > > -----Original Message-----
> > > From: pgsql-hackers-owner@postgresql.org
> > > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Tom Lane
> > > Sent: Monday, October 23, 2006 11:10 AM
> > > To: Hitoshi Harada
> > > Cc: 'Peter Eisentraut'; pgsql-hackers@postgresql.org
> > > Subject: Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum
> > >
> > > "Hitoshi Harada" <hitoshi_harada@forcia.com> writes:
> > > >> How is this different from what autovacuum does?
> > >
> > > > My application needs to do vacuum by itself, while
> > > > autovacuum does it as daemon.
> > > > The database is updated so frequently that
> > > > normal vacuum costs too much and tables to be updated are
> > > > not so many as the whole database is vacuumed.
> > > > I want to use autovacuum except the feature of daemon,
> > > > but want to control when to vacuum and which table to vacuum.
> > > > So, nothing is different between autovacuum and smartvacuum(),
> > > > but former is daemon and later is user function.
> > >
> > > This seems completely unconvincing.  What are you going to do that
> > > couldn't be done by autovacuum?
> > >
> > >             regards, tom lane
> > >
> > > ---------------------------(end of
broadcast)---------------------------
> > > TIP 5: don't forget to increase your free space map settings
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 5: don't forget to increase your free space map settings
> >
> 
> --
> Jim Nasby                                            jim@nasby.net
> EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
> 
>                http://www.postgresql.org/docs/faq