Обсуждение: Suggestion; "WITH VACUUM" option

Поиск
Список
Период
Сортировка

Suggestion; "WITH VACUUM" option

От
Josh Berkus
Дата:
Tom, Folks:

Joe and I were discussing your recent discussion about the costs of VACUUM and
tuple maintainence, and I had an interesting idea.

How hard would it be to add a "WITH (VACUUM)" option to UPDATE and DELETE
queries?   This option would cause the regular vacuum activity -- purging the
dead tuple and its index references -- to be done immediately, as part of the
statement, instead of being deferred.

I agree that we don't want immediate tuple maintainence, most of the time.
However, a couple of my clients would really like to have a few queries do
their vacuum immediately, and are currently getting around that by vacuuming
every 5 mintutes.

Easy?  Hard?  Insane?  What do you think?

--
-Josh BerkusAglio Database SolutionsSan Francisco



Re: Suggestion; "WITH VACUUM" option

От
"Marc G. Fournier"
Дата:
On Mon, 16 Dec 2002, Josh Berkus wrote:

> Tom, Folks:
>
> Joe and I were discussing your recent discussion about the costs of VACUUM and
> tuple maintainence, and I had an interesting idea.
>
> How hard would it be to add a "WITH (VACUUM)" option to UPDATE and DELETE
> queries?   This option would cause the regular vacuum activity -- purging the
> dead tuple and its index references -- to be done immediately, as part of the
> statement, instead of being deferred.
>
> I agree that we don't want immediate tuple maintainence, most of the time.
> However, a couple of my clients would really like to have a few queries do
> their vacuum immediately, and are currently getting around that by vacuuming
> every 5 mintutes.
>
> Easy?  Hard?  Insane?  What do you think?

Just curious, but wouldn't it be just as simple to issue a VACUUM call
right after the UPDATE/DELETE?


Re: Suggestion; "WITH VACUUM" option

От
Josh Berkus
Дата:
Marc,

> > Easy?  Hard?  Insane?  What do you think?
>
> Just curious, but wouldn't it be just as simple to issue a VACUUM call
> right after the UPDATE/DELETE?

Well, you can't do that as part of a transaction or procedure, whereas ....

Hmmm.  Couldn't do "with vacuum" as part of a transaction, either.

Funny, it sounded like a really good idea on the phone.

In the words of Rosanne Rosanna Danna,
"Neeeveeeeeer miiiiiiiiind."

--
-Josh BerkusAglio Database SolutionsSan Francisco



Re: Suggestion; "WITH VACUUM" option

От
Tom Lane
Дата:
Josh Berkus <josh@agliodbs.com> writes:
> How hard would it be to add a "WITH (VACUUM)" option to UPDATE and DELETE 
> queries?   This option would cause the regular vacuum activity -- purging the
> dead tuple and its index references -- to be done immediately, as part of the
> statement, instead of being deferred.

> Easy?  Hard?  Insane?  What do you think?

Impossible.  You can't vacuum a tuple until the last open transaction
that can see it is gone.  It is therefore *impossible* for a transaction
to vacuum away its own detritus; until the transaction commits, you
can't even start to wonder whether other open transactions see it or
not.

Vacuuming has to be done later, and that being the case, I don't see any
real advantage to altering the "background vacuum" design we have.
        regards, tom lane


Re: Suggestion; "WITH VACUUM" option

От
"scott.marlowe"
Дата:
On Mon, 16 Dec 2002, Tom Lane wrote:

> Josh Berkus <josh@agliodbs.com> writes:
> > How hard would it be to add a "WITH (VACUUM)" option to UPDATE and DELETE 
> > queries?   This option would cause the regular vacuum activity -- purging the
> > dead tuple and its index references -- to be done immediately, as part of the
> > statement, instead of being deferred.
> 
> > Easy?  Hard?  Insane?  What do you think?
> 
> Impossible.  You can't vacuum a tuple until the last open transaction
> that can see it is gone.  It is therefore *impossible* for a transaction
> to vacuum away its own detritus; until the transaction commits, you
> can't even start to wonder whether other open transactions see it or
> not.
> 
> Vacuuming has to be done later, and that being the case, I don't see any
> real advantage to altering the "background vacuum" design we have.

Then, would a "commit with vacuum" work?  OR a "begin transaction with 
vacuum"  Just tossing them out there...



Re: Suggestion; "WITH VACUUM" option

От
"Marc G. Fournier"
Дата:
On Mon, 16 Dec 2002, scott.marlowe wrote:

> On Mon, 16 Dec 2002, Tom Lane wrote:
>
> > Josh Berkus <josh@agliodbs.com> writes:
> > > How hard would it be to add a "WITH (VACUUM)" option to UPDATE and DELETE
> > > queries?   This option would cause the regular vacuum activity -- purging the
> > > dead tuple and its index references -- to be done immediately, as part of the
> > > statement, instead of being deferred.
> >
> > > Easy?  Hard?  Insane?  What do you think?
> >
> > Impossible.  You can't vacuum a tuple until the last open transaction
> > that can see it is gone.  It is therefore *impossible* for a transaction
> > to vacuum away its own detritus; until the transaction commits, you
> > can't even start to wonder whether other open transactions see it or
> > not.
> >
> > Vacuuming has to be done later, and that being the case, I don't see any
> > real advantage to altering the "background vacuum" design we have.
>
> Then, would a "commit with vacuum" work?  OR a "begin transaction with
> vacuum"  Just tossing them out there...

Tom will correct me here, but I believe what he was trying to get across
isn't that the 'current transaction' is the problem ... the problem is the
other connections who have open transactions ... my simplistic
understanding (and I hope it isn't too flawed) of MVCC is that as long as
*one* transaction is outstanding on a tuple, that tuple can't be
physically removed ... as far as any new transactions are concerned, it
has disappeared ... so if I open a transaction, then you open one 'with
vacuum', your 'with vacuum' will fail unless I happen to be out of my
transaction before you ...




Re: Suggestion; "WITH VACUUM" option

От
"scott.marlowe"
Дата:
On Mon, 16 Dec 2002, Marc G. Fournier wrote:

> On Mon, 16 Dec 2002, scott.marlowe wrote:
> 
> > On Mon, 16 Dec 2002, Tom Lane wrote:
> >
> > > Josh Berkus <josh@agliodbs.com> writes:
> > > > How hard would it be to add a "WITH (VACUUM)" option to UPDATE and DELETE
> > > > queries?   This option would cause the regular vacuum activity -- purging the
> > > > dead tuple and its index references -- to be done immediately, as part of the
> > > > statement, instead of being deferred.
> > >
> > > > Easy?  Hard?  Insane?  What do you think?
> > >
> > > Impossible.  You can't vacuum a tuple until the last open transaction
> > > that can see it is gone.  It is therefore *impossible* for a transaction
> > > to vacuum away its own detritus; until the transaction commits, you
> > > can't even start to wonder whether other open transactions see it or
> > > not.
> > >
> > > Vacuuming has to be done later, and that being the case, I don't see any
> > > real advantage to altering the "background vacuum" design we have.
> >
> > Then, would a "commit with vacuum" work?  OR a "begin transaction with
> > vacuum"  Just tossing them out there...
> 
> Tom will correct me here, but I believe what he was trying to get across
> isn't that the 'current transaction' is the problem ... the problem is the
> other connections who have open transactions ... my simplistic
> understanding (and I hope it isn't too flawed) of MVCC is that as long as
> *one* transaction is outstanding on a tuple, that tuple can't be
> physically removed ... as far as any new transactions are concerned, it
> has disappeared ... so if I open a transaction, then you open one 'with
> vacuum', your 'with vacuum' will fail unless I happen to be out of my
> transaction before you ...

Oh, yeah I have no doubt of that.  I was thinking more along the lines of 
when a transaction ends it throws a background "vacuum table1;vacuum 
table2;vacuum tablen" command into some kind of vacuuming hopper.  I.e. it 
doesn't block waiting, it runs it as though it were run AFTER the 
transaction.  If there are a few tuples from other transactions we can't 
reclaim, no big deal.  

The other option would be some kind of GUC that set a max number of rows 
deleted/updated in a table in a transaction that would trigger this kind 
of thing automagically.

But I could see such a setting causing just as much harm (chaos theory 
anyone? :-) as good.



Re: Suggestion; "WITH VACUUM" option

От
Tom Lane
Дата:
"scott.marlowe" <scott.marlowe@ihs.com> writes:
> Oh, yeah I have no doubt of that.  I was thinking more along the lines of 
> when a transaction ends it throws a background "vacuum table1;vacuum 
> table2;vacuum tablen" command into some kind of vacuuming hopper.

Actually, the plans I liked best for driving auto-vacuum were
essentially an indirect version of that: the FSM module would keep track
of committed deletes + aborted inserts for each active table, and then
the autovacuum scheduler could use that info to decide which tables are
highest-priority to vacuum.

(Or possibly the runtime stats module would be a better place to track
it than FSM.)
        regards, tom lane


Re: Suggestion; "WITH VACUUM" option

От
mlw
Дата:

Tom Lane wrote:

>Josh Berkus <josh@agliodbs.com> writes:
>  
>
>>How hard would it be to add a "WITH (VACUUM)" option to UPDATE and DELETE 
>>queries?   This option would cause the regular vacuum activity -- purging the
>>dead tuple and its index references -- to be done immediately, as part of the
>>statement, instead of being deferred.
>>    
>>
>
>  
>
>>Easy?  Hard?  Insane?  What do you think?
>>    
>>
>
>Impossible.  You can't vacuum a tuple until the last open transaction
>that can see it is gone.  It is therefore *impossible* for a transaction
>to vacuum away its own detritus; until the transaction commits, you
>can't even start to wonder whether other open transactions see it or
>not.
>
>Vacuuming has to be done later, and that being the case, I don't see any
>real advantage to altering the "background vacuum" design we have.
>  
>
This does raise an interresting question, and I understand that it is 
*impossible* to do with PostgreSQL as it currently exists, however, let 
me just toss this out there:

Suppose you do this:

update largetable set foo=bar;

Lets also assume that "largetable" has tens of millions of rows. I have 
databases like this, and I sometimes do operations like this. I have 
found it more efficient to break up the update into a series of:

update largetable set foo=bar where somefield < a;
vacuum
update largetable set foo=bar where somefield < b;
vacuum
update largetable set foo=bar where somefield < c;
vacuum
update largetable set foo=bar where not foo = bar;
vacuum

On some of my databases a statement which updates all the rows is 
unworkable in PostgreSQL, on Oracle, however, there is no poblem.

For my use, it is a pain in the neck to deal with, but not unworkable. 
For some other users, it may be a bigger problem.



>  
>

>  
>



Re: Suggestion; "WITH VACUUM" option

От
Matthew Kirkwood
Дата:
On Tue, 17 Dec 2002, mlw wrote:

> update largetable set foo=bar;
>
> Lets also assume that "largetable" has tens of millions of rows.
[..]
> On some of my databases a statement which updates all the rows is
> unworkable in PostgreSQL, on Oracle, however, there is no poblem.

.. provided you have a lot of rollback space, which is
essentially what the datafile growth here is providing.

Matthew.