Обсуждение: Name for new VACUUM

Поиск
Список
Период
Сортировка

Name for new VACUUM

От
Bruce Momjian
Дата:
I was thinking about our new version of vacuum.  I think it should be
called VACUUM NOLOCK to make it clear when you should use it, and we can
keep our ordinary VACUUM the same.

If you want to get fancy, we can call our traditional vacuum VACUUM LOCK
and have a GUC parameter that controls what VACUUM without
LOCK/NOLOCK does.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Name for new VACUUM

От
Tom Lane
Дата:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I was thinking about our new version of vacuum.  I think it should be
> called VACUUM NOLOCK to make it clear when you should use it, and we can
> keep our ordinary VACUUM the same.

I really don't understand why you're so hot to avoid changing the
default behavior of VACUUM.  Name me even one user who *likes* the
current behavior (ie, VACUUM grabs exclusive lock)?  IMHO the default
behavior *should* change.  Otherwise you're just forcing people to
update their cron scripts, which they wouldn't need to touch if we
do it the way I want.
        regards, tom lane


Re: Name for new VACUUM

От
Bruce Momjian
Дата:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I was thinking about our new version of vacuum.  I think it should be
> > called VACUUM NOLOCK to make it clear when you should use it, and we can
> > keep our ordinary VACUUM the same.
> 
> I really don't understand why you're so hot to avoid changing the
> default behavior of VACUUM.  Name me even one user who *likes* the
> current behavior (ie, VACUUM grabs exclusive lock)?  IMHO the default
> behavior *should* change.  Otherwise you're just forcing people to
> update their cron scripts, which they wouldn't need to touch if we
> do it the way I want.

I am concerned because UPDATE consumes disk space that never gets
returned to the OS until a traditional vacuum is run.  It is true that
after nolock vacuum, the future UPDATE's can use the extra space.

Maybe just call the traditional vacuum VACUUM LOCK.  It was the
LOCK/NOLOCK idea that I think was important.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Name for new VACUUM

От
Tom Lane
Дата:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> I really don't understand why you're so hot to avoid changing the
>> default behavior of VACUUM.

> I am concerned because UPDATE consumes disk space that never gets
> returned to the OS until a traditional vacuum is run.

Not necessarily.  Concurrent VACUUM does truncate the relation if it can
do so conveniently --- for example, it will successfully reclaim space
if you do "DELETE FROM foo; VACUUM foo;".  It just doesn't try as hard
as the older VACUUM code does.

IMHO, average disk space usage for a real-world database may well be
*lower* with the new style of VACUUM than with the old style, simply
because you can afford to do new-style VACUUM more often.  The old-style
VACUUM might give you a lower space usage just after a VACUUM, but if
you can only afford to do that on nights or weekends, it's cold comfort.
Your disk hardware needs are going to be determined by peak space usage,
not minimum or even average usage, and time between VACUUMs is what
drives that.  On a peak-usage basis I have no doubt that frequent
new-style VACUUMs will win hands down over infrequent old-style.

> Maybe just call the traditional vacuum VACUUM LOCK.  It was the
> LOCK/NOLOCK idea that I think was important.

Right now it's called VACUUM FULL, but I'm not particularly wedded to
that name.  Does anyone else like VACUUM LOCK?  Or have an even better
idea?
        regards, tom lane


Re: Name for new VACUUM

От
mlw
Дата:
Tom Lane wrote:
> 
> > Maybe just call the traditional vacuum VACUUM LOCK.  It was the
> > LOCK/NOLOCK idea that I think was important.
> 
> Right now it's called VACUUM FULL, but I'm not particularly wedded to
> that name.  Does anyone else like VACUUM LOCK?  Or have an even better
> idea?

Why rename VACUUM, why not create a new command RECLAIM, or something like
that. RECLAIM does the VACUUM NOLOCK, while vacuum does the locking. The term
RECLAIM will make more sense to new comers than VACUUM, and old postgres users
already know about VACUUM.


-- 
5-4-3-2-1 Thunderbirds are GO!
------------------------
http://www.mohawksoft.com


Re: Name for new VACUUM

От
Tom Lane
Дата:
mlw <markw@mohawksoft.com> writes:
> Why rename VACUUM, why not create a new command RECLAIM, or something like
> that. RECLAIM does the VACUUM NOLOCK, while vacuum does the locking.

Um, that gets the default backwards IMHO, where "default" = "what
existing scripts will do".

> The term RECLAIM will make more sense to new comers than VACUUM,

What's your basis for claiming that?

In any case, VACUUM is the term already used in all our documentation.
I have no appetite for trying to teach people and documents that
currently know "you must do VACUUM periodically" that the new truth is
"you must do VACUUM or RECLAIM periodically".  All these discussions
about which should be default aside, the bottom line is that the two
pieces of code do more-or-less the same thing from a high level
perspective.  Calling them completely different names isn't going to
make things easier for novices.  Calling them different options of the
same statement seems like the right thing to me.
        regards, tom lane


Re: Name for new VACUUM

От
Tom Lane
Дата:
mlw <markw@mohawksoft.com> writes:
> ... people looked at me like I had two heads when I told them about
> "vacuum." It wasn't obvious to them what it did.

I won't dispute that, but changing a command name that's been around for
ten or fifteen years strikes me as a recipe for more confusion, not
less.

> However, saying that VACUUM NOLOCK and VACUUM LOCK do "more-or-less
> the same thing" really isn't so. Think about it, the VACUUM LOCK,
> practically rebuilds a tables representation,

It does no such thing.  The only difference is that it's willing to move
a few tuples around if it can thereby free up (and truncate) whole pages
at the end of the table.  (In a live system you'd better hope it's only
a few tuples, anyway ;-) ... or you'll be waiting a long time.)  It
doesn't even do a complete defrag; it stops moving tuples as soon as it
finds that it won't be able to truncate the table any further.  So
there's *not* that much difference.

> VACUUM DEFRAG?
> VACUUM COMPRESS?

While these look kinda ugly to me, I can find no stronger objection than
that.  (Well, maybe I could complain that these overstate what old-style
vacuum actually does, but that's even weaker.)  What do other people
think?
        regards, tom lane


Re: Name for new VACUUM

От
Bruce Momjian
Дата:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> I really don't understand why you're so hot to avoid changing the
> >> default behavior of VACUUM.
> 
> > I am concerned because UPDATE consumes disk space that never gets
> > returned to the OS until a traditional vacuum is run.
> 
> Not necessarily.  Concurrent VACUUM does truncate the relation if it can
> do so conveniently --- for example, it will successfully reclaim space
> if you do "DELETE FROM foo; VACUUM foo;".  It just doesn't try as hard
> as the older VACUUM code does.

But it will not reclaim from UPDATE.  You also will have to VACUUM
NOLOCK right after your delete or the next INSERT is going to go on the
end and VACUUM NOLOCK is not going to compact the table, right?

> IMHO, average disk space usage for a real-world database may well be
> *lower* with the new style of VACUUM than with the old style, simply
> because you can afford to do new-style VACUUM more often.  The old-style
> VACUUM might give you a lower space usage just after a VACUUM, but if
> you can only afford to do that on nights or weekends, it's cold comfort.
> Your disk hardware needs are going to be determined by peak space usage,
> not minimum or even average usage, and time between VACUUMs is what
> drives that.  On a peak-usage basis I have no doubt that frequent
> new-style VACUUMs will win hands down over infrequent old-style.

My contention is that we are causing more problems for administrators by
changeing VACUUM's default behavior.  Most people vacuum only at night
when no one is using the system, and they should get the LOCK version of
vacuum.  (No change to scripts.)  What will change is that people can
add VACUUM NOLOCK during the day to their cron scripts.

> > Maybe just call the traditional vacuum VACUUM LOCK.  It was the
> > LOCK/NOLOCK idea that I think was important.
> 
> Right now it's called VACUUM FULL, but I'm not particularly wedded to
> that name.  Does anyone else like VACUUM LOCK?  Or have an even better
> idea?

FULL seems overloaded to me.  Maybe LOCK or FORCE.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Re: Name for new VACUUM

От
Jan Wieck
Дата:
Tom Lane wrote:
>
> > VACUUM DEFRAG?
> > VACUUM COMPRESS?
>
> While these look kinda ugly to me, I can find no stronger objection than
> that.  (Well, maybe I could complain that these overstate what old-style
> vacuum actually does, but that's even weaker.)  What do other people
> think?
   What I think? That this entire discussion wasted far too much   time already. These commands live  usually  in
crontabs and   aren't  typed  in  that  often. So give the baby "a name" and   done. VACUUM CLASSIC and VACUUM LIGHT or
whatever.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com



Re: Name for new VACUUM

От
mlw
Дата:
Tom Lane wrote:
> 
> mlw <markw@mohawksoft.com> writes:
> > Why rename VACUUM, why not create a new command RECLAIM, or something like
> > that. RECLAIM does the VACUUM NOLOCK, while vacuum does the locking.
> 
> Um, that gets the default backwards IMHO, where "default" = "what
> existing scripts will do".

Changing how the default works is always tricky. Even if you improve something
dramatically, someone still will still gripe about the change.

> 
> > The term RECLAIM will make more sense to new comers than VACUUM,
> 
> What's your basis for claiming that?

I am so used to "vacuum" and postgresql, it makes perfect sense to me. Yet, I
gave a brief discussion at work a week ago about PostgreSQL and how we can use
it to offload SQL queries from Oracle. In the pros and cons part of the
discussion, people looked at me like I had two heads when I told them about
"vacuum." It wasn't obvious to them what it did.

The term "reclaim" may be a little more obvious, but I could be wrong. It is
just that the name vacuum, from the perspective of someone new to PostgreSQL,
is a bit obscure.
> 
> In any case, VACUUM is the term already used in all our documentation.
> I have no appetite for trying to teach people and documents that
> currently know "you must do VACUUM periodically" that the new truth is
> "you must do VACUUM or RECLAIM periodically".  All these discussions
> about which should be default aside, the bottom line is that the two
> pieces of code do more-or-less the same thing from a high level
> perspective.  Calling them completely different names isn't going to
> make things easier for novices.  Calling them different options of the
> same statement seems like the right thing to me.

I understand the documentation issue completely, and it is a very strong point.
However, saying that VACUUM NOLOCK and VACUUM LOCK do "more-or-less the same
thing" really isn't so. Think about it, the VACUUM LOCK, practically rebuilds a
tables representation, in older versions of Postgres didn't it actually rewrite
the table? The new behavior of vacuum doesn't do that at all. 

Perhaps VACUUM gets changed to the new behavior, and the old behavior gets
renamed to DEFRAG or COMPRESS? Win/DOS users will find those names completely
obvious.

VACUUM DEFRAG?
VACUUM COMPRESS?


> 
>                         regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

-- 
5-4-3-2-1 Thunderbirds are GO!
------------------------
http://www.mohawksoft.com


Re: Name for new VACUUM

От
Tom Lane
Дата:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> Not necessarily.  Concurrent VACUUM does truncate the relation if it can
>> do so conveniently --- for example, it will successfully reclaim space
>> if you do "DELETE FROM foo; VACUUM foo;".  It just doesn't try as hard
>> as the older VACUUM code does.

> But it will not reclaim from UPDATE.

What?  I have no idea what you mean by that.

>  You also will have to VACUUM
> NOLOCK right after your delete or the next INSERT is going to go on the
> end and VACUUM NOLOCK is not going to compact the table, right?

INSERTs don't go on the end in the first place, at least not under
steady-state conditions.  That's what the free space map is all about.

> My contention is that we are causing more problems for administrators by
> changeing VACUUM's default behavior.

This is a curious definition of causing problems: making it work better
is causing a problem?  I didn't think we'd elevated backwards
compatibility to quite that much of a holy grail.  To me, a backwards
compatibility problem is something that actually breaks an existing app.
I do not see how changing vacuum's default behavior will break anything.

>> Right now it's called VACUUM FULL, but I'm not particularly wedded to
>> that name.  Does anyone else like VACUUM LOCK?  Or have an even better
>> idea?

> FULL seems overloaded to me.  Maybe LOCK or FORCE.

LOCK is pretty overloaded too, but I don't have any other objection to
it.  "FORCE" is meaningless; what are you forcing, and just how much
force are you applying?
        regards, tom lane


Re: Re: Name for new VACUUM

От
Tom Lane
Дата:
Jan Wieck <JanWieck@yahoo.com> writes:
>     What I think? That this entire discussion wasted far too much
>     time already.

I agree.  VACUUM FULL is what's in the code and docs today, and I
haven't heard any good reason to expend the effort to change it...
none of the other proposals are visibly better, merely different.
        regards, tom lane


Re: Name for new VACUUM

От
Bruce Momjian
Дата:
> It does no such thing.  The only difference is that it's willing to move
> a few tuples around if it can thereby free up (and truncate) whole pages
> at the end of the table.  (In a live system you'd better hope it's only
> a few tuples, anyway ;-) ... or you'll be waiting a long time.)  It
> doesn't even do a complete defrag; it stops moving tuples as soon as it
> finds that it won't be able to truncate the table any further.  So
> there's *not* that much difference.
> 
> > VACUUM DEFRAG?
> > VACUUM COMPRESS?
> 
> While these look kinda ugly to me, I can find no stronger objection than
> that.  (Well, maybe I could complain that these overstate what old-style
> vacuum actually does, but that's even weaker.)  What do other people
> think?

I kind of like COMPRESS, though VACUUM NOLOCK can do compress sometimes
too.  That gets confusing. That's why I hit on LOCK.  I couldn't think
of another _unique_ thing old vacuum did.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Name for new VACUUM

От
Bruce Momjian
Дата:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> Not necessarily.  Concurrent VACUUM does truncate the relation if it can
> >> do so conveniently --- for example, it will successfully reclaim space
> >> if you do "DELETE FROM foo; VACUUM foo;".  It just doesn't try as hard
> >> as the older VACUUM code does.
> 
> > But it will not reclaim from UPDATE.
> 
> What?  I have no idea what you mean by that.

I meant that UPDATE of all rows in a table put the new rows at the end.

> >  You also will have to VACUUM
> > NOLOCK right after your delete or the next INSERT is going to go on the
> > end and VACUUM NOLOCK is not going to compact the table, right?
> 
> INSERTs don't go on the end in the first place, at least not under
> steady-state conditions.  That's what the free space map is all about.

But you are assuming you have stuff in the free space map for the table
already, right?  I as not assuming that.

> > My contention is that we are causing more problems for administrators by
> > changeing VACUUM's default behavior.
> 
> This is a curious definition of causing problems: making it work better
> is causing a problem?  I didn't think we'd elevated backwards
> compatibility to quite that much of a holy grail.  To me, a backwards
> compatibility problem is something that actually breaks an existing app.
> I do not see how changing vacuum's default behavior will break anything.

It will not break.  It is just you were saying making VACUUM NOLOCK the
default is less work for administrators because they don't have to
update their scripts.  I am saying that there is more updating required
for making NOLOCK the default.  However, maybe more typing if they do
NOLOCk more frequently.

> >> Right now it's called VACUUM FULL, but I'm not particularly wedded to
> >> that name.  Does anyone else like VACUUM LOCK?  Or have an even better
> >> idea?
> 
> > FULL seems overloaded to me.  Maybe LOCK or FORCE.
> 
> LOCK is pretty overloaded too, but I don't have any other objection to
> it.  "FORCE" is meaningless; what are you forcing, and just how much
> force are you applying?

No idea.  ANALYZE isn't the greatest word either, but it was mine.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Name for new VACUUM

От
Tom Lane
Дата:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> INSERTs don't go on the end in the first place, at least not under
>> steady-state conditions.  That's what the free space map is all about.

> But you are assuming you have stuff in the free space map for the table
> already, right?  I as not assuming that.

But that is going to be the normal state of affairs, at least for people
who don't reboot their postmasters every few minutes as we developers
tend to do.

Sure, you can point to situations where lazy VACUUM doesn't do as well
as full VACUUM.  That's the point of having two implementations isn't it?
If we didn't need full VACUUM at all any more, we'd have removed it.
The existence of such situations is not justification for claiming that
lazy VACUUM isn't an appropriate default behavior.  The question is
which one is more appropriate for typical installations under typical
operating conditions --- and in that sort of scenario there *will* be
info in the free space map.

Even more to the point, those typical installations do not want
exclusive-locked VACUUM.  Haven't you paid any attention to the user
complaints we've been hearing for the last N years?  People want a
nonexclusive VACUUM (or no VACUUM at all, but that's not a choice we can
offer them now.)  That *is* what the typical dbadmin will want to run,
and that's why I say it should be the default.  If you think that most
people will want to stick with exclusive VACUUM, I'd like to see some
evidence for that position (so that I know why the time I spent on that
project was wasted ;-)).
        regards, tom lane


Re: Name for new VACUUM

От
Hannu Krosing
Дата:
Bruce Momjian wrote:
> 
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > >> Not necessarily.  Concurrent VACUUM does truncate the relation if it can
> > >> do so conveniently --- for example, it will successfully reclaim space
> > >> if you do "DELETE FROM foo; VACUUM foo;".  It just doesn't try as hard
> > >> as the older VACUUM code does.
> >
> > > But it will not reclaim from UPDATE.
> >
> > What?  I have no idea what you mean by that.
> 
> I meant that UPDATE of all rows in a table put the new rows at the end.

OTOH if you do it twice it will reclaim ;)

UPDATE everything;
VACUUM;
UPDATE everything;
VACUUM;

---------------
Hannu


Re: Re: Name for new VACUUM

От
Hannu Krosing
Дата:
Tom Lane wrote:
> 
> mlw <markw@mohawksoft.com> writes:
> > ... people looked at me like I had two heads when I told them about
> > "vacuum." It wasn't obvious to them what it did.
> 
> I won't dispute that, but changing a command name that's been around for
> ten or fifteen years strikes me as a recipe for more confusion, not
> less.
> 
> > However, saying that VACUUM NOLOCK and VACUUM LOCK do "more-or-less
> > the same thing" really isn't so. Think about it, the VACUUM LOCK,
> > practically rebuilds a tables representation,
> 
> It does no such thing.  

Just out of curiosity - does CLUSTER currently "practically rebuild
a tables representation" ?

--------------
Hannu


Re: Re: Name for new VACUUM

От
Tom Lane
Дата:
Hannu Krosing <hannu@tm.ee> writes:
> Just out of curiosity - does CLUSTER currently "practically rebuild
> a tables representation" ?

CLUSTER currently *loses* most of a table's representation :-(.
It needs work.

But since the whole point of CLUSTER is to physically rearrange the
tuples of a table, it seems to me that it's in a different category
from VACUUM anyway.
        regards, tom lane


Re: Re: Name for new VACUUM

От
Hannu Krosing
Дата:
Tom Lane wrote:
> 
> Hannu Krosing <hannu@tm.ee> writes:
> > Just out of curiosity - does CLUSTER currently "practically rebuild
> > a tables representation" ?
> 
> CLUSTER currently *loses* most of a table's representation :-(.
> It needs work.

at least \h CLUSTER in psql seems to imply that it is OK to use CLUSTER
?

Do we have some indication of last CLUSTER command (like an OID column
of 
cluster index field) in pg_relation so that VACUUM caould make better 
decisions when moving tuples ?

> But since the whole point of CLUSTER is to physically rearrange the
> tuples of a table, it seems to me that it's in a different category
> from VACUUM anyway.

Another way to look at it is as "VACUUM LOCK AND PERFORM HEAVY
REARRANGEMENTS"

Or does the current implementation actually do the rearrangement by 
appending all out-of-index-order tuples to the end and _not_ clean up 
unused space requiring an additional vacuum after CLUSTER ?

--------------
Hannu


Re: Re: Name for new VACUUM

От
Hannu Krosing
Дата:
Tom Lane wrote:
> 
> Hannu Krosing <hannu@tm.ee> writes:
> > Just out of curiosity - does CLUSTER currently "practically rebuild
> > a tables representation" ?
> 
> CLUSTER currently *loses* most of a table's representation :-(.
> It needs work.

The easiest implememntaion of CLUSTER seems to be something along the
lines of

-- lock original table for WRITE in all backends

CREATE TABLE CLUSTERED_T 
AS 
SELECT * FROM ORIGINAL_T ORDER BY INDEX_COLUMNS;

-- now do a move of CLUSTERED_T -> ORIGINAL_T
-- and then REINDEX
-- flush cache in all backends

-- unlock the altered table

This would need an actual 2xSIZE + size of dead tuples of space but
would 
probably be fastest in situations where heavy rearrangement is needed.

> But since the whole point of CLUSTER is to physically rearrange the
> tuples of a table, it seems to me that it's in a different category
> from VACUUM anyway.

OTOH it is the closest thing to VACUUM among "standard" SQL commands ;)

-----------------
Hannu