Thread: Fixing Simms' vacuum problems

Fixing Simms' vacuum problems

From
Tom Lane
Date:
Michael Simms was kind enough to give me login privileges on his system
to poke at his problems with vacuum running concurrently with table
create/drop operations.  I am not sure why his setup seems to display
the problem more easily than mine does, but it's certainly true that
crashes occur very easily there, whereas it often takes many tries for me.

Anyway, I am now convinced that his symptoms are indeed explained by the
locking and cache-invalidation problems we have been discussing.  I saw
a number of different failures, but they all seemed to trace back to one
of two common themes:

(1) The non-vacuuming backend crashes because of accessing a
system-relation tuple that isn't in the same place anymore: the tuple
is found in the local syscache, but the item location recorded there is
stale because vacuum has moved the tuple, and the non-vacuum process
hasn't noticed the SI update message for it yet.

(2) The vacuuming backend can fail because of trying to vacuum a
relation that's already been deleted.  This can be blamed on the known
bug that DROP TABLE releases its exclusive lock on the target table
before end of transaction.
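
The interleaving in (2) can be sketched with a toy model. This is not PostgreSQL code: the `Table` class, the `try_lock`/`unlock` helpers and the `run` function are hypothetical names made up for illustration, and the two sessions are simulated sequentially rather than with real processes.

```python
class Table:
    """Hypothetical stand-in for a relation with a single exclusive lock."""
    def __init__(self, name):
        self.name = name
        self.locked_by = None   # name of the session holding the lock
        self.dropped = False    # becomes True only when the drop commits


def try_lock(table, who):
    """Take the lock if it is free; report whether we got it."""
    if table.locked_by is None:
        table.locked_by = who
        return True
    return False


def unlock(table, who):
    """Release the lock, but only if this session holds it."""
    if table.locked_by == who:
        table.locked_by = None


def run(release_lock_early):
    table = Table("t")
    events = []

    # Session A: DROP TABLE starts and takes the exclusive lock.
    assert try_lock(table, "drop")
    if release_lock_early:
        unlock(table, "drop")   # the behaviour described as the bug

    # Session B: vacuum arrives before the drop has committed.
    vacuum_running = try_lock(table, "vacuum")
    events.append("vacuum started" if vacuum_running else "vacuum waits")

    # Session A: the transaction commits; the table is now gone.
    table.dropped = True
    unlock(table, "drop")

    if vacuum_running:
        # Vacuum is still working on a relation that no longer exists.
        events.append("vacuum touches dropped table")
    elif try_lock(table, "vacuum") and table.dropped:
        # Vacuum got the lock only after commit, so it can see the drop.
        events.append("vacuum skips dropped table")
        unlock(table, "vacuum")
    return events


early = run(release_lock_early=True)
held = run(release_lock_early=False)
```

In this model, holding the lock until commit is what lets the second session notice the drop before it touches the table.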

I expect there are also failures due to the lack-of-lock problems that
Hiroshi recently identified, but I didn't happen to see any of those in
the limited number of cases that I watched with the debugger.

So, it looks like a solution involves two components: first, being more
careful to lock system relations appropriately, and second, being sure
that SI messages are seen soon enough.  I think the read-SI-messages-
at-lock-time code that's already in place for 6.6 will be sufficient for
the second point, if we are religious about acquiring appropriate locks.
(BTW, I think that in most cases an appropriate lock on a system table
will be less strong than AccessExclusiveLock --- Vadim, do you agree?)
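
To illustrate why reading SI messages at lock time closes the window described in (1), here is a minimal sketch. It is a toy model under stated assumptions, not the backend's actual code: `SharedState`, `Backend`, and all method names are hypothetical, tuple locations are plain pairs, and the lock itself is not modeled, only the side effect of draining the invalidation queue when a lock is taken.

```python
class SharedState:
    """Hypothetical shared memory: tuple locations plus an invalidation queue."""
    def __init__(self):
        self.locations = {}   # tuple key -> current physical location
        self.si_queue = []    # append-only list of invalidated keys

    def move_tuple(self, key, new_location):
        # What a vacuum-like process does: relocate the tuple, then announce it.
        self.locations[key] = new_location
        self.si_queue.append(key)


class Backend:
    """Hypothetical backend with a private cache of tuple locations."""
    def __init__(self, shared):
        self.shared = shared
        self.cache = {}        # key -> location as of the time it was cached
        self.si_read_pos = 0   # how much of the queue this backend has read

    def process_invalidations(self):
        # Drop every cached entry named by a message we have not yet read.
        for key in self.shared.si_queue[self.si_read_pos:]:
            self.cache.pop(key, None)
        self.si_read_pos = len(self.shared.si_queue)

    def acquire_lock(self):
        # Assumption of this sketch: taking a lock always drains the queue.
        self.process_invalidations()

    def lookup(self, key):
        # Serve from the local cache, loading from shared state on a miss.
        if key not in self.cache:
            self.cache[key] = self.shared.locations[key]
        return self.cache[key]


shared = SharedState()
shared.locations["pg_class:mytable"] = (0, 1)

backend = Backend(shared)
backend.lookup("pg_class:mytable")              # caches location (0, 1)

shared.move_tuple("pg_class:mytable", (0, 7))   # vacuum moves the tuple

stale = backend.lookup("pg_class:mytable")      # no lock taken: old location
backend.acquire_lock()                          # lock taken: queue is drained
fresh = backend.lookup("pg_class:mytable")      # cache miss, reloads location
```

The sketch only helps a backend that actually takes a lock before using the cached entry, which is why the approach depends on acquiring appropriate locks consistently.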

Once we have the changes, the next question is do we want to risk
back-patching them into 6.5.2?  I can see several ways that we could
proceed:
1. Back-patch into REL6_5, and postpone 6.5.2 release for a while
   for beta-testing.
2. Put out 6.5.2 now (since it already has several other useful fixes),
   then back-patch, and release 6.5.3 after a beta-testing interval.
3. Leave these changes out of 6.5.*, and try to get 6.6 out the door
   soon instead.

I am not eager to hurry 6.6 along --- I have a lot of half-done work
in the planner/optimizer that I'd like to finish for 6.6.  Perhaps
choice #2 is the way to go.  Comments?
        regards, tom lane


Re: Fixing Simms' vacuum problems

From
Michael Simms
Date:
> Once we have the changes, the next question is do we want to risk
> back-patching them into 6.5.2?  I can see several ways that we could
> proceed:
> 1. Back-patch into REL6_5, and postpone 6.5.2 release for a while
>    for beta-testing.
> 2. Put out 6.5.2 now (since it already has several other useful fixes),
>    then back-patch, and release 6.5.3 after a beta-testing interval.
> 3. Leave these changes out of 6.5.*, and try to get 6.6 out the door
>    soon instead.
> 
> I am not eager to hurry 6.6 along --- I have a lot of half-done work
> in the planner/optimizer that I'd like to finish for 6.6.  Perhaps
> choice #2 is the way to go.  Comments?
> 
>             regards, tom lane

I would also suggest number 2 would be best for all. It means the bugfix for
my (and potentially other people's) problems gets released before 6.6, but
there is no delay to the 6.5.2 bugfixes being released.

I am curious, is there a reason that there is not a regular release of the
development tree also? I am aware we can get it through CVS to hammer
on it, but releases would be easier in many ways, certainly easier to develop
patches against.

Just a thought, as it seems that the linux kernel benefits greatly from
this approach.

As a final word, I would like to thank Tom for looking into
the problem. I have been really impressed with the responses
of the PostgreSQL developers; they seem to be a lot more approachable and
willing to fix problems than in most other open source systems I have
seen.
Hopefully when I get a bit more time and get more familiar with the
postgresql code, I'll be able to actually provide some solutions
instead of just breaking it and telling you lot {:-)

Thanks!
                ~Michael


Re: [HACKERS] Fixing Simms' vacuum problems

From
The Hermit Hacker
Date:
On Sat, 11 Sep 1999, Tom Lane wrote:

> Once we have the changes, the next question is do we want to risk
> back-patching them into 6.5.2?  I can see several ways that we could
> proceed:
> 1. Back-patch into REL6_5, and postpone 6.5.2 release for a while
>    for beta-testing.
> 2. Put out 6.5.2 now (since it already has several other useful fixes),
>    then back-patch, and release 6.5.3 after a beta-testing interval.
> 3. Leave these changes out of 6.5.*, and try to get 6.6 out the door
>    soon instead.
> 
> I am not eager to hurry 6.6 along --- I have a lot of half-done work
> in the planner/optimizer that I'd like to finish for 6.6.  Perhaps
> choice #2 is the way to go.  Comments?

Option 2 makes *me* feel the most comfortable...we were holding off on
6.5.2 due to some things ppl were working on...are those complete?  I can
roll out a 6.5.2 tonight if everyone feel comfortable with it, or wait for
a few days (Wednesday?) to make sure all is iron'd out?

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: [HACKERS] Re: Fixing Simms' vacuum problems

From
The Hermit Hacker
Date:
On Sat, 11 Sep 1999, Michael Simms wrote:

> I am curious, is there a reason that there is not a regular release of the
> development tree also? I am aware we can get it through CVS to hammer
> on it, but releases would be easier in many ways, certainly easier to develop
> patches against.

ftp://ftp.postgresql.org/pub/postgresql-snapshot.tar.gz

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: [HACKERS] Fixing Simms' vacuum problems

From
Tatsuo Ishii
Date:
> Once we have the changes, the next question is do we want to risk
> back-patching them into 6.5.2?  I can see several ways that we could
> proceed:
> 1. Back-patch into REL6_5, and postpone 6.5.2 release for a while
>    for beta-testing.
> 2. Put out 6.5.2 now (since it already has several other useful fixes),
>    then back-patch, and release 6.5.3 after a beta-testing interval.
> 3. Leave these changes out of 6.5.*, and try to get 6.6 out the door
>    soon instead.
> 
> I am not eager to hurry 6.6 along --- I have a lot of half-done work
> in the planner/optimizer that I'd like to finish for 6.6.  Perhaps
> choice #2 is the way to go.  Comments?

Seems #2 is a good choice for me too.
---
Tatsuo Ishii


Re: [HACKERS] Fixing Simms' vacuum problems

From
Tom Lane
Date:
The Hermit Hacker <scrappy@hub.org> writes:
> Option 2 makes *me* feel the most comfortable...we were holding off on
> 6.5.2 due to some things ppl were working on...are those complete?  I can
> roll out a 6.5.2 tonight if everyone feel comfortable with it, or wait for
> a few days (Wednesday?) to make sure all is iron'd out?

I don't have any more code changes that I want to try to squeeze into
6.5.2, but I thought Bruce still needed to update the change log etc
etc.  Dunno about the rest of the crew; anyone have more to do?
        regards, tom lane


Re: [HACKERS] Fixing Simms' vacuum problems

From
Bruce Momjian
Date:
> The Hermit Hacker <scrappy@hub.org> writes:
> > Option 2 makes *me* feel the most comfortable...we were holding off on
> > 6.5.2 due to some things ppl were working on...are those complete?  I can
> > roll out a 6.5.2 tonight if everyone feel comfortable with it, or wait for
> > a few days (Wednesday?) to make sure all is iron'd out?
> 
> I don't have any more code changes that I want to try to squeeze into
> 6.5.2, but I thought Bruce still needed to update the change log etc
> etc.  Dunno about the rest of the crew; anyone have more to do?

Yes, I have to do that.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 


Re: [HACKERS] Fixing Simms' vacuum problems]

From
Bruce Momjian
Date:
> The Hermit Hacker <scrappy@hub.org> writes:
> > Option 2 makes *me* feel the most comfortable...we were holding off on
> > 6.5.2 due to some things ppl were working on...are those complete?  I can
> > roll out a 6.5.2 tonight if everyone feel comfortable with it, or wait for
> > a few days (Wednesday?) to make sure all is iron'd out?
> 
> I don't have any more code changes that I want to try to squeeze into
> 6.5.2, but I thought Bruce still needed to update the change log etc
> etc.  Dunno about the rest of the crew; anyone have more to do?
> 

I have updated everything needed for 6.5.2.  Thomas, can you update the
HISTORY file for 6.5.2.  Thanks.

This is good timing.   I just finished a 4-month project yesterday.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 


Re: [HACKERS] Fixing Simms' vacuum problems]

From
The Hermit Hacker
Date:
On Sun, 12 Sep 1999, Bruce Momjian wrote:

> > The Hermit Hacker <scrappy@hub.org> writes:
> > > Option 2 makes *me* feel the most comfortable...we were holding off on
> > > 6.5.2 due to some things ppl were working on...are those complete?  I can
> > > roll out a 6.5.2 tonight if everyone feel comfortable with it, or wait for
> > > a few days (Wednesday?) to make sure all is iron'd out?
> > 
> > I don't have any more code changes that I want to try to squeeze into
> > 6.5.2, but I thought Bruce still needed to update the change log etc
> > etc.  Dunno about the rest of the crew; anyone have more to do?
> > 
> 
> I have updated everything needed for 6.5.2.  Thomas, can you update the
> HISTORY file for 6.5.2.  Thanks.

Okay, will wrap 6.5.2 on Tuesday evening then...

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: [HACKERS] Fixing Simms' vacuum problems

From
Thomas Lockhart
Date:
> I don't have any more code changes that I want to try to squeeze into
> 6.5.2, but I thought Bruce still needed to update the change log etc
> etc.  Dunno about the rest of the crew; anyone have more to do?

I should put in my recent fix for Tatsuo regarding unspecified string
types in case statements. Should get to it this evening (Monday
morning, GMT)...
                 - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] Fixing Simms' vacuum problems

From
Vadim Mikheev
Date:
Tom Lane wrote:
> 
> So, it looks like a solution involves two components: first, being more
> careful to lock system relations appropriately, and second, being sure
> that SI messages are seen soon enough.  I think the read-SI-messages-
> at-lock-time code that's already in place for 6.6 will be sufficient for
> the second point, if we are religious about acquiring appropriate locks.
> (BTW, I think that in most cases an appropriate lock on a system table
> will be less strong than AccessExclusiveLock --- Vadim, do you agree?)

ExclusiveLock should be ok.

Vadim