Discussion: Vacuum analyze bug CAUGHT


Vacuum analyze bug CAUGHT

From:
Michael Simms
Date:
Hi

Okee, I have caught the vacuum analyse crash that was giving me a load of
grief

Firstly, I create the following database

$ psql crashtest
Welcome to the POSTGRESQL interactive sql monitor:
  Please read the file COPYRIGHT for copyright terms of POSTGRESQL
[PostgreSQL 6.5.1 on i586-pc-linux-gnu, compiled by gcc -ggdb ]

   type \? for help on slash commands
   type \q to quit
   type \g or terminate with semicolon to execute query
 You are currently connected to the database: crashtest

crashtest=> select version();
version                                                      
-------------------------------------------------------------
PostgreSQL 6.5.1 on i586-pc-linux-gnu, compiled by gcc -ggdb 
(1 row)

crashtest=> \d
Database    = crashtest
 +------------------+----------------------------------+----------+
 |  Owner           |             Relation             |   Type   |
 +------------------+----------------------------------+----------+
 | test             | testtable                        | table    |
 +------------------+----------------------------------+----------+

crashtest=> select distinct * from testtable;
vara|varb
----+----
abc |def 
(1 row)

crashtest=> select count(*) from testtable;
count
------
319800
(1 row)

-----------------------------------

Now, I then create a text file with 780 (just a random large number) new
lines that are exactly the same as the ones already in the database, so they
can be added with the copy command.

(The 319,800 rows I have in my database were actually created using copy
of this file, but I think that is irrelevant.)
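
(For reference, a throwaway generator along these lines will produce such a
file; the abc/def values and the /tmp/copyeffort path are the ones used in the
session and script shown here, and COPY's default tab delimiter is assumed.)

------------------------------------------------------
/* Hypothetical throwaway generator for the COPY input file described
 * above.  Assumes COPY's default tab delimiter and the abc/def row
 * from the psql session; adjust the count or the path as needed. */
#include <stdio.h>

int
main(void)
{
    FILE *fp = fopen("/tmp/copyeffort", "w");
    int   i;

    if (fp == NULL)
    {
        perror("/tmp/copyeffort");
        return 1;
    }
    for (i = 0; i < 780; i++)
        fprintf(fp, "abc\tdef\n");
    fclose(fp);
    return 0;
}
------------------------------------------------------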

I then run the following script (which, you will note, just runs forever):

------------------------------------------------------
#!/bin/sh -

(
   while [ "1" = "1" ]
   do
      echo "create temp table temptest ( vara text, varb text );"
      echo "copy temptest from '/tmp/copyeffort';"
      echo "insert into testtable select * from temptest;"
      echo "drop table temptest;"
      sleep 1
   done
) | psql crashtest

-----------------------------------------------------

I attach gdb to the backend that is handling this task

I then start a psql session to the same database

Then I attach another GDB to this backend too, so I have GDB on both

Then, in psql, I run and get the following result:

-----------

crashtest=> vacuum analyze;
NOTICE:  Rel pg_type: TID 4/3: InsertTransactionInProgress 129915 - can't shrink relation
NOTICE:  Rel pg_attribute: TID 23/5: InsertTransactionInProgress 129915 - can't shrink relation
NOTICE:  Rel pg_attribute: TID 23/6: InsertTransactionInProgress 129915 - can't shrink relation
NOTICE:  Rel pg_attribute: TID 23/7: InsertTransactionInProgress 129915 - can't shrink relation
NOTICE:  Rel pg_attribute: TID 23/8: InsertTransactionInProgress 129915 - can't shrink relation
NOTICE:  Rel pg_attribute: TID 23/9: InsertTransactionInProgress 129915 - can't shrink relation
NOTICE:  Rel pg_attribute: TID 23/10: InsertTransactionInProgress 129915 - can't shrink relation
NOTICE:  Rel pg_attribute: TID 23/11: InsertTransactionInProgress 129915 - can't shrink relation
NOTICE:  Rel pg_attribute: TID 23/12: InsertTransactionInProgress 129915 - can't shrink relation
NOTICE:  Rel pg_class: TID 3/22: InsertTransactionInProgress 129915 - can't shrink relation
NOTICE:  AbortTransaction and not in in-progress state 

------------

At this point, the backend attatched to the script doing the creates inserts
and drops has the following:


Program received signal SIGUSR2, User defined signal 2.
0x4018db4d in __libc_fsync ()
(gdb) where
#0  0x4018db4d in __libc_fsync ()
#1  0x80d5908 in pg_fsync (fd=3) at fd.c:202
#2  0x80d615a in FileSync (file=3) at fd.c:876
#3  0x80dc9ed in mdcommit () at md.c:796
#4  0x80dd1bc in smgrcommit () at smgr.c:375
#5  0x80d47f7 in FlushBufferPool (StableMainMemoryFlag=0) at bufmgr.c:1260
#6  0x8078755 in RecordTransactionCommit () at xact.c:636
#7  0x8078919 in CommitTransaction () at xact.c:940
#8  0x8078ac1 in CommitTransactionCommand () at xact.c:1177
#9  0x80df0cf in PostgresMain (argc=-1073742286, argv=0xbffff7a0, real_argc=7, real_argv=0xbffffd04) at postgres.c:1679
#10 0x80c91ba in DoBackend (port=0x81ea4a0) at postmaster.c:1628
#11 0x80c8cda in BackendStartup (port=0x81ea4a0) at postmaster.c:1373
#12 0x80c8429 in ServerLoop () at postmaster.c:823
#13 0x80c7f67 in PostmasterMain (argc=7, argv=0xbffffd04) at postmaster.c:616
#14 0x80a0986 in main (argc=7, argv=0xbffffd04) at main.c:97
#15 0x400fccb3 in __libc_start_main (main=0x80a0920 <main>, argc=7, argv=0xbffffd04, init=0x8061360 <_init>, fini=0x810d63c <_fini>, rtld_fini=0x4000a350 <_dl_fini>, stack_end=0xbffffcfc) at ../sysdeps/generic/libc-start.c:78
(gdb) 
Program received signal SIGUSR2, User defined signal 2.
0x4018db4d in __libc_fsync ()
(gdb) where
#0  0x4018db4d in __libc_fsync ()
#1  0x80d5908 in pg_fsync (fd=3) at fd.c:202
#2  0x80d615a in FileSync (file=3) at fd.c:876
#3  0x80dc9ed in mdcommit () at md.c:796
#4  0x80dd1bc in smgrcommit () at smgr.c:375
#5  0x80d47f7 in FlushBufferPool (StableMainMemoryFlag=0) at bufmgr.c:1260
#6  0x8078755 in RecordTransactionCommit () at xact.c:636
#7  0x8078919 in CommitTransaction () at xact.c:940
#8  0x8078ac1 in CommitTransactionCommand () at xact.c:1177
#9  0x80df0cf in PostgresMain (argc=-1073742286, argv=0xbffff7a0, real_argc=7, real_argv=0xbffffd04) at postgres.c:1679
#10 0x80c91ba in DoBackend (port=0x81ea4a0) at postmaster.c:1628
#11 0x80c8cda in BackendStartup (port=0x81ea4a0) at postmaster.c:1373
#12 0x80c8429 in ServerLoop () at postmaster.c:823
#13 0x80c7f67 in PostmasterMain (argc=7, argv=0xbffffd04) at postmaster.c:616
#14 0x80a0986 in main (argc=7, argv=0xbffffd04) at main.c:97
#15 0x400fccb3 in __libc_start_main (main=0x80a0920 <main>, argc=7, argv=0xbffffd04, init=0x8061360 <_init>, fini=0x810d63c <_fini>, rtld_fini=0x4000a350 <_dl_fini>, stack_end=0xbffffcfc) at ../sysdeps/generic/libc-start.c:78
(gdb) 

About 3 - 5 seconds later, the gdb attached to the backend serving the vacuuming
psql session gets the following (bear in mind this is a DIFFERENT BACKEND):

Program received signal SIGSEGV, Segmentation fault.
AllocSetReset (set=0x0) at aset.c:159
aset.c:159: No such file or directory.
(gdb) where
#0  AllocSetReset (set=0x0) at aset.c:159
#1  0x810ac12 in EndPortalAllocMode () at portalmem.c:938
#2  0x8078833 in AtAbort_Memory () at xact.c:800
#3  0x80789ff in AbortTransaction () at xact.c:1026
#4  0x8078aef in AbortCurrentTransaction () at xact.c:1243
#5  0x80deed6 in PostgresMain (argc=-1073742288, argv=0xbffff7a0, real_argc=7, real_argv=0xbffffd04) at postgres.c:1550
#6  0x80c91ba in DoBackend (port=0x81ea4a0) at postmaster.c:1628
#7  0x80c8cda in BackendStartup (port=0x81ea4a0) at postmaster.c:1373
#8  0x80c8429 in ServerLoop () at postmaster.c:823
#9  0x80c7f67 in PostmasterMain (argc=7, argv=0xbffffd04) at postmaster.c:616
#10 0x80a0986 in main (argc=7, argv=0xbffffd04) at main.c:97
#11 0x400fccb3 in __libc_start_main (main=0x80a0920 <main>, argc=7, argv=0xbffffd04, init=0x8061360 <_init>, fini=0x810d63c <_fini>, rtld_fini=0x4000a350 <_dl_fini>, stack_end=0xbffffcfc) at ../sysdeps/generic/libc-start.c:78
(gdb) 


------------------------------------------------------------------------

This is REPRODUCIBLE

I have run this through 5 times and it only managed to vacuum two times.

Now, I have the two GDBs still running, so if you need any values such
as variable states, let me know and I will supply them. (I'll be out of
the house till about 7.30pm UK time tomorrow, so it will have to be after
then).

Hopefully this will be something that can get fixed for 6.5.2, as it is a BIG
problem for me: it happens when I am building my PostgreSQL search
engine.

Thanks                    M Simms


Re: [HACKERS] Vacuum analyze bug CAUGHT

From:
Tom Lane
Date:
Michael Simms <grim@argh.demon.co.uk> writes:
> Okee, I have caught the vacuum analyse crash that was giving me a load
> of grief

I spent a good while last night trying to duplicate your report, but
couldn't on either current sources or latest REL6_5 branch.  Maybe that
means we fixed the problem already --- but I think it's more likely that
there's a platform dependency involved, or some additional enabling
condition that you forgot to mention.  So you'll have to keep poking
at it.  However, I do have some comments and suggestions.

Starting at the end, the final crash:

> Program received signal SIGSEGV, Segmentation fault.
> AllocSetReset (set=0x0) at aset.c:159
> aset.c:159: No such file or directory.
> (gdb) where
> #0  AllocSetReset (set=0x0) at aset.c:159
> #1  0x810ac12 in EndPortalAllocMode () at portalmem.c:938
> #2  0x8078833 in AtAbort_Memory () at xact.c:800
> #3  0x80789ff in AbortTransaction () at xact.c:1026
> #4  0x8078aef in AbortCurrentTransaction () at xact.c:1243
> #5  0x80deed6 in PostgresMain (argc=-1073742288, argv=0xbffff7a0, real_argc=7, 
>     real_argv=0xbffffd04) at postgres.c:1550

appears to be the same bug I alluded to before: memory allocation isn't
cleaned up properly if elog(ERROR) is executed outside a transaction.
I know how to fix this, and will do so before 6.5.2, but fixing it will
just prevent a coredump after an error has already occurred.  What we
need to be looking for in your setup is the reason why an error is being
reported at all.

The first thing you can do is look to see what the error message is ---
it should be in the postmaster's stderr logfile, even though 6.5.* libpq
omits to display it because of the crash.  Another useful thing would be
to set a breakpoint at elog() so that you can examine the context of the
error report.  (Actually, since vacuum generates lots of elog(NOTICE)
and elog(DEBUG), the breakpoint had better be further down in elog,
perhaps where it's about to longjmp back to PostgresMain.)

BTW, VACUUM runs a separate transaction for each table it works on,
so although most of its work is done inside a transaction, there are
short intervals between tables where it's not in one.  The error must
be getting reported during one of these intervals.  That narrows things
down a lot.

Now, about the SIGUSR2.  That's mighty suggestive, but it's not
necessarily an indication of a bug.  There are two reasons why a
SIGUSR2 would be delivered to a backend.  One is LISTEN/NOTIFY, which
I assume we can discount, since you'd have to have explicitly done it.
The other is that the SI message buffer manager sends a broadcast
SIGUSR2 to all the backends if it thinks its message buffer is getting
dangerously full --- presumably that's what you saw happening.  So the
questions raised are (a) why is the buffer getting full, and (b) could
it have actually overflowed later, and if so did the overflow recovery
code work?

Background: the SI message buffer is a scheme for notifying backends
to discard cached copies of tuples from the system tables.  Whenever
a backend modifies a tuple from the system tables, it has to send out
an SI ("shared cache invalidation") message telling the other backends
it has changed tuple X in table Y, so that they discard their
no-longer-accurate cached copies, if any.  Messages in the buffer can be
discarded only after all active backends have read them.  Stuff like
VACUUM tends to produce a lot of SI messages, but table
creation/deletion causes some too.

The SI buffer is global across an installation (one postmaster and its
children), *not* per-database, so even if you only had these two
backends connected to your crashtest database, they could have been
affected by anything that was going on in other databases belonging
to the same postmaster.  Were there other backends alive at the time,
and if so what were they doing, or not doing?

If all the backends are busy, the SI buffer really shouldn't get
anywhere near full, although I suppose it could happen under extreme
conditions.  The case where the buffer tends to fill is when one or
more backends are sitting idle, waiting for a client command.  They'll
only read SI messages during transaction start (or, in 6.6, after
acquiring a lock), so an idle backend blocks the SI buffer from
emptying.  The point of the SIGUSR2 broadcast is to kick-start idle
backends so that they will execute a transaction and receive their
SI messages.

I am guessing that you had an idle backend connected to some other
database of the same postmaster, so that the SI buffer was getting
full of the messages being generated by the VACUUM and table create/
delete processes.  If you had no idle backends then we ought to look
for the reason for the SI buffer filling far enough to cause SIGUSR2;
but if you did then it's a normal condition.

BTW, if you have a backend stopped due to a gdb breakpoint or trap,
it's certainly not consuming SI messages.  So when you left it sit after
observing the SIGUSR2, you altered the behavior.  The VACUUM process was
still generating SI messages, and even though the hypothetical idle
backend would have eaten its messages thanks to SIGUSR2, the one you had
blocked with gdb stopped doing so.  So eventually there would have been
an SI overflow condition.  I think that would take longer than "3-5
seconds", though, unless your machine is way faster than mine.

If the SI buffer does overflow because someone isn't eating his messages
promptly, it's not supposed to be fatal.  Rather, the SI manager
reports to all the backends "Sorry, I've lost track of what's going on,
so I suggest you discard *all* your cached tuples."  However, 6.5.* has
some bugs in the shared cache reset process.  (I think I have fixed these
for 6.6, but there are enough poorly-tested revisions in the affected
modules that we'd decided not to risk trying to back-patch 6.5.*.)

Anyway, back to the point: this is all very suggestive that maybe what
you are seeing is SI overflow and a consequent failure.  But I'm not
convinced of it, partly because of the timing issue and partly because
the vacuum process would have issued a "NOTICE: SIReadEntryData: cache
state reset" message before entering the buggy code.  I didn't see one
in your trace.  However, if you find instances of this message in
your postmaster logfile, then it's definitely a possible cause.
(I would still wonder why SI overflow is occurring in the normal case
where you're not using gdb to block a backend from responding to
SIGUSR2.)

I hope this gives you enough info to poke at the problem more
intelligently.

Lastly, did you build with --enable-cassert?  The assert checks slow things
down a little, but are often real helpful when looking for backend bugs.
        regards, tom lane


Re: [HACKERS] Vacuum analyze bug CAUGHT

From:
Tom Lane
Date:
I wrote:
> appears to be the same bug I alluded to before: memory allocation isn't
> cleaned up properly if elog(ERROR) is executed outside a transaction.
> I know how to fix this, and will do so before 6.5.2, but fixing it will
> just prevent a coredump after an error has already occurred.

Here is the patch to fix that problem; line numbers are for current,
but it should apply to 6.5.1 with small offsets.

Also: have you applied the vc_abort patch discussed a month ago (see
my post in pgsql-patches, 11 Aug)?  If not, that could well be the
source of your troubles.  You might want to just grab the 6.5.2-beta
tarball and apply this patch, or even better pull the current state
of the REL6_5_PATCHES branch from the CVS server.
        regards, tom lane

*** src/include/utils/portal.h.orig    Thu Jul 15 21:11:26 1999
--- src/include/utils/portal.h    Thu Sep  9 11:59:00 1999
***************
*** 75,80 ****
--- 75,81 ----
  extern void PortalDestroy(Portal *portalP);
  extern void StartPortalAllocMode(AllocMode mode, Size limit);
  extern void EndPortalAllocMode(void);
  
+ extern void PortalResetHeapMemory(Portal portal);
  extern PortalVariableMemory PortalGetVariableMemory(Portal portal);
  extern PortalHeapMemory PortalGetHeapMemory(Portal portal);
*** src/backend/utils/mmgr/portalmem.c.orig    Sat Jul 17 23:20:03 1999
--- src/backend/utils/mmgr/portalmem.c    Thu Sep  9 11:59:36 1999
***************
*** 83,89 ****
  static void CollectNamedPortals(Portal *portalP, int destroy);
  static Portal PortalHeapMemoryGetPortal(PortalHeapMemory context);
  static PortalVariableMemory PortalHeapMemoryGetVariableMemory(PortalHeapMemory context);
  
- static void PortalResetHeapMemory(Portal portal);
  static Portal PortalVariableMemoryGetPortal(PortalVariableMemory context);
  
  /* ----------------
--- 83,88 ----
***************
*** 838,844 ****
   *        BadArg if mode is invalid.
   * ----------------
   */
! static void
  PortalResetHeapMemory(Portal portal)
  {
      PortalHeapMemory context;
--- 837,843 ----
   *        BadArg if mode is invalid.
   * ----------------
   */
! void
  PortalResetHeapMemory(Portal portal)
  {
      PortalHeapMemory context;
*** src/backend/access/transam/xact.c.orig    Sun Sep  5 13:12:34 1999
--- src/backend/access/transam/xact.c    Thu Sep  9 12:00:23 1999
***************
*** 694,712 ****
  AtCommit_Memory()
  {
      Portal        portal;
-     MemoryContext portalContext;
  
      /* ----------------
!      *    Release memory in the blank portal.
!      *    Since EndPortalAllocMode implicitly works on the current context,
!      *    first make real sure that the blank portal is the selected context.
!      *    (This is probably not necessary, but seems like a good idea...)
       * ----------------
       */
      portal = GetPortalByName(NULL);
!     portalContext = (MemoryContext) PortalGetHeapMemory(portal);
!     MemoryContextSwitchTo(portalContext);
!     EndPortalAllocMode();
  
      /* ----------------
       *    Now that we're "out" of a transaction, have the
--- 694,706 ----
  AtCommit_Memory()
  {
      Portal        portal;
  
      /* ----------------
!      *    Release all heap memory in the blank portal.
       * ----------------
       */
      portal = GetPortalByName(NULL);
!     PortalResetHeapMemory(portal);
  
      /* ----------------
       *    Now that we're "out" of a transaction, have the
***************
*** 784,802 ****
  AtAbort_Memory()
  {
      Portal        portal;
-     MemoryContext portalContext;
  
      /* ----------------
!      *    Release memory in the blank portal.
!      *    Since EndPortalAllocMode implicitly works on the current context,
!      *    first make real sure that the blank portal is the selected context.
!      *    (This is ESSENTIAL in case we aborted from someplace where it wasn't.)
       * ----------------
       */
      portal = GetPortalByName(NULL);
!     portalContext = (MemoryContext) PortalGetHeapMemory(portal);
!     MemoryContextSwitchTo(portalContext);
!     EndPortalAllocMode();
  
      /* ----------------
       *    Now that we're "out" of a transaction, have the
--- 778,790 ----
  AtAbort_Memory()
  {
      Portal        portal;
  
      /* ----------------
!      *    Release all heap memory in the blank portal.
       * ----------------
       */
      portal = GetPortalByName(NULL);
!     PortalResetHeapMemory(portal);
  
      /* ----------------
       *    Now that we're "out" of a transaction, have the


Re: [HACKERS] Vacuum analyze bug CAUGHT

From:
Michael Simms
Date:
> The first thing you can do is look to see what the error message is ---
> it should be in the postmaster's stderr logfile, even though 6.5.* libpq
> omits to display it because of the crash.

Woops, I only redirected stdout. I will redirect stderr too. *2 minutes and
one crash later* Hmmm, nothing appears in the logs.

/usr/bin/postmaster -S -N 128 -B 256 -D/var/lib/pgsql/data > /tmp/postmasterout 2> /tmp/postmastererr

And nothing was in either log.

> Another useful thing would be
> to set a breakpoint at elog() so that you can examine the context of the
> error report.  (Actually, since vacuum generates lots of elog(NOTICE)
> and elog(DEBUG), the breakpoint had better be further down in elog,
> perhaps where it's about to longjmp back to PostgresMain.)
> 
> BTW, VACUUM runs a separate transaction for each table it works on,
> so although most of its work is done inside a transaction, there are
> short intervals between tables where it's not in one.  The error must
> be getting reported during one of these intervals.  That narrows things
> down a lot.
> 

<snip SI information>

Now, let me think for a moment:

Vacuum works on each table inside a transaction

The backend only reads the SI buffer when it starts a new transaction

What then happens if vacuum is vacuuming a BIG table (such as 300,000
rows) whilst another process is creating and dropping tables a lot?

Wouldn't the buffer fill up, as vacuum never starts a new transaction
while it is working on that big table?

However, those were the only two backends active; it is a test
database on my home machine.

> an SI overflow condition.  I think that would take longer than "3-5
> seconds", though, unless your machine is way faster than mine.

I've got an AMD K6-2 400 with 128 MB of memory, not slow but not roastingly
fast either.

> I hope this gives you enough info to poke at the problem more
> intelligently.
> 
> Lastly, did you build with --enable-cassert?  The assert checks slow things
> down a little, but are often real helpful when looking for backend bugs.

Nope, I will recompile the new beta with this option, and post on the
progress. Thanks
                M Simms


RE: [HACKERS] Vacuum analyze bug CAUGHT

From:
"Hiroshi Inoue"
Date:
Hi

VACUUM can't preserve consistency without locking.
I'm concerned about locking for system tables.

> 
> Hi
> 
> Okee, I have caught the vacuum analyse crash that was giving me a load of
> grief
>

[snip]
> 
> Then, in psql, I run and get the following result:
> 
> -----------
> 
> crashtest=> vacuum analyze;
> NOTICE:  Rel pg_type: TID 4/3: InsertTransactionInProgress 129915 
> - can't shrink relation
> NOTICE:  Rel pg_attribute: TID 23/5: InsertTransactionInProgress 
> 129915 - can't shrink relation
> NOTICE:  Rel pg_attribute: TID 23/6: InsertTransactionInProgress 
> 129915 - can't shrink relation
> NOTICE:  Rel pg_attribute: TID 23/7: InsertTransactionInProgress 
> 129915 - can't shrink relation
> NOTICE:  Rel pg_attribute: TID 23/8: InsertTransactionInProgress 
> 129915 - can't shrink relation
> NOTICE:  Rel pg_attribute: TID 23/9: InsertTransactionInProgress 
> 129915 - can't shrink relation
> NOTICE:  Rel pg_attribute: TID 23/10: InsertTransactionInProgress 
> 129915 - can't shrink relation
> NOTICE:  Rel pg_attribute: TID 23/11: InsertTransactionInProgress 
> 129915 - can't shrink relation
> NOTICE:  Rel pg_attribute: TID 23/12: InsertTransactionInProgress 
> 129915 - can't shrink relation
> NOTICE:  Rel pg_class: TID 3/22: InsertTransactionInProgress 
> 129915 - can't shrink relation

CREATE TABLE doesn't lock system tables till end of transaction.
It's a cause of these NOTICE messages.

Should we lock system tables till end of transaction ?

Moreover CREATE TABLE doesn't acquire any lock for pg_attribute
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
while tuples are inserted into pg_attribute.
Concurrent vacuum may corrupt pg_attribute. 

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp 



Re: [HACKERS] Vacuum analyze bug CAUGHT

From:
Vadim Mikheev
Date:
Hiroshi Inoue wrote:
> 
> > crashtest=> vacuum analyze;
> > NOTICE:  Rel pg_type: TID 4/3: InsertTransactionInProgress 129915
> > - can't shrink relation
...
> 
> CREATE TABLE doesn't lock system tables till end of transaction.
> It's a cause of these NOTICE messages.
> 
> Should we lock system tables till end of transaction ?

No, if we allow DDL statements inside BEGIN/END
(in long transaction).

> Moreover CREATE TABLE doesn't acquire any lock for pg_attribute
>                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> while tuples are inserted into pg_attribute.
> Concurrent vacuum may corrupt pg_attribute.

Should be fixed!

Vadim


RE: [HACKERS] Vacuum analyze bug CAUGHT

From:
"Hiroshi Inoue"
Date:
>
> Hiroshi Inoue wrote:
> >
> > > crashtest=> vacuum analyze;
> > > NOTICE:  Rel pg_type: TID 4/3: InsertTransactionInProgress 129915
> > > - can't shrink relation
> ...
> >
> > CREATE TABLE doesn't lock system tables till end of transaction.
> > It's a cause of these NOTICE messages.
> >
> > Should we lock system tables till end of transaction ?
>
> No, if we allow DDL statements inside BEGIN/END
> (in long transaction).
>
> > Moreover CREATE TABLE doesn't acquire any lock for pg_attribute
> >                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > while tuples are inserted into pg_attribute.
> > Concurrent vacuum may corrupt pg_attribute.
>
> Should be fixed!
>

Seems CREATE TABLE don't acquire any lock for pg_relcheck and
pg_attrdef as well as pg_attribute. There may be other pg_.......

Here is a patch.
This patch also removes UnlockRelation() in heap_destroy_with_catalog().

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

*** catalog/heap.c.orig    Tue Sep  7 08:52:04 1999
--- catalog/heap.c    Fri Sep 10 16:43:18 1999
***************
*** 547,552 ****
--- 547,553 ----
       */
      Assert(rel);
      Assert(rel->rd_rel);
+     LockRelation(rel, AccessExclusiveLock);
      hasindex = RelationGetForm(rel)->relhasindex;
      if (hasindex)
          CatalogOpenIndices(Num_pg_attr_indices, Name_pg_attr_indices, idescs);
***************
*** 607,612 ****
--- 608,614 ----
          dpp++;
      }
 
+     UnlockRelation(rel, AccessExclusiveLock);
      heap_close(rel);
 
      /*
***************
*** 1330,1336 ****
 
      rel->rd_nonameunlinked = TRUE;
 
-     UnlockRelation(rel, AccessExclusiveLock);
 
      heap_close(rel);
 
--- 1332,1337 ----
***************
*** 1543,1553 ****
--- 1544,1556 ----
      values[Anum_pg_attrdef_adbin - 1] = PointerGetDatum(textin(attrdef->adbin));
      values[Anum_pg_attrdef_adsrc - 1] = PointerGetDatum(textin(attrdef->adsrc));
      adrel = heap_openr(AttrDefaultRelationName);
+     LockRelation(adrel, AccessExclusiveLock);
      tuple = heap_formtuple(adrel->rd_att, values, nulls);
      CatalogOpenIndices(Num_pg_attrdef_indices, Name_pg_attrdef_indices, idescs);
      heap_insert(adrel, tuple);
      CatalogIndexInsert(idescs, Num_pg_attrdef_indices, adrel, tuple);
      CatalogCloseIndices(Num_pg_attrdef_indices, idescs);
+     UnlockRelation(adrel, AccessExclusiveLock);
      heap_close(adrel);
 
      pfree(DatumGetPointer(values[Anum_pg_attrdef_adbin - 1]));
***************
*** 1606,1616 ****
--- 1609,1621 ----
      values[Anum_pg_relcheck_rcbin - 1] = PointerGetDatum(textin(check->ccbin));
      values[Anum_pg_relcheck_rcsrc - 1] = PointerGetDatum(textin(check->ccsrc));
      rcrel = heap_openr(RelCheckRelationName);
+     LockRelation(rcrel, AccessExclusiveLock);
      tuple = heap_formtuple(rcrel->rd_att, values, nulls);
      CatalogOpenIndices(Num_pg_relcheck_indices, Name_pg_relcheck_indices, idescs);
      heap_insert(rcrel, tuple);
      CatalogIndexInsert(idescs, Num_pg_relcheck_indices, rcrel, tuple);
      CatalogCloseIndices(Num_pg_relcheck_indices, idescs);
+     UnlockRelation(rcrel, AccessExclusiveLock);
      heap_close(rcrel);
 
      pfree(DatumGetPointer(values[Anum_pg_relcheck_rcname - 1]));





Re: [HACKERS] Vacuum analyze bug CAUGHT

From:
Vadim Mikheev
Date:
Hiroshi Inoue wrote:
> 
...

> Here is a patch.
> This patch also removes UnlockRelation() in heap_destroy_with_catalog().

Marc, I would grant to Hiroshi full CVS access...

Vadim


Re: [HACKERS] Vacuum analyze bug CAUGHT

From:
Tom Lane
Date:
Michael Simms <grim@argh.demon.co.uk> writes:
> Now, let me think for a moment:
> Vacuum works on each table inside a transaction
> The backend only reads the SI buffer when it starts a new transaction
> What then happens if vacuum is vacuuming a BIG table (such as 300,000
> lines) whilst another process is doing create and drop tables a lot.
> Wouldnt the buffer fill up, as it was never starting a transaction
> when vacuuming that big table?

Yup, could happen.  (I think it would take several hundred create/
drop cycles, but that's certainly possible during a long vacuum.)
That's why there's code to deal with the possibility of SI buffer
overrun.

But as I said, I'm not convinced you are dealing with an SI overrun
--- and the lack of messages about it seems to point away from that
theory.  I brought it up because it was a possible area for trouble.
        regards, tom lane


Re: [HACKERS] Vacuum analyze bug CAUGHT

From:
Tom Lane
Date:
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
>>>> Moreover CREATE TABLE doesn't acquire any lock for pg_attribute
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> while tuples are inserted into pg_attribute.
>>>> Concurrent vacuum may corrupt pg_attribute.
>> 
>> Should be fixed!
>> 

> Seems CREATE TABLE don't acquire any lock for pg_relcheck and
> pg_attrdef as well as pg_attribute. There may be other pg_.......

> Here is a patch.

Hmm, do we really need to grab AccessExclusiveLock on the pg_ tables
while creating or deleting tables?  That will mean no concurrency at
all for these operations.  Seems to me we want AccessExclusiveLock on
the table being created or deleted, but something less strong on the
system tables.  RowExclusiveLock might be appropriate --- Vadim, what
do you think?

Also, rather than running around and adding locks to every single
place that calls heap_open or heap_close, I wonder whether we shouldn't
have heap_open/heap_close themselves automatically grab or release
at least a minimal lock (AccessShareLock, I suppose).

Or maybe better: change heap_open/heap_openr/heap_close to take an
additional parameter specifying the kind of lock to grab.  That'd still
mean having to visit all the call sites, but it would force people to
think about the issue in future rather than forgetting to lock a table
they're accessing.

Comments?

BTW, while I still haven't been able to reproduce Michael Simms' crash
reliably, I did see one coredump caused by an assert failure in
heap_delete():

#6  0x16181c in ExceptionalCondition (conditionName=0x283d4 "!(( lp)->lp_flags & 0x01)", exceptionP=0x40009a80, detail=0x0, fileName=0x7ae4 "\003", lineNumber=1121) at assert.c:72
#7  0x7cc18 in heap_delete (relation=0x400891c0, tid=0x402d962c, ctid=0x0) at heapam.c:1121
#8  0x9c208 in DeleteAttributeTuples (rel=0x40535260) at heap.c:1118
#9  0x9c4dc in heap_destroy_with_catalog (relname=0x4e4ed7 <Address 0x4e4ed7 out of bounds>) at heap.c:1310
#10 0xa5168 in RemoveRelation (name=0x80db9380 <Address 0x80db9380 out of bounds>) at creatinh.c:157
#11 0x129760 in ProcessUtility (parsetree=0x402d8d28, dest=Remote) at utility.c:215

This was in the process doing table creates/drops, and I surmise that
the problem was a tuple move executed concurrently by the process doing
VACUUM.  In other words, it looks like this problem of missing lock
operations might be the cause, or one cause, of Michael's symptoms.
        regards, tom lane


Re: [HACKERS] Vacuum analyze bug CAUGHT

From:
Tom Lane
Date:
I wrote:
> This was in the process doing table creates/drops, and I surmise that
> the problem was a tuple move executed concurrently by the process doing
> VACUUM.  In other words, it looks like this problem of missing lock
> operations might be the cause, or one cause, of Michael's symptoms.

On looking closer, that's not so, because the particular path taken here
*does* have a lock --- DeleteAttributeTuples() acquires
AccessExclusiveLock on pg_attribute, which is the relation heap_delete
is failing to find a tuple in.  The tuple it's trying to delete was
located by means of SearchSysCacheTupleCopy().

What I now think is that we have a variant of the SI-too-late problem:
vacuum has moved the underlying tuple, but the backend trying to do
the deletion hasn't heard about it yet, because it hasn't executed
a transaction start or CommandCounterIncrement since VACUUM processed
the table.  This is bolstered by the postmaster log, which shows the
second backend dying just as VACUUM commits pg_attribute:

DEBUG:  Rel pg_type: Pages: 6 --> 2; Tuple(s) moved: 1. Elapsed 0/0 sec.
DEBUG:  Index pg_type_typname_index: Pages 5; Tuples 116: Deleted 1. Elapsed 0/0 sec.
DEBUG:  Index pg_type_oid_index: Pages 2; Tuples 116: Deleted 1. Elapsed 0/0 sec.
DEBUG:  --Relation pg_attribute--
DEBUG:  Pages 33: Changed 1, Reapped 28, Empty 0, New 0; Tup 438: Vac 1976, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 97, MaxLen 97; Re-using: Free/Avail. Space 214444/207148; EndEmpty/Avail. Pages 0/27. Elapsed 0/0 sec.
DEBUG:  Index pg_attribute_attrelid_index: Pages 15; Tuples 438: Deleted 1976. Elapsed 0/1 sec.
DEBUG:  Index pg_attribute_relid_attnum_index: Pages 15; Tuples 438: Deleted 1976. Elapsed 0/0 sec.
DEBUG:  Index pg_attribute_relid_attnam_index: Pages 48; Tuples 438: Deleted 1976. Elapsed 0/1 sec.
DEBUG:  Rel pg_attribute: Pages: 33 --> 6; Tuple(s) moved: 8. Elapsed 0/0 sec.
DEBUG:  Index pg_attribute_attrelid_index: Pages 15; Tuples 438: Deleted 8. Elapsed 0/0 sec.
DEBUG:  Index pg_attribute_relid_attnum_index: Pages 15; Tuples 438: Deleted 8. Elapsed 0/0 sec.
DEBUG:  Index pg_attribute_relid_attnam_index: Pages 48; Tuples 438: Deleted 8. Elapsed 0/0 sec.
TRAP: Failed Assertion("!(( lp)->lp_flags & 0x01):", File: "heapam.c", Line: 1121)

!(( lp)->lp_flags & 0x01) (0) [Not a typewriter]
DEBUG:  --Relation pg_proc--
DEBUG:  Pages 21: Changed 0, Reapped 0, Empty 0, New 0; Tup 1021: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 145, MaxLen 197; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. Elapsed 0/0 sec.
(vacuum manages to get through a couple more tables before hearing the
"thou shalt exit" signal from the postmaster...)


It's looking to me like there may be no way to fix this in 6.5.*
short of adopting the recent 6.6 relcache/SI changes.  Specifically,
the one we need is reading SI messages after acquiring a lock, but
I doubt we can pull out just that one without the rest.

I'm not real eager to do this given the little amount of testing
those changes have had, but maybe we have no choice...
        regards, tom lane


Re: [HACKERS] Vacuum analyze bug CAUGHT

From:
Vadim Mikheev
Date:
Tom Lane wrote:
> 
> Also, rather than running around and adding locks to every single
> place that calls heap_open or heap_close, I wonder whether we shouldn't
> have heap_open/heap_close themselves automatically grab or release
> at least a minimal lock (AccessShareLock, I suppose).

This could result in deadlocks...

> Or maybe better: change heap_open/heap_openr/heap_close to take an
> additional parameter specifying the kind of lock to grab.  That'd still
> mean having to visit all the call sites, but it would force people to
> think about the issue in future rather than forgetting to lock a table
> they're accessing.

This way is better.

Vadim


Re: [HACKERS] Vacuum analyze bug CAUGHT

From:
Bruce Momjian
Date:
> Tom Lane wrote:
> > 
> > Also, rather than running around and adding locks to every single
> > place that calls heap_open or heap_close, I wonder whether we shouldn't
> > have heap_open/heap_close themselves automatically grab or release
> > at least a minimal lock (AccessShareLock, I suppose).
> 
> This could result in deadlocks...
> 
> > Or maybe better: change heap_open/heap_openr/heap_close to take an
> > additional parameter specifying the kind of lock to grab.  That'd still
> > mean having to visit all the call sites, but it would force people to
> > think about the issue in future rather than forgetting to lock a table
> > they're accessing.
> 
> This way is better.

Just a reminder.  heap_getnext() already locks the _buffer_, and
heap_fetch() requires you pass a variable to hold the buffer number, so
you can release the buffer lock when you are done.

This was not the case in < 6.4 releases, and there is no reason not to
add additional parameters to function calls like I did for heap_fetch() if
it makes sense.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [HACKERS] Vacuum analyze bug CAUGHT

From:
Tom Lane
Date:
Bruce Momjian <maillist@candle.pha.pa.us> writes:
>> Tom Lane wrote:
>>>> Or maybe better: change heap_open/heap_openr/heap_close to take an
>>>> additional parameter specifying the kind of lock to grab.

>> This way is better.

> ... there is no reason not to add additional parameters to function
> calls like I did for heap_fetch() if it makes sense.

OK.  Another thing that's been on my to-do list is that a lot of places
fail to check for a failure return from heap_open(r), which means you
get a null pointer dereference coredump instead of a useful message if
anything goes wrong.  (But, of course, when opening a system table
nothing can ever go wrong ... go wrogn ...)
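
(As a purely hypothetical illustration of that failure mode, with made-up
surrounding code rather than a quote from any actual call site:)

    Relation    rel;
    bool        hasindex;

    /* 6.5.* behavior: heap_openr() returns NULL if the open fails, and an
     * unchecked call site dereferences the NULL pointer ... */
    rel = heap_openr("pg_attribute");
    hasindex = RelationGetForm(rel)->relhasindex;   /* SIGSEGV, not a useful elog */

    /* ... whereas a careful call site has to do this today: */
    rel = heap_openr("pg_attribute");
    if (!RelationIsValid(rel))
        elog(ERROR, "cannot open pg_attribute");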

Here's what I propose:

Add another parameter to heap_open and heap_openr, which can be any of
the lock types currently mentioned in storage/lmgr.h, or "NoLock".
With "NoLock" you get the current behavior: no lock is acquired, and
NULL is returned if the open fails; it's up to the caller to check that
and do something appropriate.  Otherwise, the routines will check for
open failure and raise a standard elog(ERROR) if they do not succeed;
furthermore, they will acquire the specified type of lock on the
relation before returning.  (And, thanks to the already-in-place call
in LockRelation, any pending SI-inval messages will be read.)

heap_close will also take an additional parameter, which is a lock type
to release the specified lock, or NoLock to release no lock.  (Note
that it is often correct not to release the lock acquired at heap_open
during heap_close; in this case, the lock is held till end of
transaction.  So, we don't want heap_close to automagically release
whatever lock was acquired by the corresponding heap_open, even if it
were easy to do so which it isn't...)
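
To make that concrete, here is a rough sketch of a call site under the proposed
convention; the lock-mode names are the existing lmgr.h ones, but the second
arguments to heap_openr()/heap_close() are hypothetical until the change is
actually made:

    Relation    attrel;

    /* Open, lock, and (via LockRelation) absorb pending SI messages in one
     * step; a failed open would be reported with elog(ERROR), not NULL. */
    attrel = heap_openr("pg_attribute", RowExclusiveLock);

    /* ... insert or delete pg_attribute tuples here ... */

    /* Hold the lock until end of transaction: release nothing at close. */
    heap_close(attrel, NoLock);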

If I don't hear any objections, I'll get on with implementing that.
        regards, tom lane


Re: [HACKERS] Vacuum analyze bug CAUGHT

From:
Bruce Momjian
Date:
Sounds like a good plan.  I found it quite easy to make changes like
this because mkid, and your tool glimpse, let you pull up all functions
that contain a certain function or a call to that function, pull them into
an editor, and away you go.


> Here's what I propose:
> 
> Add another parameter to heap_open and heap_openr, which can be any of
> the lock types currently mentioned in storage/lmgr.h, or "NoLock".
> With "NoLock" you get the current behavior: no lock is acquired, and
> NULL is returned if the open fails; it's up to the caller to check that
> and do something appropriate.  Otherwise, the routines will check for
> open failure and raise a standard elog(ERROR) if they do not succeed;
> furthermore, they will acquire the specified type of lock on the
> relation before returning.  (And, thanks to the already-in-place call
> in LockRelation, any pending SI-inval messages will be read.)
> 
> heap_close will also take an additional parameter, which is a lock type
> to release the specified lock, or NoLock to release no lock.  (Note
> that it is often correct not to release the lock acquired at heap_open
> during heap_close; in this case, the lock is held till end of
> transaction.  So, we don't want heap_close to automagically release
> whatever lock was acquired by the corresponding heap_open, even if it
> were easy to do so which it isn't...)
> 
> If I don't hear any objections, I'll get on with implementing that.
> 
>             regards, tom lane
> 


--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [HACKERS] Vacuum analyze bug CAUGHT

From:
Bruce Momjian
Date:
Is this patch still valid?


[Charset iso-8859-1 unsupported, filtering to ASCII...]
> >
> > Hiroshi Inoue wrote:
> > >
> > > > crashtest=> vacuum analyze;
> > > > NOTICE:  Rel pg_type: TID 4/3: InsertTransactionInProgress 129915
> > > > - can't shrink relation
> > ...
> > >
> > > CREATE TABLE doesn't lock system tables till end of transaction.
> > > It's a cause of these NOTICE messages.
> > >
> > > Should we lock system tables till end of transaction ?
> >
> > No, if we allow DDL statements inside BEGIN/END
> > (in long transaction).
> >
> > > Moreover CREATE TABLE doesn't acquire any lock for pg_attribute
> > >                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > while tuples are inserted into pg_attribute.
> > > Concurrent vacuum may corrupt pg_attribute.
> >
> > Should be fixed!
> >
>
> Seems CREATE TABLE don't acquire any lock for pg_relcheck and
> pg_attrdef as well as pg_attribute. There may be other pg_.......
>
> Here is a patch.
> This patch also removes UnlockRelation() in heap_destroy_with_catalog().
>
> Regards.
>
> Hiroshi Inoue
> Inoue@tpf.co.jp
>
> *** catalog/heap.c.orig    Tue Sep  7 08:52:04 1999
> --- catalog/heap.c    Fri Sep 10 16:43:18 1999
> ***************
> *** 547,552 ****
> --- 547,553 ----
>        */
>       Assert(rel);
>       Assert(rel->rd_rel);
> +     LockRelation(rel, AccessExclusiveLock);
>       hasindex = RelationGetForm(rel)->relhasindex;
>       if (hasindex)
>           CatalogOpenIndices(Num_pg_attr_indices, Name_pg_attr_indices, idescs);
> ***************
> *** 607,612 ****
> --- 608,614 ----
>           dpp++;
>       }
>
> +     UnlockRelation(rel, AccessExclusiveLock);
>       heap_close(rel);
>
>       /*
> ***************
> *** 1330,1336 ****
>
>       rel->rd_nonameunlinked = TRUE;
>
> -     UnlockRelation(rel, AccessExclusiveLock);
>
>       heap_close(rel);
>
> --- 1332,1337 ----
> ***************
> *** 1543,1553 ****
> --- 1544,1556 ----
>       values[Anum_pg_attrdef_adbin - 1] =
> PointerGetDatum(textin(attrdef->adbin));
>       values[Anum_pg_attrdef_adsrc - 1] =
> PointerGetDatum(textin(attrdef->adsrc));
>       adrel = heap_openr(AttrDefaultRelationName);
> +     LockRelation(adrel, AccessExclusiveLock);
>       tuple = heap_formtuple(adrel->rd_att, values, nulls);
>       CatalogOpenIndices(Num_pg_attrdef_indices, Name_pg_attrdef_indices,
> idescs);
>       heap_insert(adrel, tuple);
>       CatalogIndexInsert(idescs, Num_pg_attrdef_indices, adrel, tuple);
>       CatalogCloseIndices(Num_pg_attrdef_indices, idescs);
> +     UnlockRelation(adrel, AccessExclusiveLock);
>       heap_close(adrel);
>
>       pfree(DatumGetPointer(values[Anum_pg_attrdef_adbin - 1]));
> ***************
> *** 1606,1616 ****
> --- 1609,1621 ----
>       values[Anum_pg_relcheck_rcbin - 1] =
> PointerGetDatum(textin(check->ccbin));
>       values[Anum_pg_relcheck_rcsrc - 1] =
> PointerGetDatum(textin(check->ccsrc));
>       rcrel = heap_openr(RelCheckRelationName);
> +     LockRelation(rcrel, AccessExclusiveLock);
>       tuple = heap_formtuple(rcrel->rd_att, values, nulls);
>       CatalogOpenIndices(Num_pg_relcheck_indices, Name_pg_relcheck_indices,
> idescs);
>       heap_insert(rcrel, tuple);
>       CatalogIndexInsert(idescs, Num_pg_relcheck_indices, rcrel, tuple);
>       CatalogCloseIndices(Num_pg_relcheck_indices, idescs);
> +     UnlockRelation(rcrel, AccessExclusiveLock);
>       heap_close(rcrel);
>
>       pfree(DatumGetPointer(values[Anum_pg_relcheck_rcname - 1]));
>
>
>
>
> ************
>
>


--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


RE: [HACKERS] Vacuum analyze bug CAUGHT

From:
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Bruce Momjian [mailto:maillist@candle.pha.pa.us]
> Sent: Tuesday, September 28, 1999 1:11 PM
> To: Hiroshi Inoue
> Cc: Vadim Mikheev; Michael Simms; pgsql-hackers@postgreSQL.org
> Subject: Re: [HACKERS] Vacuum analyze bug CAUGHT
> 
> 
> 
> Is this patch still valid?
>

No, it has already been changed by Tom, together with
many other changes.

Regards. 

Hiroshi Inoue
Inoue@tpf.co.jp