Обсуждение: 7.5dev assertion failure w/ v3 protocol and transactions

Поиск
Список
Период
Сортировка

7.5dev assertion failure w/ v3 protocol and transactions

От
Kris Jurka
Дата:
Running the JDBC driver's regression tests I came across an assertion
failure in 7.5dev.  The test does:

BEGIN;
SELECT 1/0;
COMMIT;

and gets the following stacktrace.  Note this cannot be replicated in psql
because it requires using the extended query protocol.

Kris Jurka

#0  0x402a5721 in kill () from /lib/libc.so.6
#1  0x402a54c5 in raise () from /lib/libc.so.6
#2  0x402a69e8 in abort () from /lib/libc.so.6
#3  0x082313c2 in ExceptionalCondition (
    conditionName=0x6 <Address 0x6 out of bounds>,
    errorType=0x82546fb "FailedAssertion", fileName=0x403ab550 "`\022",
    lineNumber=0) at assert.c:50
#4  0x08247342 in AtCleanup_Portals () at portalmem.c:574
#5  0x080a9336 in CleanupTransaction () at xact.c:1597
#6  0x080a9567 in CommitTransactionCommand () at xact.c:1760
#7  0x081bf024 in finish_xact_command () at postgres.c:1842
#8  0x081beb4f in exec_execute_message (portal_name=0x8377401 "",
max_rows=1)
    at postgres.c:1686
#9  0x081c04e2 in PostgresMain (argc=4, argv=0x83329ac,
    username=0x83328b8 "test") at postgres.c:3021
#10 0x081917b6 in BackendRun (port=0x833f898) at postmaster.c:2790
#11 0x08191163 in BackendStartup (port=0x833f898) at postmaster.c:2411
#12 0x0818f43a in ServerLoop () at postmaster.c:1198
#13 0x0818e685 in PostmasterMain (argc=3, argv=0x83324b8) at
postmaster.c:885
#14 0x081587d5 in main (argc=3, argv=0x83324b8) at main.c:265

Re: 7.5dev assertion failure w/ v3 protocol and transactions

От
Kris Jurka
Дата:
On Thu, 29 Jul 2004, Kris Jurka wrote:

>
> Running the JDBC driver's regression tests I came across an assertion
> failure in 7.5dev.  The test does:
>

This also produces a second core file in the directory the postmaster was
started in which contains:

#0  0x402a5721 in kill () from /lib/libc.so.6
#1  0x402a54c5 in raise () from /lib/libc.so.6
#2  0x402a69e8 in abort () from /lib/libc.so.6
#3  0x08231994 in errfinish (dummy=0) at elog.c:473
#4  0x080b1e2c in ReadControlFile () at xlog.c:3130
#5  0x080b2257 in XLOGShmemInit () at xlog.c:3381
#6  0x081afd6a in CreateSharedMemoryAndSemaphores (makePrivate=0 '\0',
    maxBackends=100, port=5750) at ipci.c:135
#7  0x08190073 in reset_shared (port=0) at postmaster.c:1750
#8  0x08190928 in reaper (postgres_signal_arg=17) at postmaster.c:2087
#9  <signal handler called>
#10 0x4034cdd2 in select () from /lib/libc.so.6
#11 0x08331040 in Unix_socket_group ()
#12 0x00000000 in ?? ()
#13 0xbffff7e0 in ?? ()
#14 0x0818f260 in ServerLoop () at postmaster.c:1147
#15 0x0818e685 in PostmasterMain (argc=3, argv=0x8332630) at
postmaster.c:885
#16 0x081587d5 in main (argc=3, argv=0x8332630) at main.c:265

Re: 7.5dev assertion failure w/ v3 protocol and transactions

От
Tom Lane
Дата:
Kris Jurka <books@ejurka.com> writes:
> This also produces a second core file in the directory the postmaster was
> started in which contains:

> #0  0x402a5721 in kill () from /lib/libc.so.6
> #1  0x402a54c5 in raise () from /lib/libc.so.6
> #2  0x402a69e8 in abort () from /lib/libc.so.6
> #3  0x08231994 in errfinish (dummy=0) at elog.c:473
> #4  0x080b1e2c in ReadControlFile () at xlog.c:3130
> #5  0x080b2257 in XLOGShmemInit () at xlog.c:3381

This appears to be here:

        ereport(PANIC,
                (errcode_for_file_access(),
                 errmsg("could not open control file \"%s\": %m",
                        ControlFilePath)));

but it's difficult to go any further without knowing the errno.
I certainly can't think of any good reason why the recovery process
should be unable to open pg_control.  Do you have the postmaster log?

            regards, tom lane

Re: 7.5dev assertion failure w/ v3 protocol and transactions

От
Kris Jurka
Дата:
On Thu, 29 Jul 2004, Tom Lane wrote:

> This appears to be here:
>
>         ereport(PANIC,
>                 (errcode_for_file_access(),
>                  errmsg("could not open control file \"%s\": %m",
>                         ControlFilePath)));
>
> but it's difficult to go any further without knowing the errno.
> I certainly can't think of any good reason why the recovery process
> should be unable to open pg_control.  Do you have the postmaster log?
>

Nothing shows up in the log.  The test suite continues running, pounding
the restarting database with connections so I see:

LOG:  statement: BEGIN
LOG:  statement: select 1/0
ERROR:  division by zero
LOG:  statement: COMMIT
TRAP: FailedAssertion("!(portal->resowner == ((void *)0))", File:
"portalmem.c", Line: 561)
LOG:  server process (PID 7270) was terminated by signal 6
LOG:  terminating any other active server processes
LOG:  background writer process (PID 7103) exited with exit code 1
LOG:  all server processes terminated; reinitializing
LOG:  database system was interrupted at 2004-07-29 14:59:58 PDT
LOG:  checkpoint record is at 0/A471F8
LOG:  redo record is at 0/A471F8; undo record is at 0/0; shutdown TRUE
LOG:  next transaction ID: 492; next OID: 17228
LOG:  database system was not properly shut down; automatic recovery in
progress
FATAL:  the database system is starting up
FATAL:  the database system is starting up

.. many more starting up messages

FATAL:  the database system is starting up
FATAL:  the database system is starting up
LOG:  redo starts at 0/A47238
LOG:  record with zero length at 0/D3599C
LOG:  redo done at 0/D3594C
FATAL:  the database system is starting up
FATAL:  the database system is starting up

.. many more starting up messages

FATAL:  the database system is starting up
FATAL:  the database system is starting up
LOG: database system is ready

Re: 7.5dev assertion failure w/ v3 protocol and transactions

От
Tom Lane
Дата:
Kris Jurka <books@ejurka.com> writes:
> Running the JDBC driver's regression tests I came across an assertion
> failure in 7.5dev.  The test does:

> #3  0x082313c2 in ExceptionalCondition (
>     conditionName=0x6 <Address 0x6 out of bounds>,
>     errorType=0x82546fb "FailedAssertion", fileName=0x403ab550 "`\022",
>     lineNumber=0) at assert.c:50
> #4  0x08247342 in AtCleanup_Portals () at portalmem.c:574
> #5  0x080a9336 in CleanupTransaction () at xact.c:1597

There's no Assert at line 574 in CVS tip; how far back is your copy?
I think this might be related to some changes I made recently, so
please try if it still happens in CVS tip.

            regards, tom lane

Re: 7.5dev assertion failure w/ v3 protocol and transactions

От
Kris Jurka
Дата:
On Thu, 29 Jul 2004, Tom Lane wrote:

> There's no Assert at line 574 in CVS tip; how far back is your copy?
> I think this might be related to some changes I made recently, so
> please try if it still happens in CVS tip.
>

This is odd.  I updated earlier today, and just to make sure I did a make
distclean ; cvs update, and rebuilt now.  I still get the same stacktrace
showing portalmem.c:574, but in the postmaster log the assertion line is
attributed to line 561, which does have an assert.

The error related to the control file was due to an errant script I had
which did an rm -rf data ; initdb -D data from underneath a postmaster
that was in the process of shutting down.

Kris Jurka

Re: 7.5dev assertion failure w/ v3 protocol and transactions

От
Kris Jurka
Дата:
On Thu, 29 Jul 2004, Kris Jurka wrote:

>
>
> On Thu, 29 Jul 2004, Tom Lane wrote:
>
> > There's no Assert at line 574 in CVS tip; how far back is your copy?
> > I think this might be related to some changes I made recently, so
> > please try if it still happens in CVS tip.
> >

The line number seems compiler related although the error is the same.
Building with gcc-2.95 instead of 3.3.4 produces this stacktrace:

Kris Jurka

#0  0x40238781 in kill () from /lib/libc.so.6
#1  0x40238464 in raise () from /lib/libc.so.6
#2  0x40239be1 in abort () from /lib/libc.so.6
#3  0x081ebb4b in Letext () at assert.c:50
#4  0x081fdf18 in AtCleanup_Portals () at portalmem.c:561
#5  0x0809e910 in CleanupTransaction () at xact.c:1500
#6  0x0809eacc in CommitTransactionCommand () at xact.c:1663
#7  0x0818ab0f in finish_xact_command () at postgres.c:1836
#8  0x0818a80c in exec_execute_message (portal_name=0x833b18c "",
max_rows=1)
    at postgres.c:1680
#9  0x0818c127 in PostgresMain (argc=4, argv=0x82f6424,
    username=0x82f6330 "test") at postgres.c:3013
#10 0x08164465 in BackendRun (port=0x8303558) at postmaster.c:2789
#11 0x08163c40 in BackendStartup (port=0x8303558) at postmaster.c:2410
#12 0x08162198 in ServerLoop () at postmaster.c:1197
#13 0x081619ff in PostmasterMain (argc=3, argv=0x82f5f50) at
postmaster.c:884
#14 0x0813330d in main (argc=3, argv=0xbffffd74) at main.c:265

Re: 7.5dev assertion failure w/ v3 protocol and transactions

От
Tom Lane
Дата:
Kris Jurka <books@ejurka.com> writes:
> Running the JDBC driver's regression tests I came across an assertion
> failure in 7.5dev.  The test does:

> BEGIN;
> SELECT 1/0;
> COMMIT;

> and gets the following stacktrace.  Note this cannot be replicated in psql
> because it requires using the extended query protocol.

Fixed --- the problem is just an overprotective Assert().

            regards, tom lane