BUG #14053: postgres crashes in plpgsql under load of concurrent transactions

Поиск
Список
Период
Сортировка
От maxim@jeslalabs.com
Тема BUG #14053: postgres crashes in plpgsql under load of concurrent transactions
Дата
Msg-id 20160330070414.8944.52106@wrigleys.postgresql.org
обсуждение исходный текст
Ответы Re: BUG #14053: postgres crashes in plpgsql under load of concurrent transactions  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
The following bug has been logged on the website:

Bug reference:      14053
Logged by:          Maxim Zakharov
Email address:      maxim@jeslalabs.com
PostgreSQL version: 9.4.6
Operating system:   Ubuntu
Description:

2016-03-30 14:30:09 AEDT [21935-17] LOG:  database system is ready to accept
connections
2016-03-30 15:14:08 AEDT [21935-18] LOG:  server process (PID 4070) was
terminated by signal 11: Segmentation fault
2016-03-30 15:14:08 AEDT [21935-19] LOG:  terminating any other active
server processes
2016-03-30 15:14:08 AEDT [5441-1] root@db_purgora WARNING:  terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [5441-2] root@db_purgora DETAIL:  The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [5441-3] root@db_purgora HINT:  In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [5402-1] root@db_purgora WARNING:  terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [5402-2] root@db_purgora DETAIL:  The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [5402-3] root@db_purgora HINT:  In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [5419-1] root@db_purgora WARNING:  terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [5419-2] root@db_purgora DETAIL:  The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [5419-3] root@db_purgora HINT:  In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [5400-1] root@db_purgora WARNING:  terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [5400-2] root@db_purgora DETAIL:  The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [5400-3] root@db_purgora HINT:  In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [5398-1] root@db_purgora WARNING:  terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [5398-2] root@db_purgora DETAIL:  The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [5398-3] root@db_purgora HINT:  In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [5395-1] root@db_purgora WARNING:  terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [5395-2] root@db_purgora DETAIL:  The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [5395-3] root@db_purgora HINT:  In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [5385-1] root@db_purgora WARNING:  terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [5385-2] root@db_purgora DETAIL:  The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [5385-3] root@db_purgora HINT:  In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [5377-1] root@db_purgora WARNING:  terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [5377-2] root@db_purgora DETAIL:  The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [5377-3] root@db_purgora HINT:  In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [3855-1] root@db_purgora WARNING:  terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [3855-2] root@db_purgora DETAIL:  The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [3855-3] root@db_purgora HINT:  In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [29127-2] WARNING:  terminating connection because
of crash of another server process
2016-03-30 15:14:08 AEDT [29127-3] DETAIL:  The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2016-03-30 15:14:08 AEDT [29127-4] HINT:  In a moment you should be able to
reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [29145-1] root@db_purgora WARNING:  terminating
connection because of crash of another server process
2016-03-30 15:14:08 AEDT [29145-2] root@db_purgora DETAIL:  The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2016-03-30 15:14:08 AEDT [29145-3] root@db_purgora HINT:  In a moment you
should be able to reconnect to the database and repeat your command.
2016-03-30 15:14:08 AEDT [21935-20] LOG:  all server processes terminated;
reinitializing
2016-03-30 15:14:09 AEDT [8506-1] LOG:  database system was interrupted;
last known up at 2016-03-30 15:10:10 AEDT
2016-03-30 15:14:09 AEDT [8507-1] root@db_purgora FATAL:  the database
system is in recovery mode
2016-03-30 15:14:09 AEDT [8508-1] root@db_purgora FATAL:  the database
system is in recovery mode
2016-03-30 15:14:09 AEDT [8509-1] root@db_purgora FATAL:  the database
system is in recovery mode
2016-03-30 15:14:09 AEDT [8510-1] root@db_purgora FATAL:  the database
system is in recovery mode
2016-03-30 15:14:09 AEDT [8511-1] root@db_purgora FATAL:  the database
system is in recovery mode
2016-03-30 15:14:09 AEDT [8512-1] root@db_purgora FATAL:  the database
system is in recovery mode
2016-03-30 15:14:09 AEDT [8506-2] LOG:  database system was not properly
shut down; automatic recovery in progress
2016-03-30 15:14:09 AEDT [8506-3] LOG:  redo starts at 0/F3EB6208
2016-03-30 15:14:09 AEDT [8506-4] LOG:  record with zero length at
0/F418A208
2016-03-30 15:14:09 AEDT [8506-5] LOG:  redo done at 0/F418A1D8
2016-03-30 15:14:09 AEDT [8506-6] LOG:  last completed transaction was at
log time 2016-03-30 15:14:08.091602+11
2016-03-30 15:14:09 AEDT [8506-7] LOG:  MultiXact member wraparound
protections are now enabled
2016-03-30 15:14:09 AEDT [8516-1] LOG:  autovacuum launcher started
2016-03-30 15:14:09 AEDT [21935-21] LOG:  database system is ready to accept
connections


# gdb /usr/lib/postgresql/9.4/bin/postgres
/tmp/core-postgres-11-120-129-4070-1459311248
GNU gdb (Ubuntu 7.10-1ubuntu2) 7.10
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/postgresql/9.4/bin/postgres...Reading symbols
from
/usr/lib/debug/.build-id/e4/3a4b938ce055dcbff82e81a44d43e708d6081f.debug...done.
done.
[New LWP 4070]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: postgres db_purgora [local] SELECT
                        '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f8f42521521 in exec_eval_simple_expr (rettype=<optimised out>,
isNull=0x7ffcef42e143 "",
    result=<synthetic pointer>, expr=0x55d62c60dab8, estate=0x7ffcef42e320)
    at
/build/postgresql-9.4-IgGWiy/postgresql-9.4-9.4.6/build/../src/pl/plpgsql/src/pl_exec.c:5203
5203    /build/postgresql-9.4-IgGWiy/postgresql-9.4-9.4.6/build/../src/pl/plpgsql/src/pl_exec.c:
No such file or directory.
(gdb) p *expr
$1 = {dtype = 5, dno = 0, query = 0x55d62c60db48 "SELECT false", plan =
0x55d62ca07570, paramnos = 0x0,
  func = 0x55d62ca0ccb0, ns = 0x55d62c991e20, expr_simple_expr =
0x55d62ca07b90, expr_simple_generation = 1,
  expr_simple_type = 16, expr_simple_state = 0x0, expr_simple_in_use = 1
'\001', expr_simple_lxid = 0}
(gdb) p curlxid
$2 = <optimised out>
(gdb) p MyProc->lxid
$3 = 0

It looks like both MyProc->lxid (and therefore curlxid) and
expr->expr_simple_lxid are equal to InvalidLocalTransactionId (which is
defined to be 0) and are equal to each other, which means the following code
under if statement of exec_eval_simple_expr() in
src/pl/plpgsql/src/pl_exec.c (line 5158)  is not executed:

    if (expr->expr_simple_lxid != curlxid)
    {
        oldcontext =
MemoryContextSwitchTo(estate->simple_eval_estate->es_query_cxt);
        expr->expr_simple_state = ExecInitExpr(expr->expr_simple_expr, NULL);
        expr->expr_simple_in_use = false;
        expr->expr_simple_lxid = curlxid;
        MemoryContextSwitchTo(oldcontext);
    }

Perhaps the fix should be as follow:
        if (expr->expr_simple_lxid != curlxid || curlxid ==
InvalidLocalTransactionId)
    {
        oldcontext =
MemoryContextSwitchTo(estate->simple_eval_estate->es_query_cxt);
        expr->expr_simple_state = ExecInitExpr(expr->expr_simple_expr, NULL);
        expr->expr_simple_in_use = false;
        expr->expr_simple_lxid = curlxid;
        MemoryContextSwitchTo(oldcontext);
    }

Sorry, I have not checked the validity of this fix.

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Rodrigo Morales Q
Дата:
Сообщение: error to instaal PLTCL
Следующее
От: tsingle@muddypaddock.com
Дата:
Сообщение: BUG #14054: "create index using gist ..." on large table never completes.