Re: BUG #13490: Segmentation fault on pg_stat_activity

From Michael Bommarito
Subject Re: BUG #13490: Segmentation fault on pg_stat_activity
Date
Msg-id CAN=rtBhSfzWC4H10pdFLevEWTnu8U8yn_2G14-MmewbjoVPMrg@mail.gmail.com
In reply to Re: BUG #13490: Segmentation fault on pg_stat_activity  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
I compiled 9.5alpha1 from source with --debug, using -O0 -ggdb
-fno-omit-frame-pointer -mno-red-zone, and reset to the default
postgresql.conf. I was able to generate segfaults repeatedly by loading
the pghero dashboard. First, some errors from the pg logs:
========================================================================
2015-07-18 15:07:58 UTC [27112-1] postgres@database ERROR:  attribute number 2 exceeds number of columns 0

2015-07-18 15:07:58 UTC [27112-2] postgres@database STATEMENT:  SELECT application_name AS source, client_addr AS ip, COUNT(*) AS total_connections FROM pg_stat_activity WHERE pid <> pg_backend_pid() GROUP BY application_name, ip ORDER BY COUNT(*) DESC, application_name ASC, client_addr ASC

2015-07-18 15:08:23 UTC [27112-3] postgres@database ERROR:  invalid varattno 66

2015-07-18 15:08:23 UTC [27112-4] postgres@database STATEMENT:  SELECT relname AS table, indexrelname AS index, pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size, idx_scan as index_scans FROM pg_stat_user_indexes ui INNER JOIN pg_index i ON ui.indexrelid = i.indexrelid WHERE NOT indisunique AND idx_scan < 50 ORDER BY pg_relation_size(i.indexrelid) DESC, relname ASC

2015-07-18 15:17:19 UTC [3605-1] postgres@database ERROR:  tupdesc reference 0x2bdd8a8 is not owned by resource owner Portal
========================================================================
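
The ERROR/STATEMENT pairs above are easier to compare once the mail-wrapped lines are folded back together. A minimal sketch, assuming the line prefix seen in this log (timestamp, `UTC`, `[pid-line]`, optional `user@db`, severity); `parse_entries` and `errors_with_statements` are hypothetical helper names, not part of PostgreSQL or pghero:

```python
import re

# Prefix as it appears in the log above: "YYYY-MM-DD HH:MM:SS UTC [pid-line] ",
# then an optional "user@db ", then a severity such as ERROR or STATEMENT.
ENTRY_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC "
    r"\[(?P<pid>\d+)-(?P<seq>\d+)\] "
    r"(?:(?P<who>\S+@\S+) )?"
    r"(?P<level>[A-Z]+):\s+(?P<msg>.*)$"
)


def parse_entries(text):
    """Split raw log text into entries, folding wrapped lines into the entry above."""
    entries = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or set(stripped) == {"="}:
            continue  # skip blank lines and "====" separators
        m = ENTRY_RE.match(line)
        if m:
            entry = m.groupdict()
            entry["pid"] = int(entry["pid"])
            entry["seq"] = int(entry["seq"])
            entries.append(entry)
        elif entries:
            # Continuation of a message that was wrapped by the mail client.
            entries[-1]["msg"] += " " + stripped
    return entries


def errors_with_statements(entries):
    """Pair each ERROR with the next STATEMENT logged by the same backend PID."""
    pairs = []
    pending = {}  # pid -> index in pairs of an ERROR still waiting for a STATEMENT
    for e in entries:
        if e["level"] == "ERROR":
            pending[e["pid"]] = len(pairs)
            pairs.append((e["pid"], e["msg"], None))
        elif e["level"] == "STATEMENT" and e["pid"] in pending:
            i = pending.pop(e["pid"])
            pairs[i] = (pairs[i][0], pairs[i][1], e["msg"])
    return pairs
```

Calling `errors_with_statements(parse_entries(log_text))` on the excerpt above should separate the errors that were logged with an offending statement from the tupdesc reference error, which has none.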


Next, I attached gdb (under sudo) to the backend PID and ran `cont`:
========================================================================
2015-07-18 15:48:38 UTC [10281-1] postgres@database ERROR:  tupdesc reference 0xf77248 is not owned by resource owner Portal
2015-07-18 15:48:54 UTC [8812-4] LOG:  server process (PID 10538) was terminated by signal 11: Segmentation fault
2015-07-18 15:48:54 UTC [8812-5] LOG:  terminating any other active server processes
2015-07-18 15:48:54 UTC [10523-1] postgres@database WARNING:  terminating connection because of crash of another server process
2015-07-18 15:48:54 UTC [10523-2] postgres@database DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2015-07-18 15:48:54 UTC [10523-3] postgres@database HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2015-07-18 15:48:54 UTC [10239-1] postgres@database WARNING:  terminating connection because of crash of another server process
2015-07-18 15:48:54 UTC [10239-2] postgres@database DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2015-07-18 15:48:54 UTC [10239-3] postgres@database HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2015-07-18 15:48:54 UTC [10522-1] postgres@database WARNING:  terminating connection because of crash of another server process
2015-07-18 15:48:54 UTC [10522-2] postgres@database DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2015-07-18 15:48:54 UTC [10522-3] postgres@database HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2015-07-18 15:48:54 UTC [10409-1] postgres@database WARNING:  terminating connection because of crash of another server process
2015-07-18 15:48:54 UTC [10409-2] postgres@database DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2015-07-18 15:48:54 UTC [10409-3] postgres@database HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2015-07-18 15:48:54 UTC [10408-1] postgres@database WARNING:  terminating connection because of crash of another server process
2015-07-18 15:48:54 UTC [10408-2] postgres@database DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2015-07-18 15:48:54 UTC [10408-3] postgres@database HINT:  In a moment you should be able to reconnect to the database and repeat your command.

Program received signal SIGQUIT, Quit.
0x00007f84fe78c110 in __poll_nocancel () at ../sysdeps/unix/syscall-template.S:81
81      in ../sysdeps/unix/syscall-template.S
(gdb) bt full
Python Exception <class 'gdb.MemoryError'> Cannot access memory at address 0x7ffc65188bd8:
#0  0x00007f84fe78c110 in __poll_nocancel () at ../sysdeps/unix/syscall-template.S:81
No locals.
Cannot access memory at address 0x7ffc65188bd8
========================================================================
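
Because the crash report and the per-backend warnings are interleaved, it can help to pull out which PID the postmaster reported as crashed and which backends were merely told to exit. A minimal sketch, assuming the message wording shown in the log above; `summarize_crash` is a hypothetical helper name:

```python
import re

# Message texts as they appear in the log above; whitespace is normalized
# first, so mail-wrapped lines and double spaces do not matter.
CRASH_RE = re.compile(
    r"server process \(PID (\d+)\) was terminated by signal (\d+)"
)
NOTIFIED_RE = re.compile(
    r"\[(\d+)-\d+\] \S+ WARNING: terminating connection "
    r"because of crash of another server process"
)


def summarize_crash(text):
    """Return the crashed PID, its signal, and the PIDs told to terminate."""
    flat = " ".join(text.split())  # rejoin wrapped lines, collapse whitespace
    crash = CRASH_RE.search(flat)
    if crash is None:
        return None
    notified = sorted({int(pid) for pid in NOTIFIED_RE.findall(flat)})
    return {
        "crashed_pid": int(crash.group(1)),
        "signal": int(crash.group(2)),
        "notified_pids": notified,
    }
```

For the log above this should report PID 10538 and signal 11, which can then be compared against the PID that gdb was attached to.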

Thanks,
Michael J. Bommarito II, CEO
Bommarito Consulting, LLC
*Web:* http://www.bommaritollc.com
*Mobile:* +1 (646) 450-3387

On Tue, Jul 14, 2015 at 8:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Michael Bommarito <michael@bommaritollc.com> writes:
> > If you can provide a patch that performs input validation in
> > get_tle_by_resno and logs the condition, I can compile and test with it.
>
> Wouldn't prove anything one way or another.  Somehow, a corrupt query tree
> is being fed to the planner; what we need to understand is what conditions
> cause that to happen.  I doubt that getting more details at the point
> where the code trips over the corruption will teach us that.
>
>                         regards, tom lane
>
