[HACKERS] SIGSEGV in BRIN autosummarize

From Justin Pryzby
Subject [HACKERS] SIGSEGV in BRIN autosummarize
Msg-id 20171014035732.GB31726@telsasoft.com
Responses Re: [HACKERS] SIGSEGV in BRIN autosummarize  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers

I upgraded one of our customers to PG10 Tuesday night, and Wednesday replaced
a BTREE index with a BRIN index (WITH autosummarize).

Today I see:
< 2017-10-13 17:22:47.839 -04  >LOG:  server process (PID 32127) was terminated by signal 11: Segmentation fault
< 2017-10-13 17:22:47.839 -04  >DETAIL:  Failed process was running: autovacuum: BRIN summarize public.gtt 747263

postmaster[32127] general protection ip:4bd467 sp:7ffd9b349990 error:0 in postgres[400000+692000]
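
Since the core dump itself was lost, the kernel line above is the only record of where the fault occurred. A minimal sketch of decoding it follows; the regular expression and the `decode_fault` helper are illustrative assumptions, not part of any PostgreSQL or kernel tooling. It extracts the faulting instruction pointer and checks that it falls inside the reported mapping of the `postgres` image.

```python
import re

# Kernel fault lines look like:
#   postmaster[32127] general protection ip:4bd467 sp:7ffd9b349990 error:0 in postgres[400000+692000]
# The pattern below is an assumption based on that one sample line.
KERNEL_FAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\].*?"
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<error>[0-9a-f]+)"
    r" in (?P<image>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_fault(line):
    """Parse a kernel fault line and locate the faulting address in the image."""
    m = KERNEL_FAULT_RE.search(line)
    if m is None:
        raise ValueError("not a recognised kernel fault line")
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    return {
        "pid": int(m.group("pid")),
        "image": m.group("image"),
        "ip": ip,
        # Offset of the faulting instruction from the start of the mapping.
        "offset": ip - base,
        # True if the instruction pointer lies within [base, base + size).
        "inside_mapping": base <= ip < base + size,
    }


line = ("postmaster[32127] general protection ip:4bd467 sp:7ffd9b349990 "
        "error:0 in postgres[400000+692000]")
fault = decode_fault(line)
print(hex(fault["ip"]), hex(fault["offset"]), fault["inside_mapping"])
```

If matching debug symbols are installed, the address could then be resolved with something like `addr2line -f -e /usr/pgsql-10/bin/postgres 0x4bd467`. This assumes the binary is not position-independent, which the 0x400000 base suggests; otherwise the offset would be the value to look up.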

[pryzbyj@database ~]$ rpm -qa postgresql10
postgresql10-10.0-1PGDG.rhel6.x86_64

Oct 13 17:22:45 database kernel: postmaster[32127] general protection ip:4bd467 sp:7ffd9b349990 error:0 in postgres[400000+692000]
Oct 13 17:22:47 database abrtd: Directory 'ccpp-2017-10-13-17:22:47-32127' creation detected
Oct 13 17:22:47 database abrt[32387]: Saved core dump of pid 32127 (/usr/pgsql-10/bin/postgres) to /var/spool/abrt/ccpp-2017-10-13-17:22:47-32127(15040512 bytes)
 

..unfortunately:
Oct 13 17:22:47 database abrtd: Package 'postgresql10-server' isn't signed with proper key
Oct 13 17:22:47 database abrtd: 'post-create' on '/var/spool/abrt/ccpp-2017-10-13-17:22:47-32127' exited with 1
Oct 13 17:22:47 database abrtd: DELETING PROBLEM DIRECTORY '/var/spool/abrt/ccpp-2017-10-13-17:22:47-32127'

postgres=# SELECT * FROM bak_postgres_log_2017_10_13_1700 WHERE pid=32127 ORDER BY log_time DESC LIMIT 9;
-[ RECORD 1 ]----------+---------------------------------------------------------------------------------------------------------
log_time               | 2017-10-13 17:22:45.56-04
pid                    | 32127
session_id             | 59e12e67.7d7f
session_line           | 2
command_tag            | 
session_start_time     | 2017-10-13 17:21:43-04
error_severity         | ERROR
sql_state_code         | 57014
message                | canceling autovacuum task
context                | processing work entry for relation "gtt.public.cdrs_eric_egsnpdprecord_2017_10_13_recordopeningtime_idx"
-[ RECORD 2 ]----------+---------------------------------------------------------------------------------------------------------
log_time               | 2017-10-13 17:22:44.557-04
pid                    | 32127
session_id             | 59e12e67.7d7f
session_line           | 1
session_start_time     | 2017-10-13 17:21:43-04
error_severity         | ERROR
sql_state_code         | 57014
message                | canceling autovacuum task
context                | automatic analyze of table "gtt.public.cdrs_huawei_sgsnpdprecord_2017_10_13"

Time: 375.552 ms

It looks like this table was being inserted into simultaneously by a Python
program using multiprocessing.  It looks like each subprocess was INSERTing
into several tables, each of which has one BRIN index on a timestamp column.

gtt=# \dt+ cdrs_eric_egsnpdprecord_2017_10_13
public | cdrs_eric_egsnpdprecord_2017_10_13 | table | gtt   | 5841 MB | 

gtt=# \di+ cdrs_eric_egsnpdprecord_2017_10_13_recordopeningtime_idx
public | cdrs_eric_egsnpdprecord_2017_10_13_recordopeningtime_idx | index | gtt   | cdrs_eric_egsnpdprecord_2017_10_13 | 136 kB |


I don't have any reason to believe there's a memory issue on the server, so I
suppose this is just a "heads up" to early adopters until/in case it happens
again and I can at least provide a stack trace.

Justin


