v12.0: segfault in reindex CONCURRENTLY

From: Justin Pryzby
Subject: v12.0: segfault in reindex CONCURRENTLY
Msg-id: 20191012004446.GT10470@telsasoft.com
Replies: Re: v12.0: segfault in reindex CONCURRENTLY
List: pgsql-hackers

One of our servers crashed last night like this:

< 2019-10-10 22:31:02.186 EDT postgres >STATEMENT:  REINDEX INDEX CONCURRENTLY child.eric_umts_rnc_utrancell_hsdsch_eul_201910_site_idx
< 2019-10-10 22:31:02.399 EDT  >LOG:  server process (PID 29857) was terminated by signal 11: Segmentation fault
< 2019-10-10 22:31:02.399 EDT  >DETAIL:  Failed process was running: REINDEX INDEX CONCURRENTLY child.eric_umts_rnc_utrancell_hsdsch_eul_201910_site_idx
< 2019-10-10 22:31:02.399 EDT  >LOG:  terminating any other active server processes

ts=# \d+ child.eric_umts_rnc_utrancell_hsdsch_eul_201910_site_idx
Index "child.eric_umts_rnc_utrancell_hsdsch_eul_201910_site_idx"
 Column  |  Type   | Key? | Definition | Storage | Stats target
---------+---------+------+------------+---------+--------------
 site_id | integer | yes  | site_id    | plain   |
btree, for table "child.eric_umts_rnc_utrancell_hsdsch_eul_201910"

That's an index on a table partition, but not itself a child of a relkind=I
index.
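
As a sketch of how that can be checked from the catalogs (this query is an illustration, not part of the original report): an index that is a child of a partitioned index has a row in pg_inherits whose inhparent is an index with relkind 'I'. For a standalone index on a partition, the join below is expected to return relkind 'i' and a NULL parent_index.

```sql
-- Show the index's relkind and its parent index, if it has one.
-- relkind 'i' = ordinary index, 'I' = partitioned index.
SELECT c.oid::regclass       AS index_name,
       c.relkind,
       i.inhparent::regclass AS parent_index
FROM pg_class c
LEFT JOIN pg_inherits i ON i.inhrelid = c.oid
WHERE c.oid = 'child.eric_umts_rnc_utrancell_hsdsch_eul_201910_site_idx'::regclass;
```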

Unfortunately, there was no core file, and I'm still trying to reproduce it.

I can't see that the table was INSERTed into during the reindex...
But it looks like it was SELECTed from, and the report finished within 1 sec of
the crash:

(2019-10-10 22:30:50,485 - p1604 t140325365622592 - INFO): PID 1604 finished running report; est=None rows=552; cols=83;[...] duration:12
 

postgres=# SELECT log_time, pid, session_id, left(message,99), detail FROM postgres_log_2019_10_10_2200 WHERE pid=29857
OR (log_time BETWEEN '2019-10-10 22:31:02.18' AND '2019-10-10 22:31:02.4' AND NOT message~'crash of another') ORDER BY
log_time LIMIT 9;
 
 2019-10-10 22:30:24.441-04 | 29857 | 5d9fe93f.74a1 | temporary file: path "base/pgsql_tmp/pgsql_tmp29857.0.sharedfileset/0.0", size 3096576 |
 2019-10-10 22:30:24.442-04 | 29857 | 5d9fe93f.74a1 | temporary file: path "base/pgsql_tmp/pgsql_tmp29857.0.sharedfileset/1.0", size 2809856 |
 2019-10-10 22:30:24.907-04 | 29857 | 5d9fe93f.74a1 | process 29857 still waiting for ShareLock on virtual transaction 30/103010 after 333.078 ms | Process holding the lock: 29671. Wait queue: 29857.
 2019-10-10 22:31:02.186-04 | 29857 | 5d9fe93f.74a1 | process 29857 acquired ShareLock on virtual transaction 30/103010 after 37611.995 ms |
 2019-10-10 22:31:02.186-04 | 29671 | 5d9fe92a.73e7 | duration: 50044.778 ms  statement: SELECT fn, sz FROM                                    +|
                            |       |               |                         (SELECT file_name fn, file_size_bytes sz,                    +|
                            |       |               |                                                                                       |
 2019-10-10 22:31:02.399-04 |  1161 | 5d9cad9e.489  | terminating any other active server processes |
 2019-10-10 22:31:02.399-04 |  1161 | 5d9cad9e.489  | server process (PID 29857) was terminated by signal 11: Segmentation fault | Failed process was running: REINDEX INDEX CONCURRENTLY child.eric_umts_rnc_utrancell_hsdsch_eul_201910_site_idx

Justin


