checkpointer and other server processes crashing

From Joe Abbate
Subject checkpointer and other server processes crashing
Msg-id e426330d-bdbb-98d8-5e74-09998258503d@freedomcircle.com
Replies Re: checkpointer and other server processes crashing  (Adrian Klaver <adrian.klaver@aklaver.com>)
Re: checkpointer and other server processes crashing  (Tim Cross <theophilusx@gmail.com>)
Re: checkpointer and other server processes crashing  ("Peter J. Holzer" <hjp-pgsql@hjp.at>)
List pgsql-general
Hello,

We've been experiencing PG server process crashes about every other week 
on a mostly read-only website (the only write is a single insert/update 
on each page access).  Typical log entries look like

LOG:  checkpointer process (PID 11200) was terminated by signal 9: Killed
LOG:  terminating any other active server processes

Other than the checkpointer, the server process that was terminated was 
either doing a "BEGIN READ WRITE", a "COMMIT" or executing a specific 
SELECT.
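
To line these terminations up with the web logs, the relevant PG log lines can be picked out mechanically. The following is only a minimal sketch in Go: the names `termRe`, `termination` and `parseTermination` are made up for illustration, and it assumes English server messages of the form shown above, with any log_line_prefix appearing before "LOG:".

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// termRe matches messages such as
//   LOG:  checkpointer process (PID 11200) was terminated by signal 9: Killed
// It is deliberately not anchored at the start of the line, so a
// log_line_prefix (timestamp, PID, ...) in front of "LOG:" is tolerated.
var termRe = regexp.MustCompile(
	`LOG:\s+(.+?) process \(PID (\d+)\) was terminated by signal (\d+): (.+)$`)

// termination holds the fields extracted from one matching log line.
// (Hypothetical type, introduced only for this example.)
type termination struct {
	Process string // e.g. "checkpointer" or "server"
	PID     int
	Signal  int
	Name    string // signal description, e.g. "Killed"
}

// parseTermination returns the parsed fields and true if the line is a
// "was terminated by signal" message, or a zero value and false otherwise.
func parseTermination(line string) (termination, bool) {
	m := termRe.FindStringSubmatch(line)
	if m == nil {
		return termination{}, false
	}
	pid, err := strconv.Atoi(m[2])
	if err != nil {
		return termination{}, false
	}
	sig, err := strconv.Atoi(m[3])
	if err != nil {
		return termination{}, false
	}
	return termination{Process: m[1], PID: pid, Signal: sig, Name: m[4]}, true
}

func main() {
	lines := []string{
		"LOG:  checkpointer process (PID 11200) was terminated by signal 9: Killed",
		"LOG:  terminating any other active server processes",
	}
	for _, l := range lines {
		if t, ok := parseTermination(l); ok {
			fmt.Printf("%+v\n", t)
		}
	}
}
```

Only the first of the two sample lines is a termination message; the second does not match and is skipped.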

The database is always recovered within a second and everything else 
appears to resume normally.  We're not certain about what triggers this, 
but in several instances the web logs show an external bot issuing 
multiple HEAD requests on what is logically a single page.  The web 
server logs show "broken pipe" and EOF errors, and the PG logs sometimes 
show a number of "incomplete startup packet" messages before the 
termination message.

This started roughly when the site was migrated to Go, whose web 
"processes" run as "goroutines", scheduled by Go's runtime (previously 
the site used Python and Gunicorn to serve the pages, which probably 
isolated the PG processes from a barrage of nearly simultaneous requests).
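
To make the contrast concrete: a fixed pool of Gunicorn workers caps how many requests are in flight at once, whereas goroutines by default have no such cap. The following is only a sketch, not the site's code, of how a similar cap could be put back in Go with a buffered channel used as a counting semaphore. The function name `runBounded` and the limits are invented for illustration; if the site uses the standard `database/sql` package (an assumption), `(*sql.DB).SetMaxOpenConns` limits the number of open connections in a comparable way.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runBounded starts n goroutines that each call work, but lets at most
// limit of them execute work at the same time. It returns the highest
// number of simultaneous executions it observed.
// (Hypothetical helper, introduced only for this example.)
func runBounded(n, limit int, work func()) int {
	sem := make(chan struct{}, limit) // counting semaphore with limit slots
	var wg sync.WaitGroup
	var running, peak int32

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()

			sem <- struct{}{}        // acquire a slot; blocks while limit are busy
			defer func() { <-sem }() // release the slot when done

			cur := atomic.AddInt32(&running, 1)
			// Raise the recorded peak if this is a new maximum.
			for {
				p := atomic.LoadInt32(&peak)
				if cur <= p || atomic.CompareAndSwapInt32(&peak, p, cur) {
					break
				}
			}

			work()
			atomic.AddInt32(&running, -1)
		}()
	}

	wg.Wait()
	return int(atomic.LoadInt32(&peak))
}

func main() {
	// 50 simulated requests, at most 4 of them "talking to the database"
	// at any moment. The sleep stands in for a query.
	peak := runBounded(50, 4, func() { time.Sleep(5 * time.Millisecond) })
	fmt.Println("peak concurrent workers:", peak)
}
```

The semaphore is acquired inside each goroutine, so a burst of requests still creates goroutines immediately, but they wait cheaply instead of all reaching the database at once.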

As I understand it, the PG server processes doing a SELECT are spawned 
as children of the Go process, so presumably if a "goroutine" dies, the 
associated PG process would die too, but I'm not sure I grasp why that 
would cause a recovery/restart.  I also don't understand where the 
checkpointer process fits in the picture (and what would cause it to die).

For the record, this is on PG 11.9 running on Debian.

TIA,

Joe


