BUG #13454: Embedded python can stop WAL streaming and hot standby mode

From chris+postgresql@qwirx.com
Subject BUG #13454: Embedded python can stop WAL streaming and hot standby mode
Msg-id 20150618165827.2737.42412@wrigleys.postgresql.org
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      13454
Logged by:          Chris Wilson
Email address:      chris+postgresql@qwirx.com
PostgreSQL version: 9.4.1
Operating system:   Linux 2.6.32-220.30.1.el6.x86_64
Description:

I don't think this is actually a bug in Postgres, but perhaps the
documentation can be improved. I thought I should at least report it
somewhere public in case anyone else has the same problem.

One of our replicating hot standbys failed to come up properly after a
restart. We got a consistent state:

LOG:  consistent recovery state reached at CEC/AD9B8660

but not followed by:

LOG:  database system is ready to accept read only connections

nor:

LOG:  started streaming WAL from primary at CEE/17000000 on timeline 1

Instead, the log gave no clue why hot standby mode didn't start, even at
debug3 level. It just flip-flopped repeatedly between the stream and archive
WAL sources instead of streaming successfully:

DEBUG:  switched WAL source from stream to archive after failure
DEBUG:  switched WAL source from archive to stream after failure

This turned out to be caused by our embedded Python interpreter (plpython2)
picking up a site-wide sitecustomize.py script from PYTHONPATH. The script
did something bad to Postgres (it installed a fault handler for SIGUSR1),
which, I guess, prevented it from initialising completely.
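
For anyone wanting to check whether their server environment picks up such a
script, here is a minimal sketch. It assumes the script is named
sitecustomize.py and lives in a directory listed in PYTHONPATH, as it did in
our case; it does not look in site-packages. The helper names
find_sitecustomize and signal_related_lines are made up for illustration, and
the substring match is only a rough heuristic for spotting signal handling.

```python
import os


def find_sitecustomize(pythonpath):
    """Return paths of sitecustomize.py files found in a PYTHONPATH string."""
    found = []
    for directory in pythonpath.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing separator
        candidate = os.path.join(directory, "sitecustomize.py")
        if os.path.isfile(candidate):
            found.append(candidate)
    return found


def signal_related_lines(path):
    """Return (line_number, text) pairs that mention signals or faulthandler.

    Heuristic only: a plain substring match, not a parse of the script.
    """
    hits = []
    with open(path) as handle:
        for number, line in enumerate(handle, 1):
            if "signal" in line or "faulthandler" in line:
                hits.append((number, line.rstrip()))
    return hits


if __name__ == "__main__":
    # Run this with the same environment the postgres server is started with.
    for path in find_sitecustomize(os.environ.get("PYTHONPATH", "")):
        print(path)
        for number, line in signal_related_lines(path):
            print("  %d: %s" % (number, line))
```

Any file it reports is worth reading by hand, since Postgres processes use
SIGUSR1 for their own internal signalling.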

The documentation implies that "consistent recovery state reached" will
always be followed by "database system is ready to accept read only
connections", but that is not always the case, and nothing in the log
explains why not.

There's also no clue what "failure" caused the "switched WAL source from
stream to archive after failure" message. strace showed that postgres didn't
even try to connect to the remote server, so it must have known internally
that something was wrong, but it didn't tell us :)

I understand that you can't defend against everything that can be done in a
Turing-complete embedded language, but it might be worth pointing a finger
at plugins if either of these expected progressions doesn't hold.
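
Until then, the stalled progression described above can be spotted from the
log itself. The following is a rough sketch that looks only for the message
texts quoted in this report; the function name check_standby_startup and the
shape of the returned dictionary are made up for illustration, and other
Postgres versions may word these messages differently.

```python
CONSISTENT = "consistent recovery state reached"
READY = "database system is ready to accept read only connections"
STREAMING = "started streaming WAL from primary"
SWITCHED = "switched WAL source from"


def check_standby_startup(lines):
    """Summarise whether a standby log shows the expected startup progression.

    Only messages that appear after "consistent recovery state reached"
    are counted, since that is where the progression stalled for us.
    """
    report = {
        "consistent": False,
        "ready": False,
        "streaming": False,
        "source_switches": 0,
    }
    for line in lines:
        if CONSISTENT in line:
            report["consistent"] = True
        elif not report["consistent"]:
            continue  # ignore everything before the consistent point
        elif READY in line:
            report["ready"] = True
        elif STREAMING in line:
            report["streaming"] = True
        elif SWITCHED in line:
            report["source_switches"] += 1
    # Stalled: consistency was reached but read-only connections never opened.
    report["stalled"] = report["consistent"] and not report["ready"]
    return report
```

Feeding it the lines of the standby's log file (for example
check_standby_startup(open(path)), with path pointing at your own log) gives
a quick yes/no on whether the server got stuck in the state we saw.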
