Re: OOM in libpq and infinite loop with getCopyStart()

From: Alvaro Herrera
Subject: Re: OOM in libpq and infinite loop with getCopyStart()
Msg-id: 20160309151234.GA963998@alvherre.pgsql
In reply to: Re: OOM in libpq and infinite loop with getCopyStart()  (Aleksander Alekseev <a.alekseev@postgrespro.ru>)
Responses: Re: OOM in libpq and infinite loop with getCopyStart()  (Michael Paquier <michael.paquier@gmail.com>)
List: pgsql-hackers

Aleksander Alekseev wrote:

> ```
> pg_receivexlog: could not send replication command "START_REPLICATION": out of memory
> pg_receivexlog: disconnected; waiting 5 seconds to try again
> pg_receivexlog: starting log streaming at 0/1000000 (timeline 1)
>
> Breakpoint 1, getCopyStart (conn=0x610180, copytype=PGRES_COPY_BOTH, msgLength=3) at fe-protocol3.c:1398
> 1398        const char *errmsg = NULL;
> ```
> 
> Granted, this behaviour is a bit better than the current one. But
> basically it's the same infinite loop, only with pauses and warnings. I
> wonder if this is a behaviour we really want. For instance, wouldn't it
> be better just to terminate the application in the out-of-memory case?
> "Let it crash", as Erlang programmers say.

Hmm.  It would be useful to retry in the case that there is a chance
that the program releases memory and can continue later.  But if it will
only stay there doing nothing other than retrying, then that obviously
will not happen.  One situation where this might help is if the overall
*system* is short on memory and we expect that situation to resolve
itself after a while -- after all, if the system is so loaded that it
can't allocate a few more bytes for the COPY message, then odds are that
other things are also crashing and eventually enough memory will be
released that pg_receivexlog can continue.

On the other hand, if the system is so loaded, perhaps it's better to
"let it crash" and have it restart later -- presumably once the admin
notices the problem and restarts it manually after cleaning up the mess.

If all programs are well behaved and nothing crashes when OOM but they
all retry instead, then everything will continue to retry infinitely and
make no progress.  That cannot be good.
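
One possible middle ground between retrying forever and crashing at once
is to retry a bounded number of times and then give up.  The following is
only a rough sketch of that idea, not code from libpq or pg_receivexlog:
retry_bounded(), attempt_fn, wait_fn and flaky_attempt() are all
hypothetical names made up for illustration, and the doubling of the wait
time is an assumption rather than anything proposed in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: true on success, false on a retryable failure (e.g. OOM). */
typedef bool (*attempt_fn) (void *arg);

/* Hypothetical hook used to pause between attempts; may be NULL to skip waiting. */
typedef void (*wait_fn) (unsigned seconds);

/*
 * Call attempt() up to max_attempts times.  Between failed attempts, wait
 * for "delay" seconds, doubling the delay each time up to max_wait.
 *
 * Returns the 1-based number of the attempt that succeeded, or 0 if every
 * attempt failed and the caller should give up.
 */
static int
retry_bounded(attempt_fn attempt, void *arg, int max_attempts,
              unsigned first_wait, unsigned max_wait, wait_fn wait)
{
    unsigned    delay = first_wait;

    for (int i = 1; i <= max_attempts; i++)
    {
        if (attempt(arg))
            return i;

        /* No point in waiting after the final failed attempt. */
        if (i == max_attempts)
            break;

        if (wait != NULL)
            wait(delay);

        delay *= 2;
        if (delay > max_wait)
            delay = max_wait;
    }

    return 0;
}

/*
 * Stand-in for the real work: fails while *remaining_failures > 0, which
 * imitates memory pressure that eventually goes away.
 */
static bool
flaky_attempt(void *arg)
{
    int        *remaining_failures = arg;

    if (*remaining_failures > 0)
    {
        (*remaining_failures)--;
        return false;
    }
    return true;
}
```

A caller would pass a thin wrapper around sleep() as the wait hook and
exit with a nonzero status when retry_bounded() returns 0, so that a
supervisor or the admin can decide when to restart the program.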

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


