Re: [GENERAL] postmaster deadlock while logging after syslogger exited

Поиск

Список

Период

Сортировка

От	David Pacheco
Тема	Re: [GENERAL] postmaster deadlock while logging after syslogger exited
Дата	21 ноября 2017 г. 00:13:57
Msg-id	CACukRjPp+JicsFFKD-P6ORQhjs_sEw_OFOXXkefWbk0ToNywAg@mail.gmail.com обсуждение исходный текст
Ответ на	Re: [GENERAL] postmaster deadlock while logging after syslogger exited (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы	Re: [GENERAL] postmaster deadlock while logging after sysloggerexited
Список	pgsql-general

Дерево обсуждения

Responding to several points below.

Tom Lane wrote:

> David's report isn't too clear: did the syslogger process actually run

> out of memory and exit of its own volition after an ENOMEM, or did it get

> killed by the dreaded OOM killer? In either case, it's unclear whether

> it was really using an excessive amount of memory. We have not heard

> reports suggesting a memory leak in the syslogger, but maybe there is

> one under unusual circumstances?

I'm running on illumos, where there is no OOM killer.

I expect what happened is that the syslogger process attempted to allocate

memory, failed because the system was low, and explicitly exited. That's

consistent with an exited process, no core file generated, and the "FATAL" "out

of memory" entries in the log I posted.

Of course, that doesn't mean the syslogger was using much memory. It means only

that it attempted to allocate any memory in a context where it could not handle

failing.

Andres Freund wrote:

> I'm somewhat inclined to say that one has to live with this if the

> system is so resource constrainted that processes barely using memory

> get killed.

Again, there's no OOM killer here, so the process has the opportunity to respond

differently. I can understand the database not functioning well when the system

is out of memory, but this half-functioning, half-deadlocked state is quite

disruptive. By comparison, while not ideal, if the syslogger had raised SIGABRT

instead, that would have been significantly better for my use-case. Our process

manager would have terminated all of the processes associated with the database

and restarted it.

Tom Lane wrote:

> Hm, so that's another angle David didn't report on: is it possible that

> his workload could have resulted in a very large volume of incomplete

> in-progress log messages?

Yes. I mentioned in my Nov 6 mail that large log messages over a short period

appear to have been a major contributing factor:

> Here are all of the entries in the PostgreSQL log from 23:19:12 until the top of

> the next hour:

> https://gist.githubusercontent.com/davepacheco/c5541bb464532075f2da761dd990a457/raw/2ba242055aca2fb374e9118045a830d08c590e0a/gistfile1.txt

...

> There are also many log entries for some very long SQL queries. I'm sure that

> contributed to this problem by filling up the pipe. I was able to extract the

> contents of the pipe while the system was hung, and it was more of these giant

> query strings.

Andres Freund wrote:

> Note that there's plenty of cases where you could run into this even

> without being unable to fork new processes. You'd e.g. could also run

> into this while logging the exit of some other subprocess or such,

> there's enough ereports in postmaster.

This is a critical point. As far as I can tell, all that's necessary for this

deadlock to occur is:

- the syslogger is unable to make forward progress (e.g., because it segfaulted)

- enough other processes write to the pipe that it fills up before the

postmaster can restart the syslogger

- the postmaster attempts to write to the log as part of any of its signal

handling operations

It seems to me that the same thing can happen if there were a bug in the

syslogger that resulted in a segfault around the same time that the postmaster

attempted to start an autovacuum worker, for example (if there was also

reasonably high log traffic).

Thanks,

Dave

В списке pgsql-general по дате отправления:

Предыдущее

От: Noel
Дата: 21 ноября 2017 г., 00:07:15
Сообщение: unsubscribe

Следующее

От: "Yang, Xiaoguang"
Дата: 21 ноября 2017 г., 00:16:34
Сообщение: unsubscribe

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: [GENERAL] postmaster deadlock while logging after syslogger exited

Предыдущее

Следующее