Re: [HACKERS] jacana hung after failing to acquire random number

From Tom Lane
Subject Re: [HACKERS] jacana hung after failing to acquire random number
Date
Msg-id 26907.1481556920@sss.pgh.pa.us
In reply to Re: [HACKERS] jacana hung after failing to acquire random number  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers
Heikki Linnakangas <hlinnaka@iki.fi> writes:
> On 12/12/2016 03:40 PM, Andrew Dunstan wrote:
>> Should one or more of these errors be fatal? Or should we at least get
>> pg_regress to try to shut down the postmaster if it can't connect after
>> 120 seconds?

> Making it fatal, i.e. bringing down the server, doesn't seem like an 
> improvement. If the failure is transient, you don't want to kill the 
> whole server, when one connection attempt fails.

> It would be nice to fail earlier if it's permanently failing, though. 
> Like, if someone does "rm /dev/urandom". Perhaps we should perform one 
> pg_strong_random() call at postmaster startup, and if that fails, refuse 
> to start up.

That's sort of contradictory.  If you're worried about transient failures,
allowing a single failed try to cause postmaster startup failure isn't the
way to make things more robust.  Giving up after a bunch of failed tries
over a very short interval isn't much better.

I'm not sure how hard we need to work here.  The case at hand seems
to be one of simply not having gotten the bugs out of the initial
implementation, so maybe we shouldn't read too much into it.

I do agree that the buildfarm needs to be more robust against broken
postmasters, because finding bugs is its raison d'être.  But I'm not
convinced that it's a good idea to have the postmaster itself conclude
that there's something wrong with its configured random-number source.
        regards, tom lane
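
For illustration only, here is a minimal sketch of the harness-side behaviour raised upthread: give up after a deadline and stop the server instead of leaving it hung. It is not pg_regress or buildfarm code. The function name wait_or_stop, the is_ready callback, and the 5-second grace period are hypothetical; the 120-second default simply echoes the figure quoted above.

```python
import subprocess
import sys
import time


def wait_or_stop(proc, is_ready, timeout_s=120.0, poll_s=1.0):
    """Poll is_ready() until it succeeds or timeout_s elapses.

    On timeout, stop the child process so it is not left running.
    Returns True if the server became ready, False otherwise.
    (Hypothetical helper; not part of any PostgreSQL tool.)
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if proc.poll() is not None:
            return False  # child already exited on its own
        if is_ready():
            return True
        time.sleep(poll_s)
    proc.terminate()  # polite stop request first
    try:
        proc.wait(timeout=5)
    except subprocess.TimeoutExpired:
        proc.kill()  # escalate if the request is ignored
        proc.wait()
    return False


if __name__ == "__main__":
    # Stand-in for a server that never accepts connections.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    ok = wait_or_stop(child, lambda: False, timeout_s=0.5, poll_s=0.1)
    print("ready:", ok, "child stopped:", child.poll() is not None)
```

Keeping the give-up decision in the test harness leaves the server's own policy toward its random-number source unchanged, which is the separation argued for above.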


