Re: Tracing down buildfarm "postmaster does not shut down" failures

From	Andrew Dunstan
Subject	Re: Tracing down buildfarm "postmaster does not shut down" failures
Date	February 10, 2016, 07:23:57
Msg-id	56BABAD4.7040006@dunslane.net
In reply to	Re: Tracing down buildfarm "postmaster does not shut down" failures (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Tracing down buildfarm "postmaster does not shut down" failures (Andrew Dunstan <andrew@dunslane.net>)
List	pgsql-hackers


On 02/09/2016 10:27 PM, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
>> On Tue, Feb 09, 2016 at 10:02:17PM -0500, Tom Lane wrote:
>>> I wonder if it's worth sticking some instrumentation into stats
>>> collector shutdown?
>> I wouldn't be surprised if the collector got backlogged during the main phase
>> of testing and took awhile to chew through its message queue before even
>> starting the write of the final stats.
> But why would the ecpg tests show such an effect when the main regression
> tests don't?  AFAIK the ecpg tests don't exactly stress the server ---
> note the trivial amount of data written by the shutdown checkpoint,
> for instance.


The main regression tests run with the stats file on the ramdisk.


>
> The other weird thing is that it's only sometimes slow.  If you look at
> the last buildfarm result from axolotl, for instance, the tail end of
> the ecpg log is
>
> LOG:  ShutdownSUBTRANS() complete at 2016-02-09 16:31:14.784 EST
> LOG:  database system is shut down at 2016-02-09 16:31:14.784 EST
> LOG:  lock files all released at 2016-02-09 16:31:14.817 EST
>
> so we only spent ~50ms on stats write that time.


That part is puzzling.
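
For comparing fast and slow runs, here is a minimal sketch that turns a log tail like the one quoted above into per-step gaps. It assumes every line has the form "LOG:  <message> at <timestamp> <zone>" and that all lines share one time zone; the function names are made up for illustration and are not part of the buildfarm client.

```python
from datetime import datetime

# Log tail copied from the axolotl run quoted above.
LOG_TAIL = """\
LOG:  ShutdownSUBTRANS() complete at 2016-02-09 16:31:14.784 EST
LOG:  database system is shut down at 2016-02-09 16:31:14.784 EST
LOG:  lock files all released at 2016-02-09 16:31:14.817 EST
"""


def parse_events(text):
    """Return (message, timestamp) pairs from 'LOG:  <msg> at <stamp> <tz>' lines."""
    events = []
    for line in text.splitlines():
        if not line.startswith("LOG:") or " at " not in line:
            continue
        body = line[len("LOG:"):].strip()
        msg, _, stamp = body.rpartition(" at ")
        # Drop the trailing zone abbreviation (assumed identical on every line).
        stamp = stamp.rsplit(" ", 1)[0]
        events.append((msg, datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")))
    return events


def gaps_ms(events):
    """Milliseconds elapsed between each consecutive pair of events."""
    return [
        (later[0], round((later[1] - earlier[1]).total_seconds() * 1000))
        for earlier, later in zip(events, events[1:])
    ]


if __name__ == "__main__":
    for msg, ms in gaps_ms(parse_events(LOG_TAIL)):
        print(f"{ms:5d} ms before: {msg}")
```

Run over the tail of each ecpg log, this would make it easy to see which shutdown step the time goes into on the slow runs.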

> The idea I was toying with is that previous filesystem activity (making
> the temp install, the server's never-fsync'd writes, etc) has built up a
> bunch of dirty kernel buffers, and at some point the kernel goes nuts
> writing all that data.  So the issues we're seeing would come and go
> depending on the timing of that I/O spike.  I'm not sure how to prove
> such a theory from here.


Yeah. It's faintly possible that a kernel upgrade will help.
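
One way to gather evidence for the dirty-buffer theory might be to sample the kernel's own counters during a run. The following is only a sketch, assuming a Linux host where /proc/meminfo exposes "Dirty" and "Writeback" lines in kB; the helper names and the sampling interval are hypothetical.

```python
import re
import time


def parse_dirty_kb(meminfo_text):
    """Extract the Dirty and Writeback counters (in kB) from /proc/meminfo text."""
    counters = {}
    for line in meminfo_text.splitlines():
        # The trailing colon keeps "WritebackTmp:" from matching.
        m = re.match(r"^(Dirty|Writeback):\s+(\d+)\s+kB", line)
        if m:
            counters[m.group(1)] = int(m.group(2))
    return counters


def read_dirty_kb(path="/proc/meminfo"):
    """Read the current counters; path is parameterized only for testing."""
    with open(path) as f:
        return parse_dirty_kb(f.read())


def sample(count, interval_s=1.0):
    """Print timestamped Dirty/Writeback values, one line per sample."""
    for _ in range(count):
        c = read_dirty_kb()
        print(f"{time.strftime('%H:%M:%S')} "
              f"Dirty={c.get('Dirty')} kB Writeback={c.get('Writeback')} kB")
        time.sleep(interval_s)
```

Calling sample() alongside the ecpg tests and lining the output up with the shutdown log timestamps would show whether a slow shutdown coincides with a large Dirty value draining into Writeback.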

cheers

andrew
