Re: Tracing down buildfarm "postmaster does not shut down" failures

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Tracing down buildfarm "postmaster does not shut down" failures
Дата
Msg-id 24722.1455058439@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Tracing down buildfarm "postmaster does not shut down" failures  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Tracing down buildfarm "postmaster does not shut down" failures  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Tracing down buildfarm "postmaster does not shut down" failures  (Andrew Dunstan <andrew@dunslane.net>)
Список pgsql-hackers
I wrote:
> I'm not sure whether there's anything to be gained by leaving the tracing
> code in there till we see actual buildfarm fails.  There might be another
> slowdown mechanism somewhere, but I rather doubt it.  Thoughts?

Hmmm ... I take that back.  AFAICT, the failures on Noah's AIX zoo are
sufficiently explained by the "mdpostckpt takes a long time after the
regression tests" theory.  However, there is something else happening
on axolotl.  Looking at the HEAD and 9.5 branches, there are three very
similar failures in the ECPG step within the past 60 days:

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2016-02-08%2014%3A49%3A23
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2015-12-15%2018%3A49%3A31
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2015-12-12%2001%3A44%3A39

In all three, we got "pg_ctl: server does not shut down", but the
postmaster log claims that it shut down, and pretty speedily too.
For example, in the 2015-12-12 failure,

LOG:  received fast shutdown request
LOG:  aborting any active transactions
LOG:  autovacuum launcher shutting down
LOG:  shutting down
LOG:  checkpoint starting: shutdown immediate
LOG:  checkpoint complete: wrote 176 buffers (1.1%); 0 transaction log file(s) added, 0 removed, 0 recycled;
write=0.039s, sync=0.000 s, total=0.059 s; sync files=0, longest=0.000 s, average=0.000 s; distance=978 kB,
estimate=978kB
 
LOG:  database system is shut down

We have no theory that would account for postmaster shutdown stalling
after the end of ShutdownXLOG, but that seems to be what happened.
How come?  Why does only the ECPG test seem to be affected?

It's also pretty fishy that we have three failures in 60 days on HEAD+9.5
but none before that, and none in the older branches.  That smells like
a recently-introduced bug, though I have no idea what.

Andrew, I wonder if I could prevail on you to make axolotl run "make
check" on HEAD in src/interfaces/ecpg/ until it fails, so that we can
see if the logging I added tells anything useful about this.
        regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jim Nasby
Дата:
Сообщение: Re: proposal: schema PL session variables
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Tracing down buildfarm "postmaster does not shut down" failures