Re: FW: Intermittent Stats Failiures: firefly: HEAD

From Tom Lane
Subject Re: FW: Intermittent Stats Failiures: firefly: HEAD
Date
Msg-id 27826.1137011321@sss.pgh.pa.us
In reply to FW: Intermittent Stats Failiures: firefly: HEAD  ("Larry Rosenman" <lrosenman@pervasive.com>)
List pgsql-hackers
"Larry Rosenman" <lrosenman@pervasive.com> writes:
>> Ever since the stats collector changes, I've seen intermittent
>> failures on 'firefly' in the buildfarm.

Yeah, you're not the only one.  We haven't figured out what's causing
them.  But while fooling with Joachim Wieland's pg_sleep patch just
now, I was struck by an idea: on machines where select() is
interruptible by signals, it is possible that the do_sleep() function
won't wait as long as specified.  This could easily cause the observed
regression diff, if the test doesn't wait long enough for the stats
collector to update the stats.

It's not immediately obvious what signal might be arriving at the
backend, given that there aren't supposed to be any other database
operations going on.  It's barely possible that a SIGUSR1 (sinval
catchup interrupt) could be generated here, if one of the previous
group of tests were still in the process of shutting down its backend.
So I'm not sure about this theory ... but at least it's a theory.

If the theory is correct then the just-committed pg_sleep patch
should provide a permanent solution.  We'll have to wait and see
if we see any more of those errors.
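
To illustrate the general idea of a sleep that survives signal
interruptions, here is a minimal sketch. It is not the committed
pg_sleep code; the names now_usec() and sleep_at_least() are
hypothetical, and it assumes a POSIX system where select() fails with
EINTR when a signal handler runs. The loop recomputes the remaining
time against a fixed deadline, so an early return cannot shorten the
total wait.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: current wall-clock time in microseconds. */
long long now_usec(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return (long long) tv.tv_sec * 1000000LL + tv.tv_usec;
}

/*
 * Hypothetical helper: sleep for at least `usec` microseconds, even if
 * signals interrupt select().  Returns the number of interruptions seen,
 * or -1 if select() fails for a reason other than EINTR.
 */
int sleep_at_least(long long usec)
{
    long long deadline;
    int interruptions = 0;

    assert(usec >= 0);
    deadline = now_usec() + usec;

    for (;;)
    {
        long long remaining = deadline - now_usec();
        struct timeval tv;

        if (remaining <= 0)
            return interruptions;   /* the full delay has elapsed */

        tv.tv_sec = remaining / 1000000LL;
        tv.tv_usec = remaining % 1000000LL;

        /* With no file descriptors, select() acts as a plain timer. */
        if (select(0, NULL, NULL, NULL, &tv) < 0)
        {
            if (errno != EINTR)
                return -1;          /* a real error, give up */
            interruptions++;        /* woken early: loop and wait again */
        }
    }
}
```

A single select() call with the same timeout would return as soon as
the first signal arrived, which is the short wait suspected above;
calling sleep_at_least(2000000) instead keeps waiting until two full
seconds have passed.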

If we don't see any more such errors in HEAD for a while, it might
be worth back-patching the implementation of pg_sleep into the
older branches' regression tests, so we don't keep seeing intermittent
regression failures in them either.
        regards, tom lane

