Re: last_statrequest is in the future

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: last_statrequest is in the future
Дата
Msg-id 19946.1269535540@sss.pgh.pa.us
обсуждение исходный текст
Ответ на last_statrequest is in the future  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
I wrote:
> ... lookee what we have here:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguar&dt=2010-03-24%2004:00:07

Modifying the code to print the actual timestamps resulted in a wealth
of information that can be dredged from the postmaster log here:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguar&dt=2010-03-25%2004:00:04
Just grep for log messages like
[4baafb13.cd2:1] LOG:  last_statrequest 2010-03-25 06:58:42.118734+01 is later than collector's time 2010-03-25
06:58:40.817027+01

What this shows is that system timekeeping on jaguar is just utterly
broken.  gettimeofday() sometimes returns answers that are 5 seconds or
more skewed in different processes, and there are episodes where its
value fails to move at all in the stats collector while multiple seconds
elapse in the opinion of another process.

I wonder whether this doesn't also explain jaguar's propensity to show
some other unexplainable failures, like these recent examples:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguar&dt=2010-03-20%2004:00:03
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguar&dt=2010-03-15%2004:00:04
which are not at all unique in that machine's buildfarm history.

I had always supposed that these were caused by the extreme overall
slowness induced by CLOBBER_CACHE_ALWAYS, but now I think they are
more likely explained by system timekeeping issues.  In particular the
last-cited regression failure can be explained if now() fails to change
value across pg_sleep(0.1), and that's isomorphic to what seems to be
happening in the stats collector in some of the episodes captured today.

In short: either this machine has got broken clock hardware, or there's
a very serious bug in the kernel code for reading the clock.
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: ALTER ROLE/DATABASE RESET ALL versus security
Следующее
От: Josh Berkus
Дата:
Сообщение: Conditional trigger missing from pg_dump