Re: Apparent race condition in standby promotion

Поиск
Список
Период
Сортировка
От Noah Misch
Тема Re: Apparent race condition in standby promotion
Дата
Msg-id 20160504061042.GC2409136@tornado.leadboat.com
обсуждение исходный текст
Ответ на Apparent race condition in standby promotion  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Mon, Apr 25, 2016 at 02:09:27PM -0400, Tom Lane wrote:
> I'm looking at buildfarm member tern's recent failure:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2016-04-25%2001%3A08%3A08

> # Running: pg_ctl -D /home/nm/farm/gcc32/HEAD/pgsql.build/src/bin/pg_rewind/tmp_check/data_standby_fTzy/pgdata -l
/home/nm/farm/gcc32/HEAD/pgsql.build/src/bin/pg_rewind/tmp_check/log/001_basic_standby.logpromote 
> server promoting
> #
> Timed out while waiting for promotion of standby at RewindTest.pm line 166.
>
> But look at what's in the standby's log:
>
> LOG:  database system was interrupted; last known up at 2016-04-25 01:58:39 UTC
> LOG:  entering standby mode
> LOG:  redo starts at 0/2000028
> LOG:  consistent recovery state reached at 0/2000C00
> LOG:  database system is ready to accept read only connections
> LOG:  started streaming WAL from primary at 0/3000000 on timeline 1
> LOG:  statement: SELECT NOT pg_is_in_recovery()
> ... 88 occurrences of that removed ...
> LOG:  statement: SELECT NOT pg_is_in_recovery()
> LOG:  received promote request
> FATAL:  terminating walreceiver process due to administrator command
>
> The standby server didn't notice the promote request until AFTER
> the perl script had probed 90 times, once a second, for promotion
> to finish.  What's up with that?

This resembles symptoms I studied last September.

> The other theory is that the startup process received the SIGUSR2
> but failed to act on it for a long time.  It checks for that only
> in CheckForStandbyTrigger(), and the calls to that function are
> in assorted rather unprincipled places, which makes it hard to
> convince oneself that such a call will always happen soon.

Right.  In every case, I caught the startup process taking too long in a
single XLOG_DBASE_CREATE replay.  I don't suspect a bug; that is a costly
record to replay, especially given the machine's slow filesystem metadata
operations.  I could reproduce such symptoms reliably, on any system, with the
attached demonstration patch.  While we could hack things until the message
prints earlier, that won't make promotion finish earlier.

> We've seen previous examples of the same thing, eg
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2016-02-09%2019%3A16%3A09
> Noah raised the test's timeout from 30 seconds to 90 in response
> to that failure, but it looks to me like this was a mis-diagnosis
> of the problem.

I can understand you guessing the 2016-02-09 failure inspired that change, but
it was either a coincidence or merely reminded me to commit an already-written
change.  I had performed burn-in runs, with timeouts raised to 900s, totalling
25177 executions of src/bin/pg_rewind/t/002_databases.pl.  The longest promote
duration among those runs was 43s, so I doubled that quantity and rounded up.

For the 2016-04-25 failure, note this bit of 001_basic_master.log:

LOG:  statement: CHECKPOINT
LOG:  statement: SELECT pg_current_xlog_location() = write_location FROM pg_stat_replication WHERE application_name =
'rewind_standby';
... 39 duplicate lines removed ...
LOG:  statement: SELECT pg_current_xlog_location() = write_location FROM pg_stat_replication WHERE application_name =
'rewind_standby';
LOG:  received immediate shutdown request

That's 41s just to copy WAL.  The slowest of those 002_databases.pl runs used
7s, and the worst I've seen in 2579 runs of 001_basic.pl is 22s.  The failed
run likely coincided with exceptional system load.

If this happens again anytime soon, I should triple (or so) that 90s timeout.

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Noah Misch
Дата:
Сообщение: Re: text search changes vs. binary upgrade
Следующее
От: "David G. Johnston"
Дата:
Сообщение: Re: Make PG's "NOT NULL"s and attnotnull ("is_nullable") conform to SQL-2011