Re: [HACKERS] Race-like failure in recovery/t/009_twophase.pl

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [HACKERS] Race-like failure in recovery/t/009_twophase.pl
Дата
Msg-id 2588.1499028049@sss.pgh.pa.us
обсуждение исходный текст
Ответ на [HACKERS] Race-like failure in recovery/t/009_twophase.pl  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: [HACKERS] Race-like failure in recovery/t/009_twophase.pl  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
I wrote:
> The reporting critters are all on the slow side, so I suspected
> a timing problem, especially since it only started to show up
> after changing pg_ctl's timing behavior.  I can't reproduce it
> locally on unmodified sources, but I could after putting my thumb
> on the scales like this:
>
> diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
> index ea9d21a..7ef22c1 100644
> *** a/src/backend/replication/walreceiver.c
> --- b/src/backend/replication/walreceiver.c
> *************** WalReceiverMain(void)
> *** 446,451 ****
> --- 446,454 ----
>                                * Something was received from master, so reset
>                                * timeout
>                                */
> +
> +                             pg_usleep(100000); /* be slow about processing */
> +
>                               last_recv_timestamp = GetCurrentTimestamp();
>                               ping_sent = false;
>                               XLogWalRcvProcessMsg(buf[0], &buf[1], len - 1);


BTW, I neglected to mention that this hack also produces a second failure
in the recovery test suite:

[20:28:31] t/001_stream_rep.pl .................. 11/28
#   Failed test 'xmin of cascaded slot null with no hs_feedback'
#   at t/001_stream_rep.pl line 178.
#          got: undef
#     expected: ''

#   Failed test 'catalog xmin of cascaded slot null with no hs_feedback'
#   at t/001_stream_rep.pl line 179.
#          got: undef
#     expected: ''
[20:28:31] t/001_stream_rep.pl .................. 28/28 # Looks like you failed 2 tests of 28.
[20:28:31] t/001_stream_rep.pl .................. Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/28 subtests

I had supposed last night that this was merely a matter of needing
to insert a wait_slot_xmins() call before those tests, but doing so
doesn't help.  And, looking closer, the report is complaining that
$xmin and $catalog_xmin are coming back as undef, not empty strings.
My Perl-fu is too weak to figure out how that could possibly be ---
the documentation for PostgresNode::query_hash certainly states
flatly that it will never happen.  And it even more certainly
shouldn't happen if we already successfully probed the
pg_replication_slots row in wait_slot_xmins().

The corresponding entries in the postmaster's log don't look abnormal:

2017-07-02 20:07:29.313 UTC [30862] LOG:  database system is ready to accept read only connections
2017-07-02 20:07:29.318 UTC [30867] LOG:  started streaming WAL from primary at 0/3000000 on timeline 1
2017-07-02 20:07:29.415 UTC [30870] t/001_stream_rep.pl LOG:  statement: SELECT
pg_create_physical_replication_slot('standby_2');
2017-07-02 20:07:29.619 UTC [30863] LOG:  redo starts at 0/3029458
2017-07-02 20:07:29.665 UTC [30882] t/001_stream_rep.pl LOG:  statement:         SELECT xmin IS NULL AND catalog_xmin
ISNULL        FROM pg_catalog.pg_replication_slots        WHERE slot_name = 'standby_2';     
2017-07-02 20:07:29.679 UTC [30884] t/001_stream_rep.pl LOG:  statement: SELECT plugin, slot_type, datoid, database,
active,active_pid, xmin, catalog_xmin, restart_lsn FROM pg_catalog.pg_replication_slots WHERE slot_name = 'standby_2' 

(This is from a run with a wait_slot_xmins() call added, else the
statement logged by PID 30882 would not have been issued.)

Any ideas what's wrong there?
        regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: [BUGS] [HACKERS] Segmentation fault in libpq
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [HACKERS] Race-like failure in recovery/t/009_twophase.pl