On Sun, Aug 27, 2017 at 12:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> And *another* replication test race condition just now:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dangomushi&dt=2017-08-26%2019%3A37%3A08
>
> As best I can interpret this, it's pointing out that this bit in
> src/test/recovery/t/009_twophase.pl:
>
> $cur_master->psql(
> 'postgres', "
> BEGIN;
> CREATE TABLE t_009_tbl2 (id int, msg text);
> SAVEPOINT s1;
> INSERT INTO t_009_tbl2 VALUES (27, 'issued to ${cur_master_name}');
> PREPARE TRANSACTION 'xact_009_13';
> -- checkpoint will issue XLOG_STANDBY_LOCK that can conflict with lock
> -- held by 'create table' statement
> CHECKPOINT;
> COMMIT PREPARED 'xact_009_13';");
>
> $cur_standby->psql(
> 'postgres',
> "SELECT count(*) FROM t_009_tbl2",
> stdout => \$psql_out);
> is($psql_out, '1', "Replay prepared transaction with DDL");
>
> contains exactly no means of ensuring that the master's transaction has
> been replayed on the standby before we check for its results. It's not
> real clear why it seems to work 99.99% of the time, because, well, there
> isn't any frickin' interlock there ...
I noticed this one this morning, and I am planning to address it
with a proper patch soon. (I am still fighting a bit to get
dangomushi into a more stable state, and things run slowly on it, so it
is good at catching race conditions of this kind.)
--
Michael