On Mon, Oct 18, 2021 at 08:02:12PM -0700, Noah Misch wrote:
> On Mon, Oct 18, 2021 at 06:23:05PM +0500, Andrey Borodin wrote:
> > > On Oct 17, 2021, at 20:12, Noah Misch <noah@leadboat.com> wrote:
> > > I think the attached version is ready for commit. Notable differences
> > > vs. v14:

Pushed. Buildfarm member conchuela (DragonFly BSD 6.0) has gotten multiple
"IPC::Run: timeout on timer" in the new tests. No other animal has.
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=conchuela&dt=2021-10-24%2003%3A05%3A09
is an example run. The pgbench queries finished quickly, but the
$pgbench_h->finish() apparently timed out after 180s. I guess this would be
consistent with pgbench blocking in write(), waiting for something to empty a
pipe buffer so it can write more. I thought finish() would drain any incoming
I/O, though. This phenomenon has been appearing regularly via
src/test/recovery/t/017_shm.pl[1], so this thread doesn't have a duty to
resolve it. A stack trace of the stuck pgbench should be informative, though.
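
The write()-blocking hypothesis above can be illustrated outside the TAP
framework. The following is a minimal Python sketch, not the IPC::Run code
path and not a diagnosis of what happened on conchuela. The child command, the
1 MiB payload, and both function names are hypothetical stand-ins chosen for
illustration. It shows only the general mechanism: a child that writes more
than a pipe buffer holds cannot exit until the parent reads.

```python
import subprocess
import sys

# Hypothetical stand-in for pgbench: writes 1 MiB to stdout, which is assumed
# to exceed the OS pipe buffer, then exits.
CHILD = "import sys; sys.stdout.write('x' * (1 << 20)); sys.stdout.flush()"


def child_finishes_without_reader(timeout=1.0):
    """Return True if the child exits within `timeout` while nobody reads its stdout."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE)
    try:
        proc.wait(timeout=timeout)  # waits for exit but never reads the pipe
        return True
    except subprocess.TimeoutExpired:
        return False  # child is still alive, blocked in write()
    finally:
        proc.stdout.read()  # drain the pipe so the child can finish
        proc.wait()
        proc.stdout.close()


def child_finishes_with_reader(timeout=10.0):
    """Return True if the child exits cleanly when the parent drains the pipe."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE)
    out, _ = proc.communicate(timeout=timeout)  # reads until EOF, then reaps
    return proc.returncode == 0 and len(out) == 1 << 20
```

If the stuck pgbench shows a stack ending in write(), that would match the
first function's behavior. It would still leave open why nothing was draining
the pipe.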

Compared to my last post, the push included two more test changes. First, I
removed sleeps from a test. They could add significant time on a system with
coarse sleep granularity. Removing them did not change test sensitivity on my
system.
Second, I changed background_pgbench to include stderr lines in $stdout, as its
documentation already stated. This becomes important during the back-patch to
v11, where server errors don't cause a nonzero pgbench exit status.
background_psql still has the same bug, and I can fix it later. (The
background_psql version of the bug does not affect current usage.)
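
To illustrate why merging stderr into the captured output matters when the
exit status stays zero, here is a minimal Python sketch. It is not the
PostgresNode.pm code. The child script, its output lines, and the function
names are all made up for the example. The child prints an error to stderr but
exits with status 0, so the caller can detect the error only if stderr lines
end up in the captured output.

```python
import subprocess
import sys

# Hypothetical child: prints a result line, reports a simulated error on
# stderr, and still exits with status 0.
CHILD = (
    "import sys\n"
    "print('simulated result line')\n"
    "sys.stdout.flush()\n"
    "print('ERROR: simulated server error', file=sys.stderr)\n"
)


def run_merged():
    """Capture stdout with stderr redirected into the same stream."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # stderr lines land in proc.stdout
        text=True,
    )
    return proc.returncode, proc.stdout


def run_stdout_only():
    """Capture stdout only; stderr is discarded."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    return proc.returncode, proc.stdout
```

Both calls return exit status 0. Only run_merged() gives the caller the error
text to match against.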

FYI, the non-2PC test is less sensitive in older branches. It reproduces
master's bug in 25-50% of runs, but it took about six minutes on v11 and v12.

> > > One thing not done here is to change the tests to use CREATE INDEX
> > > CONCURRENTLY instead of REINDEX CONCURRENTLY, so they're back-patchable to v11
> > > and earlier. I may do that before pushing, or I may just omit the tests from
> > > older branches.
> >
> > The tests refactor PostgresNode.pm and some tests. Back-patching this would be quite invasive.
>
> That's fine with me. Back-patching a fix without its tests is riskier than
> back-patching test infrastructure changes.

Back-patching the tests did end up tricky, for other reasons. Before v12
(d3c09b9), a TAP suite in a pgxs module wouldn't run during check-world.
Before v11 (7f563c0), amcheck lacks the heapallindexed feature that the tests
rely on. Hence, for v11, v10, and v9.6, I used a plpgsql implementation of
the heapallindexed check, and I moved the tests to src/bin/pgbench.

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=conchuela&dt=2021-10-19%2012%3A58%3A08