Timeout control within tests

Поиск
Список
Период
Сортировка
От Noah Misch
Тема Timeout control within tests
Дата
Msg-id 20220218052842.GA3627003@rfd.leadboat.com
обсуждение исходный текст
Ответы Re: Timeout control within tests  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
On Fri, Feb 05, 2021 at 03:55:20PM -0500, Tom Lane wrote:
> We have, almost invariably, regretted it when we tried to use short
> timeouts in test cases.

> More generally, sometimes people want to do things like run a test
> under valgrind.  So it's not just "underpowered machines" that may
> need a generous timeout.  Even if we did reduce the default, I'd
> want a way (probably via an environment variable, cf PGCTLTIMEOUT)
> to kick it back up.

I have a few use cases for that:

1. My buildfarm members have more and more competition for CPU and I/O.
   Examples where I suspect animal slowness caused a 180s timeout:
   https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=topminnow&dt=2021-04-11%2003%3A11%3A39
   https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2021-07-26%2004%3A38%3A29

2. When I'm developing a change and I locally break a test in a way that leads
   to a timeout, I like to be able to lower that timeout.

3. I want more tests to use the right timeout from the start.  Low-timeout
   tests appear at least a few times per year:
   d03eeab Mon May 31 00:29:58 2021 -0700 Raise a timeout to 180s, in test 010_logical_decoding_timelines.pl.
   388b959 Sat Feb 27 07:02:56 2021 -0800 Raise a timeout to 180s, in contrib/test_decoding.
   08dde1b Tue Dec 22 11:10:12 2020 -0500 Increase timeout in 021_row_visibility.pl.
   8961355 Sat Apr 25 18:45:27 2020 -0700 Raise a timeout to 180s, in test 003_recovery_targets.pl.
   1db439a Mon Dec 10 20:15:42 2018 -0800 Raise some timeouts to 180s, in test code.

I propose to have environment variable PG_TEST_TIMEOUT_DEFAULT control the
timeout used in the places that currently hard-code 180s.  TAP tests should
retrieve the value via $PostgreSQL::Test::Utils::timeout_default.  pg_regress
tests should retrieve it via \getenv.  I would like to back-patch the TAP
part, for cause (1).  (The pg_regress part hasn't been a buildfarm problem,
and \getenv is new in v15.)  Patches attached.  I considered and excluded
other changes, for now:

a. I considered consolidating this with PGISOLATIONTIMEOUT (default 300).  One
   could remove the older variable entirely or make isolationtester use the
   first-available of [PGISOLATIONTIMEOUT, 2 * PG_TEST_TIMEOUT_DEFAULT, 360].
   Does anyone have an opinion on what, if anything, to do there?

b. I briefly made stats.sql accept PG_TEST_TIMEOUT_DEFAULT to override its
   hard-coded 30s timeout.  However, a higher timeout won't help when a UDP
   buffer fills.  If the test were structured to observe evidence of a vacant
   UDP buffer before proceeding with the test stat messages, a higher timeout
   could make more sense.  I added a comment.

c. One could remove timeout-duration function arguments (e.g. from
   pg_recvlogical_upto) and just have the function consult timeout_default.
   This felt like highly-optional refactoring.

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Masahiko Sawada
Дата:
Сообщение: Re: Fix a comment in worker.c
Следующее
От: Andres Freund
Дата:
Сообщение: Re: Timeout control within tests