Fixing WAL instability in various TAP tests

From: Mark Dilger
Subject: Fixing WAL instability in various TAP tests
Msg-id: 32A1FDD1-9C7B-43B1-B3EE-49198DD3F887@enterprisedb.com
Responses: Re: Fixing WAL instability in various TAP tests  (Noah Misch <noah@leadboat.com>)
List: pgsql-hackers

Hackers,

A few TAP tests in the project appear to be sensitive to reductions of the PostgresNode's max_wal_size setting,
resulting in tests failing because WAL files have been removed too soon.  The failures in the logs are typically of the
"requested WAL segment %s has already been removed" variety.  I would expect tests which fail under legal alternate GUC
settings to be hardened to explicitly set the GUCs they need, rather than implicitly relying on the defaults.  As far
as missing WAL files go, I would expect the TAP test to prevent this with the use of replication slots or some other
mechanism, and not simply to rely on checkpoints not happening too soon.  I'm curious if others on this list disagree
with that point of view.

Failures in src/test/recovery/t/015_promotion_pages.pl can be fixed by creating a physical replication slot on node
"alpha" and using it from node "beta", a technique already used in other TAP tests and apparently merely overlooked in
this one.

The first two tests in src/bin/pg_basebackup/t fail, and it's not clear that physical replication slots are the
appropriate solution, since no replication is happening.  It's not immediately obvious that the tests are at fault
anyway.  On casual inspection, it seems they might be detecting a live bug which simply doesn't manifest under larger
values of max_wal_size.  Test 010 appears to show a bug with `pg_basebackup -X`, and test 020 with `pg_receivewal`.

The test in contrib/bloom/t/ is deliberately disabled in contrib/bloom/Makefile with a comment that the test is
unstable in the buildfarm, but when I chased down the email thread that gave rise to the related commit, I didn't find
anything to explain what exactly those buildfarm failures might have been.  That test happens to be stable on my laptop
until I change GUC settings both to reduce max_wal_size to 32MB and to set wal_consistency_checking=all.
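
For reference, the two settings named above could be written as the configuration fragment below. This is only a sketch of the values mentioned in this message; how they are applied to the test nodes is not stated here. One possible mechanism (an assumption, not something this message confirms) is to save the lines to a file and point the TEMP_CONFIG environment variable at it before running the tests.

```
# Hypothetical extra configuration for reproducing the failures.
# Values are taken from the message; the delivery mechanism is assumed.
max_wal_size = 32MB
wal_consistency_checking = 'all'
```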

Thoughts?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company