Обсуждение: BUG #15594: Unstable tests in contrib/test_decoding/output_iso/

Поиск
Список
Период
Сортировка

BUG #15594: Unstable tests in contrib/test_decoding/output_iso/

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      15594
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 11.1
Operating system:   Ubuntu 18.04
Description:

Some tests in contrib/test_decoding/output_iso/ occasionally fail.
To reproduce it reliable you can do:
# sed 's/ mxact / $(shell printf "mxact %.0s" `seq 1000` ) /' -i
contrib/test_decoding/Makefile
# make check -C contrib/test_decoding/

...
test delayed_startup              ... ok
test ondisk_startup               ... ok
test concurrent_ddl_dml           ... ok
test oldest_xmin                  ... ok
test snapshot_transfer            ... ok
============== shutting down postmaster               ==============

==========================
 19 of 1005 tests failed. 
==========================

The differences that caused some tests to fail can be viewed in the
file ".../contrib/test_decoding/output_iso/regression.diffs".  A copy of the
test summary that you see
above is saved in the file
".../contrib/test_decoding/output_iso/regression.out".

The diff file contains:
--- .../contrib/test_decoding/expected/delayed_startup.out  2018-02-08
16:57:42.816744192 +0300
+++ .../contrib/test_decoding/output_iso/results/delayed_startup.out
2019-01-16 17:46:33.138962021 +0300
@@ -12,6 +12,8 @@
 step s2start: SELECT data FROM
pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids',
'false');
 data

+BEGIN          
+COMMIT         
 step s1b: BEGIN ISOLATION LEVEL SERIALIZABLE;
 step s1w: INSERT INTO do_write DEFAULT VALUES;
 step s1c: COMMIT;

======================================================================
...


I observe this issue with the tests: mxact, delayed_startup, and
ondisk_startup.

Best regards,
Alexander


Re: BUG #15594: Unstable tests in contrib/test_decoding/output_iso/

От
Andres Freund
Дата:
On 2019-01-16 14:56:34 +0000, PG Bug reporting form wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      15594
> Logged by:          Alexander Lakhin
> Email address:      exclusion@gmail.com
> PostgreSQL version: 11.1
> Operating system:   Ubuntu 18.04
> Description:        
> 
> Some tests in contrib/test_decoding/output_iso/ occasionally fail.
> To reproduce it reliable you can do:
> # sed 's/ mxact / $(shell printf "mxact %.0s" `seq 1000` ) /' -i
> contrib/test_decoding/Makefile
> # make check -C contrib/test_decoding/
> 
> ...
> test delayed_startup              ... ok
> test ondisk_startup               ... ok
> test concurrent_ddl_dml           ... ok
> test oldest_xmin                  ... ok
> test snapshot_transfer            ... ok
> ============== shutting down postmaster               ==============
> 
> ==========================
>  19 of 1005 tests failed. 
> ==========================

> The differences that caused some tests to fail can be viewed in the
> file ".../contrib/test_decoding/output_iso/regression.diffs".  A copy of the
> test summary that you see
> above is saved in the file
> ".../contrib/test_decoding/output_iso/regression.out".
> 
> The diff file contains:
> --- .../contrib/test_decoding/expected/delayed_startup.out  2018-02-08
> 16:57:42.816744192 +0300
> +++ .../contrib/test_decoding/output_iso/results/delayed_startup.out       
> 2019-01-16 17:46:33.138962021 +0300
> @@ -12,6 +12,8 @@
>  step s2start: SELECT data FROM
> pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids',
> 'false');
>  data
> 
> +BEGIN          
> +COMMIT         
>  step s1b: BEGIN ISOLATION LEVEL SERIALIZABLE;
>  step s1w: INSERT INTO do_write DEFAULT VALUES;
>  step s1c: COMMIT;
> 
> ======================================================================
> ...
> 
> 
> I observe this issue with the tests: mxact, delayed_startup, and
> ondisk_startup.

Interesting - I've not seen that on the buildfarm before, or in local tests
with a lot of repetitions. I assume the problem vanishes if you add 'skip-empty-xacts', '1'
to the options of pg_logical_slot_get_changes()?

Greetings,

Andres Freund


Re: BUG #15594: Unstable tests in contrib/test_decoding/output_iso/

От
Alexander Lakhin
Дата:
16.01.2019 22:09, Andres Freund wrote:
> On 2019-01-16 14:56:34 +0000, PG Bug reporting form wrote:
>> I observe this issue with the tests: mxact, delayed_startup, and
>> ondisk_startup.
> Interesting - I've not seen that on the buildfarm before, or in local tests
> with a lot of repetitions. I assume the problem vanishes if you add 'skip-empty-xacts', '1'
> to the options of pg_logical_slot_get_changes()?
>
> Greetings,
>
> Andres Freund
Thank you, Andres.
Yes, it helps. After applying the attached patch the issue is gone.
Just to note that mxact.out and ondisk_startup.out expected an empty
transaction.
If it's not important, then the fix is ready.

Best regards,
Alexander

Вложения

Re: BUG #15594: Unstable tests in contrib/test_decoding/output_iso/

От
Petr Jelinek
Дата:
On 16/01/2019 20:09, Andres Freund wrote:
> On 2019-01-16 14:56:34 +0000, PG Bug reporting form wrote:
>> The following bug has been logged on the website:
>>
>> Bug reference:      15594
>> Logged by:          Alexander Lakhin
>> Email address:      exclusion@gmail.com
>> PostgreSQL version: 11.1
>> Operating system:   Ubuntu 18.04
>> Description:        
>>
>> Some tests in contrib/test_decoding/output_iso/ occasionally fail.
>> To reproduce it reliable you can do:
>> # sed 's/ mxact / $(shell printf "mxact %.0s" `seq 1000` ) /' -i
>> contrib/test_decoding/Makefile
>> # make check -C contrib/test_decoding/
>>
>> ...
>> test delayed_startup              ... ok
>> test ondisk_startup               ... ok
>> test concurrent_ddl_dml           ... ok
>> test oldest_xmin                  ... ok
>> test snapshot_transfer            ... ok
>> ============== shutting down postmaster               ==============
>>
>> ==========================
>>  19 of 1005 tests failed. 
>> ==========================
> 
>> The differences that caused some tests to fail can be viewed in the
>> file ".../contrib/test_decoding/output_iso/regression.diffs".  A copy of the
>> test summary that you see
>> above is saved in the file
>> ".../contrib/test_decoding/output_iso/regression.out".
>>
>> The diff file contains:
>> --- .../contrib/test_decoding/expected/delayed_startup.out  2018-02-08
>> 16:57:42.816744192 +0300
>> +++ .../contrib/test_decoding/output_iso/results/delayed_startup.out       
>> 2019-01-16 17:46:33.138962021 +0300
>> @@ -12,6 +12,8 @@
>>  step s2start: SELECT data FROM
>> pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids',
>> 'false');
>>  data
>>
>> +BEGIN          
>> +COMMIT         
>>  step s1b: BEGIN ISOLATION LEVEL SERIALIZABLE;
>>  step s1w: INSERT INTO do_write DEFAULT VALUES;
>>  step s1c: COMMIT;
>>
>> ======================================================================
>> ...
>>
>>
>> I observe this issue with the tests: mxact, delayed_startup, and
>> ondisk_startup.
> 
> Interesting - I've not seen that on the buildfarm before, or in local tests
> with a lot of repetitions. I assume the problem vanishes if you add 'skip-empty-xacts', '1'
> to the options of pg_logical_slot_get_changes()?
> 

I bet those empty transactions are from autovacuum. The probability of
hitting this could be exaggerated by the fact that the test is run
within the same instance 1000x rather than running the make check 1000x
so we produce plenty of catalog bloat for autovacuum to clean.

-- 
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


Re: BUG #15594: Unstable tests in contrib/test_decoding/output_iso/

От
Alexander Lakhin
Дата:
17.01.2019 13:25, Petr Jelinek wrote:
> On 16/01/2019 20:09, Andres Freund wrote:
>> On 2019-01-16 14:56:34 +0000, PG Bug reporting form wrote:
>>> The following bug has been logged on the website:
>>>
>>> Bug reference:      15594
>>> Logged by:          Alexander Lakhin
>>> Email address:      exclusion@gmail.com
>>> PostgreSQL version: 11.1
>>> Operating system:   Ubuntu 18.04
>>> Description:        
>>>
>>> Some tests in contrib/test_decoding/output_iso/ occasionally fail.
>>> To reproduce it reliable you can do:
>>> # sed 's/ mxact / $(shell printf "mxact %.0s" `seq 1000` ) /' -i
>>> contrib/test_decoding/Makefile
>>> # make check -C contrib/test_decoding/
>>>
>>> ...
>>>
>>> I observe this issue with the tests: mxact, delayed_startup, and
>>> ondisk_startup.
>> Interesting - I've not seen that on the buildfarm before, or in local tests
>> with a lot of repetitions. I assume the problem vanishes if you add 'skip-empty-xacts', '1'
>> to the options of pg_logical_slot_get_changes()?
>>
> I bet those empty transactions are from autovacuum. The probability of
> hitting this could be exaggerated by the fact that the test is run
> within the same instance 1000x rather than running the make check 1000x
> so we produce plenty of catalog bloat for autovacuum to clean.
>
Yes, it's almost impossible to hit this with `make check`, but with
`make installcheck` the probability is much higher as we use an existing
instance.
I've reproduced it with `make check` just to make it simple.

Best regards,
Alexander


Re: BUG #15594: Unstable tests in contrib/test_decoding/output_iso/

От
Andres Freund
Дата:
On 2019-01-17 13:42:40 +0300, Alexander Lakhin wrote:
> 17.01.2019 13:25, Petr Jelinek wrote:
> > On 16/01/2019 20:09, Andres Freund wrote:
> >> On 2019-01-16 14:56:34 +0000, PG Bug reporting form wrote:
> >>> The following bug has been logged on the website:
> >>>
> >>> Bug reference:      15594
> >>> Logged by:          Alexander Lakhin
> >>> Email address:      exclusion@gmail.com
> >>> PostgreSQL version: 11.1
> >>> Operating system:   Ubuntu 18.04
> >>> Description:        
> >>>
> >>> Some tests in contrib/test_decoding/output_iso/ occasionally fail.
> >>> To reproduce it reliable you can do:
> >>> # sed 's/ mxact / $(shell printf "mxact %.0s" `seq 1000` ) /' -i
> >>> contrib/test_decoding/Makefile
> >>> # make check -C contrib/test_decoding/
> >>>
> >>> ...
> >>>
> >>> I observe this issue with the tests: mxact, delayed_startup, and
> >>> ondisk_startup.
> >> Interesting - I've not seen that on the buildfarm before, or in local tests
> >> with a lot of repetitions. I assume the problem vanishes if you add 'skip-empty-xacts', '1'
> >> to the options of pg_logical_slot_get_changes()?
> >>
> > I bet those empty transactions are from autovacuum. The probability of
> > hitting this could be exaggerated by the fact that the test is run
> > within the same instance 1000x rather than running the make check 1000x
> > so we produce plenty of catalog bloat for autovacuum to clean.
> >
> Yes, it's almost impossible to hit this with `make check`, but with
> `make installcheck` the probability is much higher as we use an existing
> instance.
> I've reproduced it with `make check` just to make it simple.

I'll apply the fix for this, but please note that installcheck
*intentionally* is disabled for test_decoding.

Greetings,

Andres Freund


Re: BUG #15594: Unstable tests in contrib/test_decoding/output_iso/

От
Alexander Lakhin
Дата:
17.01.2019 20:39, Andres Freund wrote:
> On 2019-01-17 13:42:40 +0300, Alexander Lakhin wrote:
>> 17.01.2019 13:25, Petr Jelinek wrote:
>>> On 16/01/2019 20:09, Andres Freund wrote:
>>>> On 2019-01-16 14:56:34 +0000, PG Bug reporting form wrote:
>>>>> The following bug has been logged on the website:
>>>>>
>>>>> Bug reference:      15594
>>>>> Logged by:          Alexander Lakhin
>>>>> Email address:      exclusion@gmail.com
>>>>> PostgreSQL version: 11.1
>>>>> Operating system:   Ubuntu 18.04
>>>>> Description:        
>>>>>
>>>>> Some tests in contrib/test_decoding/output_iso/ occasionally fail.
>>>>> To reproduce it reliable you can do:
>>>>> # sed 's/ mxact / $(shell printf "mxact %.0s" `seq 1000` ) /' -i
>>>>> contrib/test_decoding/Makefile
>>>>> # make check -C contrib/test_decoding/
>>>>>
>>>>> ...
>>>>>
>>>>> I observe this issue with the tests: mxact, delayed_startup, and
>>>>> ondisk_startup.
>>>> Interesting - I've not seen that on the buildfarm before, or in local tests
>>>> with a lot of repetitions. I assume the problem vanishes if you add 'skip-empty-xacts', '1'
>>>> to the options of pg_logical_slot_get_changes()?
>>>>
>>> I bet those empty transactions are from autovacuum. The probability of
>>> hitting this could be exaggerated by the fact that the test is run
>>> within the same instance 1000x rather than running the make check 1000x
>>> so we produce plenty of catalog bloat for autovacuum to clean.
>>>
>> Yes, it's almost impossible to hit this with `make check`, but with
>> `make installcheck` the probability is much higher as we use an existing
>> instance.
>> I've reproduced it with `make check` just to make it simple.
> I'll apply the fix for this, but please note that installcheck
> *intentionally* is disabled for test_decoding.
Thanks, Andres! I see a comment in Makefile:
# Disabled because these tests require "wal_level=logical", which
# typical installcheck users do not have (e.g. buildfarm clients).

So if I use for the installcheck'ed instance a config with the required
wal_level, is it ok to perform `make installcheck-force -C
contrib/test_decoding`?
Or maybe there is some other difficulty with installcheck?

Best regards,
Alexander