Re: fairywren timeout failures on the pg_amcheck/005_opclass_damage test

Поиск
Список
Период
Сортировка
От Alexander Lakhin
Тема Re: fairywren timeout failures on the pg_amcheck/005_opclass_damage test
Дата
Msg-id 94c920db-15b0-ea35-d1a9-495678488745@gmail.com
обсуждение исходный текст
Ответ на fairywren timeout failures on the pg_amcheck/005_opclass_damage test  (Alexander Lakhin <exclusion@gmail.com>)
Ответы Re: fairywren timeout failures on the pg_amcheck/005_opclass_damage test
Список pgsql-hackers
25.07.2024 15:00, Alexander Lakhin wrote:

>
> The other question is: why is 005_opclass_damage taking so much time there?
> ...
> $ make -s check -C src/bin/pg_amcheck/ PROVE_TESTS="t/005*" PROVE_FLAGS="--timer"
> [11:11:53] t/005_opclass_damage.pl .. ok     1370 ms ( 0.00 usr 0.00 sys +  0.10 cusr  0.07 csys =  0.17 CPU)
>
> $ echo "debug_parallel_query = regress" >/tmp/extra.config
> $ TEMP_CONFIG=/tmp/extra.config make -s check -C src/bin/pg_amcheck/ PROVE_TESTS="t/005*" PROVE_FLAGS="--timer"
> [11:12:46] t/005_opclass_damage.pl .. ok    40854 ms ( 0.00 usr 0.00 sys +  0.10 cusr  0.10 csys =  0.20 CPU)
>
> ...
> So maybe at least this test should be improved for testing with
> debug_parallel_query enabled, if such active use of parallel workers by
> pg_amcheck can't be an issue to end users?
>

When running this test with "log_min_messages = DEBUG2" in my extra.config,
I see thousands of the following messages in the test log:
2024-07-26 09:32:54.544 UTC [2572189:46] DEBUG:  postmaster received pmsignal signal
2024-07-26 09:32:54.544 UTC [2572189:47] DEBUG:  registering background worker "parallel worker for PID 2572197"
2024-07-26 09:32:54.544 UTC [2572189:48] DEBUG:  starting background worker process "parallel worker for PID 2572197"
2024-07-26 09:32:54.547 UTC [2572189:49] DEBUG:  unregistering background worker "parallel worker for PID 2572197"
2024-07-26 09:32:54.547 UTC [2572189:50] DEBUG:  background worker "parallel worker" (PID 2572205) exited with exit
code0
 
2024-07-26 09:32:54.547 UTC [2572189:51] DEBUG:  postmaster received pmsignal signal
2024-07-26 09:32:54.547 UTC [2572189:52] DEBUG:  registering background worker "parallel worker for PID 2572197"
2024-07-26 09:32:54.547 UTC [2572189:53] DEBUG:  starting background worker process "parallel worker for PID 2572197"
2024-07-26 09:32:54.549 UTC [2572189:54] DEBUG:  unregistering background worker "parallel worker for PID 2572197"
2024-07-26 09:32:54.549 UTC [2572189:55] DEBUG:  background worker "parallel worker" (PID 2572206) exited with exit
code0
 
...

grep ' registering background worker' \
  src/bin/pg_amcheck/tmp_check/log/005_opclass_damage_test.log | wc -l
15669

So this test launches more than 15000 processes when debug_parallel_query
is enabled.

As far as I can see, this is happening because of the "PARALLEL UNSAFE"
marking is ignored when the function is called by CREATE INDEX/amcheck.

Namely, with a function defined as:
     CREATE FUNCTION int4_asc_cmp (a int4, b int4) RETURNS int LANGUAGE sql AS $$
         SELECT CASE WHEN $1 = $2 THEN 0 WHEN $1 > $2 THEN 1 ELSE -1 END; $$;

SELECT int4_asc_cmp(1, 2);
executed without parallel workers. Whilst when it's used by an index:
CREATE OPERATOR CLASS int4_fickle_ops FOR TYPE int4 USING btree AS
...
OPERATOR 5 > (int4, int4), FUNCTION 1 int4_asc_cmp(int4, int4);

INSERT INTO int4tbl (SELECT * FROM generate_series(1,1000) gs);

CREATE INDEX fickleidx ON int4tbl USING btree (i int4_fickle_ops);
launches 1000 parallel workers.

(This is reminiscent of bug #18314.)

One way to workaround this is to disable debug_parallel_query in the test
and another I find possible is to set max_parallel_workers = 0.

Best regards,
Alexander



В списке pgsql-hackers по дате отправления:

Предыдущее
От: shveta malik
Дата:
Сообщение: Re: Conflict detection and logging in logical replication
Следующее
От: Yao Wang
Дата:
Сообщение: Re: 回复: An implementation of multi-key sort