Re: Random pg_upgrade test failure on drongo

From: Alexander Lakhin
Subject: Re: Random pg_upgrade test failure on drongo
Msg-id: 685bc1bd-fd46-2747-b45f-5c700e5a7c65@gmail.com
In response to: RE: Random pg_upgrade test failure on drongo  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Responses: Re: Random pg_upgrade test failure on drongo
List: pgsql-hackers
Hello Kuroda-san,

09.01.2024 08:49, Hayato Kuroda (Fujitsu) wrote:
> Based on the suggestion by Amit, I have created a patch with the alternative
> approach. This just does GUC settings. The reported failure is only for
> 003_logical_slots, but the patch also includes changes for the recently added
> test, 004_subscription. IIUC, there is a possibility that 004 would fail as well.
>
> Per our understanding, this patch can stop random failures. Alexander, can you
> test for the confirmation?
>

Yes, the patch fixes the issue for me (without the patch I observe failures
on iterations 1-2, with 10 tests running in parallel, but with the patch
10 iterations succeeded).

But as far as I can see, 004_subscription is not affected by the issue,
because it doesn't enable streaming for nodes new_sub, new_sub1.
As I noted before, I could see the failure only with
shared_buffers = 1MB (which is set with allows_streaming => 'logical').
So I'm not sure whether we need to modify 004 (or any other test that
runs pg_upgrade).

As to checkpoint_timeout, personally I would not increase it, because it
seems unbelievable to me that pg_restore (with the cluster containing only
two empty databases) can run for longer than 5 minutes. I'd rather
investigate such a situation separately, in case we encounter it, but maybe
it's only me.
On the other hand, if a checkpoint could occur for some reason within a
shorter time span, then increasing the timeout would not matter, I suppose.
(I've also tested the bgwriter_lru_maxpages-only modification of your patch
and can confirm that it works as well.)
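
For reference, the bgwriter_lru_maxpages-only variant could be sketched as the
following postgresql.conf fragment for the new cluster. The patch itself is not
reproduced in this message, so the value 0 (which disables background writing)
and the placement are assumptions rather than the patch's actual contents:

```
# Assumed settings for the new cluster used by the pg_upgrade test;
# not copied from the patch.
bgwriter_lru_maxpages = 0      # 0 disables background writing by bgwriter
#checkpoint_timeout = 5min     # left at its default in this variant
```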

Best regards,
Alexander


