Re: [PATCH TEST] Fix logical replication setup in subscription test `t/009_matviews.pl`
From | Michael Paquier |
---|---|
Subject | Re: [PATCH TEST] Fix logical replication setup in subscription test `t/009_matviews.pl` |
Date | |
Msg-id | aOhwNHujMnt5YqaM@paquier.xyz |
In reply to | [PATCH TEST] Fix logical replication setup in subscription test `t/009_matviews.pl` (Грем Снорт <grem.snoort@gmail.com>) |
Replies |
RE: [PATCH TEST] Fix logical replication setup in subscription test `t/009_matviews.pl`
|
List | pgsql-hackers |
On Thu, Oct 09, 2025 at 03:37:25PM +0300, Грем Снорт wrote:
> I've found a simple problem in one of subscription tests
> (`src/test/subscription/t/009_matviews.pl`).

(Added a couple of folks in CC.)

Hmm, something else is going on here, and I am not sure what yet (a bisect is annoying as the test depends on a timeout for failure detection, see below for more ranting).

The backend change coupled with this test comes from bc1adc651b8e, first introduced in v11. At the top of REL_11_STABLE, which is the first branch where the test has been introduced, if I update pgoutput.c and remove the is_publishable_relation() call in pgoutput_change() to undo the fix, then the test hangs, as it is designed to. Now, if I do the same thing on HEAD, removing the check, then the test passes! Something else is going on here: the test is not checking what it has been written for. Applying your patch does not change this state.

As far as I can see, the test has been broken since v17. Up to v16, the test would hang once the fix in pgoutput.c is reverted. In v17 and newer versions, it does not.

While something specific to v17 is to blame here, I am also going to complain about the way this test is written and designed to fail: a failing scenario should be deterministic, and should check some state in the cluster to validate something, be it a lookup at some relation, some catalogs or some server logs. 009_matviews.pl does nothing like that: a failure is a test hanging, with the failure detected by a timeout. From my perspective, this is a poor design choice, and one reason why nobody has noticed the regression in v17 that I am only finding now, after looking more closely because of your patch.

Amit, Kurada-san or Sawada-san, does something ring a bell? There have been many changes in the logical replication code since v17, and it sounds like an issue introduced by one of these recent changes, but I have to admit that I am not seeing anything obvious (that's not dcd4454590e7, checked it).
Up to v16, the test loops with the following failure popping in the subscriber logs:

2025-10-10 11:24:15.884 JST [25148] ERROR: logical replication target relation "public.testmv1" does not exist
2025-10-10 11:24:15.884 JST [25148] CONTEXT: processing remote data for replication origin "pg_16391" during message type "INSERT" in transaction 733, finished at 0/14BBE08

From v17, the subscriber just accepts things, without the worker complaining about a matview:

2025-10-10 11:27:10.020 JST [32467] LOG: logical replication table synchronization worker for subscription "mysub", table "test1" has started
2025-10-10 11:27:10.041 JST [32467] LOG: logical replication table synchronization worker for subscription "mysub", table "test1" has finished
2025-10-10 11:27:10.120 JST [32443] LOG: received fast shutdown request

I am attempting a bisect, as well, perhaps I'll be able to catch something...
--
Michael
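
To illustrate the deterministic check argued for above, here is a minimal sketch that looks for the apply worker's error in the subscriber log text instead of waiting for a timeout. It is only an illustration: the function name `missing_target_relations` and the two sample constants are hypothetical, the sample log lines are copied from this message, and in the TAP test itself such a check would be written in Perl rather than Python.

```python
import re

# Error emitted by the apply worker when a change arrives for a relation
# that does not exist on the subscriber (as seen in the v16 log above).
MISSING_REL = re.compile(
    r'ERROR:\s+logical replication target relation "([^"]+)" does not exist'
)


def missing_target_relations(log_text: str) -> list[str]:
    """Return the relations reported as missing, in order of appearance."""
    return MISSING_REL.findall(log_text)


# Sample subscriber log lines copied from this message (illustrative input).
V16_LOG = (
    '2025-10-10 11:24:15.884 JST [25148] ERROR: logical replication target '
    'relation "public.testmv1" does not exist\n'
    '2025-10-10 11:24:15.884 JST [25148] CONTEXT: processing remote data for '
    'replication origin "pg_16391" during message type "INSERT" in '
    'transaction 733, finished at 0/14BBE08\n'
)
V17_LOG = (
    '2025-10-10 11:27:10.020 JST [32467] LOG: logical replication table '
    'synchronization worker for subscription "mysub", table "test1" has started\n'
    '2025-10-10 11:27:10.041 JST [32467] LOG: logical replication table '
    'synchronization worker for subscription "mysub", table "test1" has finished\n'
    '2025-10-10 11:27:10.120 JST [32443] LOG: received fast shutdown request\n'
)

if __name__ == "__main__":
    print(missing_target_relations(V16_LOG))
    print(missing_target_relations(V17_LOG))
```

A check of this shape gives an explicit pass or fail on an observable state (the presence or absence of the error in the log), so a behavior change like the one seen from v17 would show up as a test failure rather than as a silent pass.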