Dropping publication breaks logical replication
From | Ashutosh Bapat |
---|---|
Subject | Dropping publication breaks logical replication |
Date | |
Msg-id | CAExHW5s_vBPb_8o0kBbC1PZo3Qb-7NfzReD53iynkR-jQmjd-w@mail.gmail.com |
Replies | Re: Dropping publication breaks logical replication |
List | pgsql-hackers |
Hi Vignesh, Amit,

We encountered a situation where a customer accidentally dropped a publication, and that broke logical replication irrecoverably. This is PG 15.3, but the team confirmed that the behaviour is reproducible with PG 17 as well.

When a WAL sender processes a WAL record recording a change in a publication, it ends up calling LoadPublication(), which throws an error if a publication mentioned in the START_REPLICATION command is not found. The downstream tries to reconnect, but the WAL sender repeats the same process and ends up in an error loop. Creating the publication again does not help, since the WAL sender will always encounter the WAL record dropping the publication first.

There are ways to get out of this situation, but they are not always clean:

1. Remove the publication from the subscription, run logical replication until it passes the point where the publication was added, add the publication back and continue. It's not always possible to know when the publication was added back, and thus it becomes tedious or next to impossible to apply these steps.
2. Reseed the replication slot, which involves copying all the data again and is not feasible for large databases.
3. Skip the transaction which dropped the publication. This works if DROP PUBLICATION was the only thing in that transaction, but not otherwise. Confirming that is tricky and requires some expert help.

From PG 18 onwards, this behaviour is fixed by throwing a WARNING instead of an error. In the relevant thread [1], where the fix for PG 18 was discussed, backpatching was also discussed. Back then it was deferred because of a lack of field reports. But we are seeing this situation now, so maybe it's time to backpatch the fix.

Further, the PG 15 documentation mentions that https://www.postgresql.org/docs/15/sql-createsubscription.html. So users will expect that their logical replication will not be affected (except for the data published by the publication) if a publication is dropped or does not exist. So, backpatching the change would make the behaviour compatible with the documentation.

The backport seems to be straightforward. Please let me know if you need my help with it, if we decide to backport the fix.

--
Best Wishes,
Ashutosh Bapat
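
To illustrate why re-creating the publication does not break the error loop described in the message, here is a toy model in Python. It is not PostgreSQL code: `WalRecord`, `decode`, `PublicationMissing` and the `missing_ok` flag are hypothetical names invented for this sketch, and the only behaviour it assumes is what the message states (a publication change makes the WAL sender reload the requested publications, and a missing one is an ERROR before PG 18 and a WARNING from PG 18 onwards).

```python
import warnings
from dataclasses import dataclass


class PublicationMissing(Exception):
    """Toy stand-in for the ERROR raised when a requested publication is not found."""


@dataclass(frozen=True)
class WalRecord:
    lsn: int
    kind: str               # "insert", "drop_publication" or "create_publication"
    publication: str = ""   # only set for publication changes


def decode(records, requested, start_lsn, missing_ok):
    """Replay `records` in LSN order and return the last LSN decoded.

    The catalog is tracked as of each record, so a publication re-created
    at a later LSN is still missing while the earlier drop is decoded.
    """
    existing = set(requested)  # assumption: all requested publications exist initially
    decoded_upto = start_lsn
    for rec in sorted(records, key=lambda r: r.lsn):
        changed = rec.kind in ("drop_publication", "create_publication")
        if rec.kind == "drop_publication":
            existing.discard(rec.publication)
        elif rec.kind == "create_publication":
            existing.add(rec.publication)
        if rec.lsn <= start_lsn:
            continue  # already confirmed by the downstream, nothing to send
        if changed:
            # A publication change forces the requested publications to be reloaded.
            for name in requested:
                if name not in existing:
                    msg = f'publication "{name}" does not exist (LSN {rec.lsn})'
                    if not missing_ok:
                        raise PublicationMissing(msg)
                    warnings.warn(msg)
        decoded_upto = rec.lsn
    return decoded_upto


wal = [
    WalRecord(1, "insert"),
    WalRecord(2, "drop_publication", "pub1"),
    WalRecord(3, "insert"),
    WalRecord(4, "create_publication", "pub1"),  # re-created after the accident
]

# Strict behaviour: every reconnect restarts from the same LSN and fails again.
for attempt in range(3):
    try:
        decode(wal, ["pub1"], start_lsn=1, missing_ok=False)
    except PublicationMissing as err:
        print(f"attempt {attempt + 1}: ERROR: {err}")

# Lenient behaviour: warn about the missing publication and keep decoding.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print("decoded up to LSN", decode(wal, ["pub1"], start_lsn=1, missing_ok=True))
    print("warnings:", len(caught))
```

In the strict case the confirmed position never advances past the drop record, so each retry hits the same failure even though the publication exists again at a later LSN. With `missing_ok=True` the model gets past the drop record and reaches the point where the publication is back.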