Re: Intermittent Issue with WAL Segment Removal in Logical Replication

From Tomas Vondra
Subject Re: Intermittent Issue with WAL Segment Removal in Logical Replication
Date
Msg-id 51b620b8-d13b-649e-4f1d-c57f1d73eb87@enterprisedb.com
In reply to Intermittent Issue with WAL Segment Removal in Logical Replication  (Kaushik Iska <kaushik@peerdb.io>)
List pgsql-general
On 12/29/23 22:28, Kaushik Iska wrote:
> I am unfortunately not really familiar with Google Cloud SQL internals
> as well. But we have seen this happen on Amazon RDS as well.
> 

Do you have a reproducer for regular Postgres?

> Could it be possible that we are requesting a future WAL segment, say
> WAL upto X is written and we are asking for X + 1? It could be that the
> error message is misleading.
> 

I don't think that should be possible. The LSN in the START_REPLICATION
command comes from the replica, where it's tracked as the last LSN
received from the upstream. So it shouldn't be in the future. And it
doesn't seem to be suspiciously close to a segment boundary either.

In fact, the LSN in the message is 6/5AE67D79, but the "failed" segment
is 000000010000000600000059, which is the *preceding* one. So it can't
be in the future.
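
To make the LSN-to-segment mapping above easy to check, here is a minimal
sketch of the arithmetic. It assumes the default 16MB wal_segment_size and
timeline 1 (the leading 00000001 in the segment name); the function name
wal_segment_name is only illustrative and is not part of any library.

```python
def wal_segment_name(lsn: str, timeline: int = 1,
                     wal_segment_size: int = 16 * 1024 * 1024) -> str:
    """Return the WAL file name that contains the given LSN.

    Assumes the default 16MB segment size unless told otherwise.
    """
    # An LSN is printed as two hex numbers: high 32 bits / low 32 bits.
    hi_str, lo_str = lsn.split("/")
    position = (int(hi_str, 16) << 32) | int(lo_str, 16)

    # Sequential segment number since the start of the WAL stream.
    segno = position // wal_segment_size

    # Number of segments that fit into one 4GB "xlogid" range.
    segs_per_xlogid = 0x100000000 // wal_segment_size

    # File name = timeline, then segno split into two 8-digit hex fields.
    return "%08X%08X%08X" % (timeline,
                             segno // segs_per_xlogid,
                             segno % segs_per_xlogid)


requested = wal_segment_name("6/5AE67D79")
failed = "000000010000000600000059"
print(requested)           # segment holding the requested LSN
print(failed < requested)  # True if the failed segment precedes it
```

On a running server the same mapping is available as
pg_walfile_name('6/5AE67D79'), which also honours a non-default segment
size.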

> I do not have the information from pg_replication_slots as I have
> terminated the test. I am fairly certain that I can reproduce this
> again. I will gather both the restart_lsn and contents of pg_wal for the
> failed segment. Is there any other information that would help debug
> this further?
> 

Hard to say. The best thing would be to have a reproducer script, of
course. If that's not possible, the information already requested
(restart_lsn from pg_replication_slots and the contents of pg_wal)
seems like a good start.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


