Re: Logical replication timeout problem

From: Fabrice Chapuis
Subject: Re: Logical replication timeout problem
Date:
Msg-id CAA5-nLABf97QKAR8K8NiQs2s6_323dvd7kpAdJ3GZ+p2iR5K7A@mail.gmail.com
In response to: Re: Logical replication timeout problem  (Fabrice Chapuis <fabrice636861@gmail.com>)
Responses: Re: Logical replication timeout problem  (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
Hello,
Our lab is ready now. Amit, I compiled Postgres 10.18 with your patch. Tang, I used your script to configure logical replication between 2 databases and to generate 10 million entries in an unreplicated foo table. On a standalone instance, no error message appears in the log.
I then activated physical replication between 2 nodes and got the following error:

2021-11-10 10:49:12.297 CET [12126] LOG:  attempt to send keep alive message
2021-11-10 10:49:12.297 CET [12126] STATEMENT:  START_REPLICATION 0/3000000 TIMELINE 1
2021-11-10 10:49:15.127 CET [12064] FATAL:  terminating logical replication worker due to administrator command
2021-11-10 10:49:15.127 CET [12036] LOG:  worker process: logical replication worker for subscription 16413 (PID 12064) exited with exit code 1
2021-11-10 10:49:15.155 CET [12126] LOG:  attempt to send keep alive message

This message looks strange because no admin command was executed during the data load.
I did not find any error related to the timeout.
The message added by the patch, "attempt to send keep alive message", shows up all the time, but there is never a "sent keep alive message".
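A quick way to confirm this imbalance across a whole log file is to count both messages. The following is only a minimal sketch: the helper name `count_keepalive_messages` is hypothetical, and it assumes the message texts are exactly the two quoted in this mail.

```python
import re
from collections import Counter

# Message texts as quoted in this mail. The exact wording of the "sent"
# message is an assumption based on the quote, not on the patch source.
PATTERNS = {
    "attempt": re.compile(r"LOG:\s+attempt to send keep alive message"),
    "sent": re.compile(r"LOG:\s+sent keep alive message"),
}


def count_keepalive_messages(lines):
    """Count 'attempt' and 'sent' keepalive log lines in an iterable of lines."""
    counts = Counter({name: 0 for name in PATTERNS})
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Sample lines copied from the log excerpt in this mail.
sample = [
    "2021-11-10 10:49:12.297 CET [12126] LOG:  attempt to send keep alive message",
    "2021-11-10 10:49:12.297 CET [12126] STATEMENT:  START_REPLICATION 0/3000000 TIMELINE 1",
    "2021-11-10 10:49:15.155 CET [12126] LOG:  attempt to send keep alive message",
]
counts = count_keepalive_messages(sample)
print(counts["attempt"], counts["sent"])
```

On a real server log, the same function can be fed an open file object instead of the sample list.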

Why does the logical replication worker exit when physical replication is configured?

Thanks for your help

Fabrice



On Fri, Oct 8, 2021 at 9:33 AM Fabrice Chapuis <fabrice636861@gmail.com> wrote:
Thanks Tang for your script. 
Our debugging environment will be ready soon. I will test your script and we will try to reproduce the problem by integrating the patch provided by Amit. As soon as I have results I will let you know.

Regards

Fabrice

On Thu, Sep 30, 2021 at 3:15 AM Tang, Haiying/唐 海英 <tanghy.fnst@fujitsu.com> wrote:

On Friday, September 24, 2021 12:04 AM, Fabrice Chapuis <fabrice636861@gmail.com> wrote:

>

> Thanks for your patch, we are going to set up a lab in order to debug the function.

 

Hi

 

I tried to reproduce this timeout problem on version 10.18 but failed.

In my trial, I inserted large amounts of data on the publisher, which took more than 1 minute to replicate.

And with the patch provided by Amit, I saw that the WalSndKeepaliveIfNecessary function was invoked more frequently after I inserted the data.

 

The test script is attached. Maybe you can try it on your machine and check if this problem could happen.

If I missed something in the script, please let me know.

Of course, it would be better if you could provide your own script to reproduce the problem.

 

Regards

Tang
