RE: Replication is stuck

From: Murthy Nunna
Subject: RE: Replication is stuck
Date:
Msg-id: DM8PR09MB66777AE5CEB40A788B00B114B8CB2@DM8PR09MB6677.namprd09.prod.outlook.com
In reply to: Re: Replication is stuck  (Ninad Shah <ninad.shah@percona.com>)
Responses: Re: Replication is stuck
List: pgsql-admin

Thanks, Ninad. It looks like there is an error in 0000000100013D94000000FF. Is there any way to tell whether this is logical or physical corruption? In other words, is this file system corruption, or did Postgres itself generate a corrupted file?

 

pg_waldump -q 0000000100013D94000000FE

[no errors]

 

pg_waldump -q 0000000100013D94000000FF

pg_waldump: fatal: error in WAL record at 13D94/FFBFFF48: invalid magic number 0000 in log segment 0000000100013D94000000FF, offset 12582912
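
The numbers in that error message can be cross-checked with a little arithmetic. The sketch below is only an illustration, assuming the default 16 MB segment size (which matches the 16777216-byte files in this thread) and the default 8 kB WAL page size; the helper names are made up for this example and are not part of any PostgreSQL tool.

```python
# Assumptions: default 16 MB WAL segments and default 8 kB WAL pages.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024
XLOG_BLCKSZ = 8192


def segment_start(filename, seg_size=WAL_SEGMENT_SIZE):
    """Return (timeline, start LSN as an integer) for a 24-hex-digit WAL file name."""
    timeline = int(filename[0:8], 16)
    log = int(filename[8:16], 16)
    seg = int(filename[16:24], 16)
    segs_per_log = 0x100000000 // seg_size
    return timeline, (log * segs_per_log + seg) * seg_size


def parse_lsn(text):
    """Convert an LSN written as 'HI/LO' (hex) into an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def format_lsn(lsn):
    """Format an integer LSN as 'HI/LO' in hex."""
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"


# Values copied from the pg_waldump error message above.
timeline, start = segment_start("0000000100013D94000000FF")
bad_offset = 12582912
record_lsn = parse_lsn("13D94/FFBFFF48")

bad_page_lsn = start + bad_offset
page_index = bad_offset // XLOG_BLCKSZ
gap = bad_page_lsn - record_lsn

print("segment starts at", format_lsn(start))    # 13D94/FF000000
print("bad page starts at", format_lsn(bad_page_lsn))  # 13D94/FFC00000
print("page index in segment:", page_index)      # 1536
print("bytes from record start to bad page:", gap)  # 184
```

Under those assumptions, the offset falls exactly on a page boundary, and the record named in the error begins 184 bytes before it. That would be consistent with the reader failing when it moved on to the next page and found a header whose magic number reads as 0000.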

 

pg_waldump -q 0000000100013D9500000000

[no errors]

 

 

From: Ninad Shah <ninad.shah@percona.com>
Sent: Sunday, June 23, 2024 7:16 AM
To: Murthy Nunna <mnunna@fnal.gov>
Cc: pgsql-admin@postgresql.org
Subject: Re: Replication is stuck

 


Hi Murthy,

 

Would you please generate a pg_waldump of 0000000100013D94000000FF, 0000000100013D94000000FE and 0000000100013D9500000000?


Thanks,

--

Ninad Shah
PostgreSQL DBA I, Managed Services

e: ninad.shah@percona.com

 w: www.percona.com

Databases Run Better With Percona

 

 

On Sun, Jun 23, 2024 at 5:32 PM Murthy Nunna <mnunna@fnal.gov> wrote:

I am running PostgreSQL 14.4. I use WAL replication to a standby server that is kept 7 days behind the primary (recovery_min_apply_delay = 7d).

 

My replication is stuck. It looks like it is repeatedly applying the same WAL file. The next WAL files are definitely there.

 

I restarted the cluster, but that didn’t fix the issue.

 

I would appreciate any help you can provide before I rebuild the standby. I am trying to find the root cause. If 0000000100013D94000000FF is corrupted, how can we tell?
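
One way to look at the file directly, independent of the server, is to walk it page by page. The sketch below is only an illustration and rests on several assumptions: WAL pages are 8 kB (the default), every page begins with a 2-byte magic number in the byte order of the machine that wrote it, and the script runs on a machine with the same byte order. Instead of hard-coding a magic value, it treats the first page of the segment as the reference. The function names are hypothetical.

```python
import hashlib
import struct

XLOG_BLCKSZ = 8192  # assumption: default WAL page size


def scan_wal_pages(path, page_size=XLOG_BLCKSZ):
    """Return [(offset, magic, all_zero)] for pages whose leading 2-byte
    magic differs from the magic found on the first page of the file."""
    suspects = []
    with open(path, "rb") as f:
        first = f.read(page_size)
        if len(first) < 2:
            return suspects
        (expected,) = struct.unpack_from("=H", first, 0)
        offset = len(first)
        while True:
            page = f.read(page_size)
            if not page:
                break
            magic = struct.unpack_from("=H", page, 0)[0] if len(page) >= 2 else None
            if magic != expected:
                # all_zero is True when every byte of the page is 0x00
                suspects.append((offset, magic, not any(page)))
            offset += len(page)
    return suspects


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling scan_wal_pages() on the suspect segment lists the offsets of pages that do not look like the first one and says whether each is entirely zero. If the same segment still exists somewhere else (for example on the primary or in another copy of the archive), comparing sha256_of() for the two copies shows whether they differ. Neither check proves where the damage came from; they only narrow down what the bytes on disk look like.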

 

2024-06-23 06:54:57 CDT []LOG:  restored log file "0000000100013D94000000FF" from archive

2024-06-23 06:55:02 CDT []LOG:  restored log file "0000000100013D94000000FF" from archive

2024-06-23 06:55:07 CDT []LOG:  restored log file "0000000100013D94000000FF" from archive

2024-06-23 06:55:12 CDT []LOG:  restored log file "0000000100013D94000000FF" from archive

2024-06-23 06:55:17 CDT []LOG:  restored log file "0000000100013D94000000FF" from archive

2024-06-23 06:55:22 CDT []LOG:  restored log file "0000000100013D94000000FF" from archive

2024-06-23 06:55:27 CDT []LOG:  restored log file "0000000100013D94000000FF" from archive

2024-06-23 06:55:32 CDT []LOG:  restored log file "0000000100013D94000000FF" from archive

2024-06-23 06:55:37 CDT []LOG:  restored log file "0000000100013D94000000FF" from archive

2024-06-23 06:55:42 CDT []LOG:  restored log file "0000000100013D94000000FF" from archive
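
A retry loop like the one above can be picked out of a server log mechanically. This is a minimal sketch that assumes log lines in the format shown here; the function name and the threshold are arbitrary choices for illustration.

```python
import re
from collections import Counter

# Matches the message text seen in the log excerpt above.
RESTORED = re.compile(r'restored log file "([0-9A-F]{24})" from archive')


def repeated_restores(log_lines, threshold=3):
    """Return {segment_name: count} for segments restored at least `threshold` times."""
    counts = Counter()
    for line in log_lines:
        match = RESTORED.search(line)
        if match:
            counts[match.group(1)] += 1
    return {name: n for name, n in counts.items() if n >= threshold}


sample = [
    '2024-06-23 06:54:57 CDT []LOG:  restored log file "0000000100013D94000000FF" from archive',
    '2024-06-23 06:55:02 CDT []LOG:  restored log file "0000000100013D94000000FF" from archive',
    '2024-06-23 06:55:07 CDT []LOG:  restored log file "0000000100013D94000000FF" from archive',
]
print(repeated_restores(sample))
```

A segment that is normally restored once but shows up many times in a row suggests the standby keeps fetching it and failing to get past it.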

 

 

There are no missing WALs:

 

ls -ltr 0000000100013D95000000* |more

-rw------- 1 postgres postgres 16777216 Jun 14 19:39 0000000100013D9500000000

-rw------- 1 postgres postgres 16777216 Jun 14 19:39 0000000100013D9500000001

-rw------- 1 postgres postgres 16777216 Jun 14 19:39 0000000100013D9500000002

-rw------- 1 postgres postgres 16777216 Jun 14 19:39 0000000100013D9500000003

-rw------- 1 postgres postgres 16777216 Jun 14 19:40 0000000100013D9500000004

-rw------- 1 postgres postgres 16777216 Jun 14 19:40 0000000100013D9500000005

-rw------- 1 postgres postgres 16777216 Jun 14 19:40 0000000100013D9500000006

-rw------- 1 postgres postgres 16777216 Jun 14 19:40 0000000100013D9500000007

-rw------- 1 postgres postgres 16777216 Jun 14 19:40 0000000100013D9500000008

-rw------- 1 postgres postgres 16777216 Jun 14 19:40 0000000100013D9500000009

-rw------- 1 postgres postgres 16777216 Jun 14 19:41 0000000100013D950000000A

-rw------- 1 postgres postgres 16777216 Jun 14 19:41 0000000100013D950000000B
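
The "no missing WALs" check can also be done from the file names alone rather than by eye. The sketch below assumes 16 MB segments (as the file sizes above suggest) and a single timeline; the helper names are hypothetical.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default segment size
SEGS_PER_LOG = 0x100000000 // WAL_SEGMENT_SIZE


def segment_number(name):
    """Map a 24-hex-digit WAL file name to its running segment number."""
    return int(name[8:16], 16) * SEGS_PER_LOG + int(name[16:24], 16)


def segment_name(timeline, segno):
    """Inverse of segment_number() for a given timeline."""
    return f"{timeline:08X}{segno // SEGS_PER_LOG:08X}{segno % SEGS_PER_LOG:08X}"


def missing_segments(names):
    """Return the names missing between the lowest and highest segment present.
    Assumes every name belongs to the same timeline."""
    if not names:
        return []
    timeline = int(names[0][0:8], 16)
    present = {segment_number(n) for n in names}
    return [
        segment_name(timeline, n)
        for n in range(min(present), max(present) + 1)
        if n not in present
    ]


names = ["0000000100013D94000000FE", "0000000100013D94000000FF"] + [
    f"0000000100013D95{i:08X}" for i in range(0x0C)
]
print(missing_segments(names))  # [] means the sequence is contiguous
```

An empty result only says that the files are present in sequence; it says nothing about whether their contents are valid.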

 

 

 
