Discussion: invalid record length at XX: wanted 24, got

invalid record length at XX: wanted 24, got

From

Mariel Cherkassky

Date:

August 20, 2019, 06:43:52

Hey,

I have 2 db nodes (9.6) configured with streaming replication (+repmgr). Suddenly yesterday my secondary stopped syncing and I saw the following error in the log:

invalid record length at X/YYYYY: wanted 24, got

In addition, since then, the secondary db keeps restoring the same WAL file (it seems stuck restoring it).

I guessed that the WAL file was missing some data or was corrupted, so I tried copying it from the primary, but that didn't help. I also tried to start the secondary in read-write mode, but it failed with the following error:

LOG: invalid primary checkpoint record
LOG: invalid secondary checkpoint record
PANIC: could not locate a valid checkpoint record
LOG: startup process (PID 17096) was terminated by signal 6: Aborted
LOG: aborting startup due to startup process failure
LOG: database system is shut down

My next idea is to use pg_resetxlog to get the secondary to start, and then use pg_rewind to sync it with the master again. The master is working perfectly and there aren't any issues on it. Right now, I'm not interested in taking a base backup and creating the secondary from scratch.

I would be happy to hear if you have any other ideas about why this might have happened and how I can handle it.

Thanks
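
One way to check whether the segment copied from the primary really is identical to the one the secondary keeps replaying is to compare file digests. This is only a minimal sketch: it assumes both copies are reachable as local files, and the function names `file_sha256` and `segments_match` are made up for illustration. A mismatch by itself does not prove corruption, it only shows the two files differ.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large file is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def segments_match(primary_copy, secondary_copy):
    """Return True if the two files are byte-for-byte identical."""
    primary_copy, secondary_copy = Path(primary_copy), Path(secondary_copy)
    # Cheap check first: files of different size cannot be identical.
    if primary_copy.stat().st_size != secondary_copy.stat().st_size:
        return False
    return file_sha256(primary_copy) == file_sha256(secondary_copy)
```

It would be called with the two paths of the same segment, for example the file fetched from the primary and the one in the secondary's WAL directory.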

Re: invalid record length at XX: wanted 24, got

From

Jeff Janes

Date:

August 20, 2019, 14:14:09

On Tue, Aug 20, 2019 at 2:44 AM Mariel Cherkassky <mariel.cherkassky@gmail.com> wrote:

> Hey,
> I have 2 db nodes (9.6) configured with streaming replication (+repmgr). Suddenly yesterday my secondary stopped syncing and I saw the following error in the log:
> invalid record length at X/YYYYY: wanted 24, got

Did it really just end the message with "got"?

> My next idea is to use pg_resetxlog to get the secondary to start, and then use pg_rewind to sync it with the master again. The master is working perfectly and there aren't any issues on it.

Since you don't know what went wrong, I don't think I'd rely on pg_rewind to fix it. Also, while I haven't used pg_rewind, I think it requires the destination to be shut down while it runs. So pg_resetxlog would not be needed, and would likely even be harmful.

> Right now, I'm not interested in taking a base backup and creating the secondary from scratch.

Why not? Too much disk activity? Too much network traffic? If the latter, you could do a low-level backup, using rsync in checksum mode as the file transfer method.

Cheers,

Jeff
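
The rsync suggestion above could look roughly like the following. This is a sketch, not a tested procedure: the helper name `rsync_checksum_command`, the paths, the host name and the exclude list are all assumptions for illustration. The usual low-level backup steps still apply and are not shown here (the secondary stopped, and `pg_start_backup()` / `pg_stop_backup()` called on the primary around the copy). The helper only builds the command line; it does not run anything.

```python
import shlex

# Files left alone on both sides. This list is an assumption for
# illustration; check the PostgreSQL backup documentation for your version.
DEFAULT_EXCLUDES = ("postmaster.pid", "postmaster.opts", "recovery.conf")


def rsync_checksum_command(source_dir, dest, excludes=DEFAULT_EXCLUDES):
    """Build an rsync argv that compares files by checksum, not size/mtime."""
    command = ["rsync", "--archive", "--checksum", "--delete"]
    for pattern in excludes:
        # Excluded files are neither transferred nor deleted on the receiver.
        command.append(f"--exclude={pattern}")
    # Trailing slash: copy the directory's contents, not the directory itself.
    command.append(source_dir.rstrip("/") + "/")
    command.append(dest)
    return command


if __name__ == "__main__":
    # Hypothetical data directory and host name.
    argv = rsync_checksum_command(
        "/var/lib/pgsql/9.6/data", "standby:/var/lib/pgsql/9.6/data/"
    )
    print(" ".join(shlex.quote(arg) for arg in argv))
```

With `--checksum`, rsync reads every file on both sides but only sends the ones whose contents differ, so it trades extra disk reads for less network traffic.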
