[bug fix] pg_rewind creates corrupt WAL files, and the standby cannot catch up the primary
From | Tsunakawa, Takayuki |
---|---|
Subject | [bug fix] pg_rewind creates corrupt WAL files, and the standby cannot catch up the primary |
Date | |
Msg-id | 0A3221C70F24FB45833433255569204D1F8DAAA2@G01JPEXMBYT05 |
Responses |
Re: [bug fix] pg_rewind creates corrupt WAL files, and the standby cannot catch up the primary
Re: [bug fix] pg_rewind creates corrupt WAL files, and the standby cannot catch up the primary |
List | pgsql-hackers |
Hello,

Our customer hit another pg_rewind bug with PG 9.5. The attached patch fixes it.

PROBLEM
========================================

After a long run of successful pg_rewind executions, the synchronized standby could never catch up with the primary, and repeatedly emitted the following message:

    LOG: XX000: could not read from log segment 000000060000028A00000031, offset 16384: No error

CAUSE
========================================

If the primary removes WAL files that pg_rewind is about to fetch, pg_rewind leaves 0-byte WAL files in the target directory here:

    [libpq_fetch.c]
    case FILE_ACTION_COPY:
        /* Truncate the old file out of the way, if any */
        open_target_file(entry->path, true);
        fetch_file_range(entry->path, 0, entry->newsize);
        break;

pg_rewind completes successfully. The user then creates recovery.conf and starts the standby in the target cluster. walreceiver receives WAL records and writes them to the 0-byte WAL files. Finally, the xlog reader complains that it cannot read a WAL page.

FIX
========================================

pg_rewind deletes the target file when it finds that the primary has deleted the corresponding file.

OTHER THOUGHTS
========================================

BTW, should pg_rewind really copy WAL files from the primary? If the sole purpose of pg_rewind is to recover an instance for use as a standby, can pg_rewind just remove all WAL files in the target directory, because the standby can get WAL files from the primary and/or the archive?

Related to this, shouldn't pg_rewind skip more files and directories, as pg_basebackup does? Currently, pg_rewind doesn't copy postmaster.pid, postmaster.opts, and temporary files/directories (pgsql_tmp/).

Regards
Takayuki Tsunakawa
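
The FIX section above can be illustrated with a minimal sketch on local files. This is not the attached patch and not pg_rewind code: `copy_or_remove` is a hypothetical helper, and it uses plain POSIX file calls rather than the libpq fetch path. It only shows the idea that a file which has vanished on the source side should lead to removal of the target file, instead of leaving a truncated 0-byte file behind.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption for illustration, not pg_rewind code).
 * Refreshes dst from src.
 * Returns 1 if the file was copied, 0 if src no longer exists and dst was
 * removed, and -1 on any other error.
 */
static int
copy_or_remove(const char *src, const char *dst)
{
    char    buf[8192];
    ssize_t nread;
    int     srcfd;
    int     dstfd;

    assert(src != NULL && dst != NULL);

    /* Open the source before touching the target. */
    srcfd = open(src, O_RDONLY);
    if (srcfd < 0)
    {
        if (errno != ENOENT)
            return -1;
        /* Source is gone: remove the target rather than truncating it. */
        if (unlink(dst) != 0 && errno != ENOENT)
            return -1;
        return 0;
    }

    /* Source exists: truncate the old target and copy the contents. */
    dstfd = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (dstfd < 0)
    {
        close(srcfd);
        return -1;
    }

    while ((nread = read(srcfd, buf, sizeof(buf))) > 0)
    {
        ssize_t off = 0;

        /* write() may be partial, so loop until the chunk is flushed. */
        while (off < nread)
        {
            ssize_t nw = write(dstfd, buf + off, (size_t) (nread - off));

            if (nw < 0)
            {
                close(srcfd);
                close(dstfd);
                return -1;
            }
            off += nw;
        }
    }

    close(srcfd);
    if (close(dstfd) != 0 || nread < 0)
        return -1;
    return 1;
}
```

Calling `copy_or_remove("missing_segment", "target_segment")` when the first file does not exist removes the second file and returns 0, so no empty file remains for a later writer to append to. Both file names here are placeholders.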