Thread: Avoiding a needless failure of PITR

Avoiding a needless failure of PITR

From
Fujii Masao
Date:
Hi,

PITR always fails when it finds an archived log file with the wrong size.
But I think that we can continue PITR if a .ready file of the same name
exists in XLOGDIR/archive_status, i.e. the complete file might exist in
XLOGDIR.

I want to modify the implementation of PITR a little as follows.

- In PITR, if an archived log file with the wrong size is found,
  we check for .ready in XLOGDIR/archive_status.

- If .ready exists, we try to continue PITR by using the log file in XLOGDIR.
  (A log message about the situation might be needed.)

- Otherwise, we make PITR fail as it does now (raising a fatal error).
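
The three steps above can be sketched as a small decision function. This is only an illustration of the proposed logic, not PostgreSQL code: the function name `choose_segment`, its parameters, and the use of `RuntimeError` for the fatal case are all hypothetical, and the 16 MB value is the default WAL segment size, which a given build may override. The extra size check on the local file is a precaution added here and is not part of the proposal.

```python
import os

# Default WAL segment size; builds may use a different value (assumption).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def choose_segment(restored_path, xlogdir, segment_name,
                   segment_size=WAL_SEGMENT_SIZE):
    """Return the path of the segment to replay, or raise if recovery must stop.

    restored_path: file fetched from the archive (hypothetical parameter)
    xlogdir:       the local XLOGDIR
    segment_name:  name of the WAL segment being restored
    """
    # Normal case: the archived file has the expected size.
    if os.path.getsize(restored_path) == segment_size:
        return restored_path

    # Wrong size: look for <segment>.ready in XLOGDIR/archive_status.
    ready = os.path.join(xlogdir, "archive_status", segment_name + ".ready")
    local = os.path.join(xlogdir, segment_name)

    # Extra precaution (not in the proposal): the local file must itself
    # have the full segment size before it is trusted.
    if (os.path.exists(ready)
            and os.path.isfile(local)
            and os.path.getsize(local) == segment_size):
        return local

    # Otherwise keep today's behaviour: stop recovery.
    raise RuntimeError(
        "archive file %s has wrong size and no usable local copy exists"
        % segment_name
    )
```

A caller would log a message when the returned path is the local copy rather than the restored one, as the second step suggests.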


Is it worth making the patch?

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
TEL (03)5860-5115
FAX (03)5463-5490


Re: Avoiding a needless failure of PITR

From
Tom Lane
Date:
Fujii Masao <fujii.masao@oss.ntt.co.jp> writes:
> PITR always fails when it finds an archived log file with the wrong size.

If the archived file is actually broken like that, it hardly seems
prudent to keep going ...
        regards, tom lane


Re: Avoiding a needless failure of PITR

From
Simon Riggs
Date:
On Thu, 2008-04-24 at 23:25 +0900, Fujii Masao wrote:
> Hi,
> 
> PITR always fails when it finds an archived log file with the wrong size.
> But I think that we can continue PITR if a .ready file of the same name
> exists in XLOGDIR/archive_status, i.e. the complete file might exist in
> XLOGDIR.
> 
> I want to modify the implementation of PITR a little as follows.
> 
> - In PITR, if the archived log file with wrong size is found,
>   we check for .ready in XLOGDIR/archive_status.
> 
> - If .ready exists, we try to continue PITR by using the log file in XLOGDIR.
>   (The log message about the situation might be needed.)
> 
> - Otherwise, we make PITR fail as it is (making fatal error).

If you do get this error *and* you have a good copy of the file
somewhere, then you can copy it to the archive and restart recovery.

If we didn't fail, but checked for a local copy, then it would have
worked automatically in your case, that's true. But it would still fail
in any other case where a truncated file was copied over even though a
good copy is available, such as when two copies of the archived files
are maintained remotely.

We should look upon the FATAL error as an opportunity to intervene and
then restart recovery, rather than a problem itself.
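
The manual intervention described above can be sketched as a small helper that an operator might run before restarting recovery. It is an illustration under stated assumptions, not a PostgreSQL tool: the function name `repair_archived_segment` and its parameters are hypothetical, a candidate is judged "good" by file size alone, and the 16 MB value is the default WAL segment size. The list of candidates can cover both situations mentioned, for example the local XLOGDIR copy and a second remote archive.

```python
import os
import shutil

# Default WAL segment size; builds may use a different value (assumption).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def repair_archived_segment(archive_path, candidates,
                            segment_size=WAL_SEGMENT_SIZE):
    """Replace a truncated archived segment with the first full-size candidate.

    archive_path: location of the bad file in the archive (hypothetical)
    candidates:   paths of possible good copies, in order of preference
    Returns the candidate that was used, or None if none qualified.
    """
    for candidate in candidates:
        if (os.path.isfile(candidate)
                and os.path.getsize(candidate) == segment_size):
            # Copy to a temporary name first, then rename, so the archive
            # never holds a half-written file under the real name.
            tmp_path = archive_path + ".tmp"
            shutil.copyfile(candidate, tmp_path)
            os.replace(tmp_path, archive_path)
            return candidate
    return None
```

If the function returns `None`, no good copy was found and the archive is left untouched; otherwise recovery can be restarted against the repaired archive.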

--  Simon Riggs 2ndQuadrant  http://www.2ndQuadrant.com