Re: [RFC] What should we do for reliable WAL archiving?

From: Jeff Janes
Subject: Re: [RFC] What should we do for reliable WAL archiving?
Date:
Msg-id: CAMkU=1wM5CvcTQB2DXnt1v_NcmT0e=aiWQCZJ7+Zci4gB-HovQ@mail.gmail.com
In reply to: Re: [RFC] What should we do for reliable WAL archiving?  ("MauMau" <maumau307@gmail.com>)
Responses: Re: [RFC] What should we do for reliable WAL archiving?  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On Fri, Mar 21, 2014 at 2:22 PM, MauMau <maumau307@gmail.com> wrote:
From: "Jeff Janes" <jeff.janes@gmail.com>

Do people really just copy the files from one directory of local storage to
another directory of local storage?  I don't see the point of that.

It makes sense to archive WAL to a directory of local storage for media recovery.  Here, the local storage is a different disk drive which is directly attached to the database server or directly connected through SAN.


For a SAN I guess we have different meanings of "local" :)  
(I have no doubt yours is correct--the fine art of IT terminology is not my thing.)


The recommendation is to refuse to overwrite an existing file of the same
name, and exit with failure.  Which essentially brings archiving to a halt,
because it keeps trying but it will keep failing.  If we make a custom
version, one thing it should do is determine if the existing archived file
is just a truncated version of the attempting-to-be archived file, and if
so overwrite it.  Because if the first archival command fails with a
network glitch, it can leave behind a partial file.
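The truncated-prefix check described above might be sketched as an archive_command wrapper like the following. This is a hypothetical illustration, not anything PostgreSQL ships: the `/archive/dir` path is an assumption, and `stat -c` / `cmp -n` are GNU options (BSD systems spell them differently).

```shell
#!/bin/sh
# Hypothetical sketch of the behavior described above: refuse to
# overwrite an existing archive file UNLESS it is a truncated prefix
# of the file being archived (a likely leftover from a failed attempt).
src="$1"                  # invoked with %p (path to the WAL segment)
dst="/archive/dir/$2"     # invoked with %f (file name only); path is assumed

if [ -f "$dst" ]; then
    size=$(stat -c %s "$dst")          # GNU stat; BSD would use stat -f %z
    # Compare only the first $size bytes; if they match, $dst is a
    # truncated prefix of $src and is safe to overwrite.
    if ! cmp -s -n "$size" "$dst" "$src"; then
        echo "won't overwrite $dst: contents differ" >&2
        exit 1
    fi
fi
cp "$src" "$dst"
```

It would be invoked as `archive_command = '/path/to/script %p %f'`; a genuinely different file with the same name still makes the command fail, as the documentation recommends.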

What I'm trying to address is just an alternative to cp/copy which fsyncs a file.  It just overwrites an existing file.
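On GNU/Linux, one way to approximate "a cp that fsyncs" without writing C is dd's `conv=fsync`, which fsyncs the output file before dd exits. A minimal sketch (paths are illustrative; note it still does not fsync the containing directory, which a fully durable command would also need):

```shell
#!/bin/sh
# Hypothetical sketch: a cp replacement that fsyncs the copied file.
# conv=fsync is a GNU dd feature; this overwrites any existing file,
# matching the behavior MauMau describes.
src="$1"   # %p: source WAL segment
dst="$2"   # full destination path in the archive directory
dd if="$src" of="$dst" bs=1M conv=fsync 2>/dev/null
```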

Yes, you're right: if you follow the PG manual, a failed archive attempt leaves behind a partial file which causes all subsequent attempts to fail. That's another undesirable point in the current doc.  To overcome this, someone on this ML recommended that I use "cp %p /archive/dir/%f.tmp && mv /archive/dir/%f.tmp /archive/dir/%f".  Does this solve your problem?

As written it doesn't solve it, since it just unconditionally overwrites the file.  If you wanted that, you could just do the single-statement unconditional overwrite.

You could make it so that the .tmp gets overwritten unconditionally, but the move of it will not overwrite an existing permanent file.  That would solve the problem where a glitch in the network leaves an incomplete file behind that blocks the next attempt, *except* that mv on (at least some) network file systems is really a copy, not an atomic rename, so it is still subject to leaving behind incomplete crud.

But it is hard to tell what the real solution is, because the doc doesn't explain why the command should refuse (and fail) to overwrite an existing file.  The only reason I can think of for that recommendation is that it is easy to accidentally configure two clusters to archive to the same location, and having them overwrite each other's files should be guarded against.  If I am right, it seems like this reason should be added to the docs, so people know what they are defending against.  And if I am wrong, it seems even more important that the (correct) reason be added to the docs.

Cheers,

Jeff
