Re: hot backups: am I doing it wrong, or do we have a problem with pg_clog?

From: Daniel Farina
Subject: Re: hot backups: am I doing it wrong, or do we have a problem with pg_clog?
Date:
Msg-id: BANLkTin03DyWL7MPOK5Kq7m5YX6gnArSgA@mail.gmail.com
In reply to: Re: hot backups: am I doing it wrong, or do we have a problem with pg_clog?  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On Thu, Apr 21, 2011 at 8:19 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Apr 21, 2011 at 7:15 AM, Daniel Farina <daniel@heroku.com> wrote:
>> To start at the end of this story: "DETAIL:  Could not read from file
>> "pg_clog/007D" at offset 65536: Success."
>>
>> This is a message we received on a standby that we were bringing
>> online as part of a test.  The clog file was present, but apparently
>> too small for Postgres (or at least I think this is what the message
>> meant), so one could stub in another clog file and then continue
>> recovery successfully (modulo the voodoo of stubbing in clog files in
>> general).  I am unsure if this is due to an interesting race condition
>> in Postgres or a result of my somewhat-interesting hot-backup
>> protocol, which is slightly more involved than the norm.  I will
>> describe what it does here:
>>
>> 1) Call pg start backup
>> 2) crawl the entire postgres cluster directory structure, except
>> pg_xlog, taking notes of the size of every file present
>> 3) begin writing TAR files, but *only up to the size noted during the
>> original crawling of the cluster directory,* so if the file grows
>> between the original snapshot and subsequently actually calling read()
>> on the file those extra bytes will not be added to the TAR.
>>  3a) If a file is truncated partially, I add "\0" bytes to pad the
>> tarfile member up to the size sampled in step 2, as I am streaming the
>> tar file and cannot go back in the stream and adjust the tarfile
>> member size
>> 4) call pg stop backup
>
> In theory I would expect any defects introduced by the, ahem,
> exciting, procedure described in steps 3 and 3a to be corrected by
> recovery automatically when you start the new cluster.

Neat.  This is mostly what I was looking to get out of this thread, I
will start looking for places where I have botched things.

Although some of the frontend interface and some of the mechanism is
embarrassingly rough for several reasons, the other thread posters can
have access to the code if they wish: the code responsible for these
shenanigans can be found at https://github.com/heroku/wal-e (and
https://github.com/fdr/wal-e), in the tar_partition.py file.
(https://github.com/heroku/WAL-E/blob/master/wal_e/tar_partition.py)
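
For the curious, the capped-size tar-member trick from steps 3 and 3a can be sketched in a few lines of Python.  This is an illustrative stand-in using the stdlib tarfile module, not the actual WAL-E code; the helper name `add_member_capped` is hypothetical:

```python
import io
import os
import tarfile
import tempfile

def add_member_capped(tar, path, arcname, noted_size):
    """Write `path` into `tar` as exactly `noted_size` bytes:
    cap the read if the file grew since its size was noted (step 3),
    and zero-pad if it shrank (step 3a)."""
    with open(path, "rb") as f:
        data = f.read(noted_size)               # never exceed the noted size
    data += b"\0" * (noted_size - len(data))    # pad a truncated file
    info = tarfile.TarInfo(name=arcname)
    info.size = noted_size
    tar.addfile(info, io.BytesIO(data))

# Demo: the file shrinks between the size scan and the actual read.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "relfile")
    with open(path, "wb") as f:
        f.write(b"x" * 100)
    noted = os.path.getsize(path)               # step 2: note the size
    with open(path, "wb") as f:
        f.write(b"x" * 40)                      # concurrent truncation
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        add_member_capped(tar, path, "relfile", noted)
    buf.seek(0)
    with tarfile.open(fileobj=buf) as tar:
        assert tar.getmember("relfile").size == 100  # padded to noted size
```

The real code additionally has to stream the archive without seeking backward, which is exactly why the member size must be fixed up front rather than adjusted after the fact.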

But I realize that's really too much detail for most people to be
interested in, which is why I didn't post it in the first place.  I
think given your assessment I have enough to try to reproduce this
case synthetically (I think taking a very old pg_clog snapshot,
committing a few million xacts while not vacuuming, and then trying to
merge the old clog into an otherwise newer base backup may prove out
mechanism I have in mind) or add some more robust logging so I can
catch my (or any, really) problem.

--
fdr

