Re: [HACKERS] Support for pg_receivexlog --format=plain|tar

From Magnus Hagander
Subject Re: [HACKERS] Support for pg_receivexlog --format=plain|tar
Date
Msg-id CABUevEzEvuG6FgSPm0r8smQVW1rSo1HssgSnkeALAhEv8BCL7w@mail.gmail.com
In reply to [HACKERS] Support for pg_receivexlog --format=plain|tar  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: [HACKERS] Support for pg_receivexlog --format=plain|tar  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers


On Tue, Dec 27, 2016 at 2:23 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
Hi all,

Since 56c7d8d4, pg_basebackup supports tar format when streaming WAL
records. This has been done by introducing a new transparent routine
layer (walmethods.c) to control the method used to fetch WAL: plain or
tar.

pg_receivexlog does not make use of that yet, but I think that it
could, to allow retention of more WAL history within the same amount of
disk space. OK, disk space is cheap, but for some users things like
that matter when defining a time-based retention policy, especially when
things are automated around Postgres. I really think that
pg_receivexlog should be able to support an option like
--format=plain|tar. "plain" is the default, and matches the current
behavior. This option is of course designed to match pg_basebackup's
one.

So, here is in detail what would happen with --format=tar:
- When streaming begins, write changes to a tar stream, named
segnum.tar.partial as long as the segment is not completed.
- Once the segment completes, rename it to segnum.tar.
- Each individual segment has its own tarball.
- If pg_receivexlog fails to receive changes in the middle of a
segment, it begins streaming back at the beginning of a segment,
considering that the current .partial segment is corrupted. So if
the server comes back online, empty the current .partial file and begin
writing on it again. (I have found a bug on HEAD in this area
actually).
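
The naming and rename steps in the list above can be sketched roughly as follows. This is only an illustration in Python, not the walmethods.c code: the function name `write_segment_tar` and the `complete` flag are hypothetical, and the whole segment is passed in at once for brevity, whereas a real receiver would append as data arrives.

```python
import io
import os
import tarfile

# Suffixes follow the proposal: segnum.tar.partial, then segnum.tar.
PARTIAL_SUFFIX = ".tar.partial"
DONE_SUFFIX = ".tar"


def write_segment_tar(directory, segname, data, complete):
    """Write one WAL segment into its own tarball (hypothetical helper).

    Opening with mode "w" truncates a leftover .partial file, which
    mirrors the idea of restarting a half-received segment from scratch.
    """
    partial_path = os.path.join(directory, segname + PARTIAL_SUFFIX)
    with tarfile.open(partial_path, "w") as tar:
        info = tarfile.TarInfo(name=segname)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    if not complete:
        return partial_path
    # The segment is complete: drop the .partial suffix.
    final_path = os.path.join(directory, segname + DONE_SUFFIX)
    os.replace(partial_path, final_path)
    return final_path
```

Calling it once with `complete=False` and again with `complete=True` for the same segment name leaves a single `segnum.tar` behind, with the earlier partial content discarded.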

Magnus, you have also mentioned to me that you had a couple of ideas
on the matter, feel free to jump in and let's mix our thoughts!

Yeah, I've been wondering what the actual use case is here :)

Though I was considering the case where all segments are streamed into the same tarfile (and then some sort of configurable limit where we'd switch tarfile after <n> segments, which rapidly started to feel too complicated).

What's the actual advantage of having it wrapped inside a single tarfile?

 
There are a couple of things that I have been considering as well for
pg_receivexlog. Though they are not directly tied to this thread,
here they are so that I don't forget about them:
- Removal of oldest WAL segments on a partition. When writing WAL
segments to a dedicated partition, we could have an option that
automatically removes the oldest WAL segment if the partition is full.
This triggers once a segment is completed.
- Compression of fully-written segments. When a segment is finished
being written, pg_receivexlog could compress it further with gz for
example. With --format=t this leads to segnum.tar.gz being generated.
The advantage of doing those two things in pg_receivexlog is
monitoring. One process to handle them all, and there is no need for
cron jobs to handle any cleanup or compression.
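
As an illustration of the first item (removing the oldest segments when space runs out), here is a minimal Python sketch. Everything in it is an assumption made for the example: the helper names, the `min_free_bytes` threshold, and the idea that sorting the 24-hex-digit file names gives a usable oldest-first order (names sort by timeline first, so this only holds cleanly within one timeline).

```python
import os
import re
import shutil

# Completed plain-format segments: 24 uppercase hex digits, no suffix.
SEGMENT_RE = re.compile(r"^[0-9A-F]{24}$")


def completed_segments(directory):
    """Return completed segment names, oldest first (by name order)."""
    return sorted(n for n in os.listdir(directory) if SEGMENT_RE.match(n))


def remove_oldest_until(directory, min_free_bytes, free_bytes=None):
    """Delete oldest segments until enough space is free (hypothetical).

    free_bytes is an injectable callable so the policy can be tested
    without filling a real partition; by default it asks the OS.
    """
    if free_bytes is None:
        free_bytes = lambda: shutil.disk_usage(directory).free
    removed = []
    for name in completed_segments(directory):
        if free_bytes() >= min_free_bytes:
            break
        os.remove(os.path.join(directory, name))
        removed.append(name)
    return removed
```

Such a check would run once a segment is completed, as described above; .partial files are never candidates for removal because they do not match the name pattern.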

I was at one point thinking that would be a good idea as well, but recently I've been thinking more that what we should do is implement a "--post-segment-command", which would act similarly to archive_command but be started by pg_receivexlog. This could handle things like compression, and also integration with external backup tools like backrest or barman in a cleaner way. We could also spawn this without waiting for it to finish immediately, which would allow parallelization of the process. When doing the compression inline that rapidly becomes the bottleneck. Unlike a basebackup you're only dealing with the need to buffer 16MB on disk before compressing it, so it should be fairly cheap.
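
To make the idea a bit more concrete, here is a rough Python sketch of spawning such a command without waiting for it. Nothing here exists in pg_receivexlog; the option is only a proposal, and the function names and the reuse of archive_command's %p (path) and %f (file name) placeholders are assumptions made for the example.

```python
import re
import subprocess


def expand_template(template, path, filename):
    """Replace %p, %f and %% in one pass; other text is left untouched."""
    values = {"%p": path, "%f": filename, "%%": "%"}
    return re.sub(r"%[pf%]", lambda m: values[m.group(0)], template)


def spawn_post_segment_command(template, path, filename):
    """Start the command through the shell and return immediately."""
    return subprocess.Popen(expand_template(template, path, filename), shell=True)


def reap_finished(running):
    """Split children into (still_running, exit_codes_of_finished)."""
    still_running, exit_codes = [], []
    for proc in running:
        rc = proc.poll()  # None while the child is still running
        if rc is None:
            still_running.append(proc)
        else:
            exit_codes.append(rc)
    return still_running, exit_codes
```

The receiver would call `spawn_post_segment_command` after each completed segment and `reap_finished` from time to time, so several compressions can run in parallel while streaming continues.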

Another thing I've been considering in the same area would be to add the ability to write the segments to a pipe instead of a directory. Then you could just pipe it into gzip without the need to buffer on disk. This would kill the ability to know at which point we'd sync()ed to disk, but in most cases so will doing direct gzip. It just means we couldn't support this in sync mode.
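
A minimal sketch of the pipe variant, again in Python and with a hypothetical function name: the received chunks go straight to the standard input of an external program, and only that program's output lands on disk. As noted above, the writer has no idea when the data has actually been synced.

```python
import subprocess


def stream_to_command(argv, chunks, out_path):
    """Feed chunks to a child's stdin; the child's stdout goes to out_path.

    Returns the child's exit code. The caller cannot tell when the
    output was flushed to disk, which is why sync mode would not work.
    """
    with open(out_path, "wb") as out:
        proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=out)
        for chunk in chunks:
            proc.stdin.write(chunk)
        proc.stdin.close()  # signal end of input to the child
        return proc.wait()
```

With a gzip binary on the PATH, something like `stream_to_command(["gzip", "-c"], chunks, "segment.gz")` would compress the stream without an uncompressed copy ever being written.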

I can see the point of being able to compress the individual segments directly in pg_receivexlog in smaller systems though, without the need to rely on an external compression program as well. But in that case, is there any reason we need to wrap it in a tarfile, and can't just write it to <segment>.gz natively?
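
For comparison, writing <segment>.gz natively needs no tar wrapper at all. The Python sketch below shows the shape of it, assuming a .gz.partial name while the segment is incomplete; that suffix and the function name are made up for the example.

```python
import gzip
import os


def write_segment_gz(directory, segname, chunks):
    """Compress a segment inline and rename it once it is complete."""
    partial_path = os.path.join(directory, segname + ".gz.partial")
    final_path = os.path.join(directory, segname + ".gz")
    # "wb" truncates a leftover partial file from an earlier attempt.
    with gzip.open(partial_path, "wb") as out:
        for chunk in chunks:
            out.write(chunk)
    os.replace(partial_path, final_path)
    return final_path
```

The result can be read back with plain `gunzip`, which is the main practical difference from a per-segment tarball.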


--
