Re: [HACKERS] Support for pg_receivexlog --format=plain|tar

From: Magnus Hagander
Subject: Re: [HACKERS] Support for pg_receivexlog --format=plain|tar
Date:
Msg-id: CABUevEwgrysESOSsJBy+wGAyoSxhyEn6mMFwXbfQv+=URMDd5g@mail.gmail.com
In reply to: Re: [HACKERS] Support for pg_receivexlog --format=plain|tar  (Michael Paquier <michael.paquier@gmail.com>)
Responses: Re: [HACKERS] Support for pg_receivexlog --format=plain|tar  (Michael Paquier <michael.paquier@gmail.com>)
List: pgsql-hackers
On Tue, Dec 27, 2016 at 1:16 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Tue, Dec 27, 2016 at 6:34 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Tue, Dec 27, 2016 at 2:23 AM, Michael Paquier <michael.paquier@gmail.com>
> wrote:
>> Magnus, you have mentioned me as well that you had a couple of ideas
>> on the matter, feel free to jump in and let's mix our thoughts!
>
>
> Yeah, I've been wondering what the actual usecase is here :)

There is value in compressing segments that end with trailing zeros,
even if they are not the most common kind of segment in the WAL
archive.


Agreed on that part -- but that's the value of compression, not necessarily of the tar format itself.

Is there any value of the TAR format *without* compression in your scenario?

 
> Though I was considering the case where all segments are streamed into the
> same tarfile (and then some sort of configurable limit where we'd switch
> tarfile after <n> segments, which rapidly started to feel too complicated).
>
> What's the actual advantage of having it wrapped inside a single tarfile?

I am advocating for one tar file per segment, to be honest. Grouping
them makes failure handling more complicated when the connection to
the server is killed or the replication stream is cut. Well, not
that complicated, actually: you would need to drop a status file into
the segment folder with enough information to let pg_receivexlog know
where in the tar file it needs to continue writing. If a new tarball
is created for each segment, deciding where to stream from after a
connection failure is just a matter of doing what is done today:
looking at the completed segments and beginning streaming from the
incomplete or absent one.

This pretty much matches the conclusion I came to myself as well. We could create a new tarfile for each restart of pg_receivexlog, but then the file layout becomes unpredictable.
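The resume logic described above (look at the completed segments, restart from the incomplete or absent one) can be sketched roughly as follows. This is a hedged illustration, not pg_receivexlog's actual C code; the 24-hex-character segment names and the `.partial` suffix mirror what pg_receivexlog writes on disk, but the function itself is hypothetical:

```python
import os
import re

SEG_SIZE = 16 * 1024 * 1024  # standard 16MB WAL segment

def find_resume_point(wal_dir):
    """Return the name of the first incomplete or .partial segment to
    resume streaming from, or None if every segment on disk is complete
    (in which case streaming starts after the newest one)."""
    pat = re.compile(r'^[0-9A-F]{24}(\.partial)?$')
    for name in sorted(os.listdir(wal_dir)):
        if not pat.match(name):
            continue  # ignore status files, tarballs, etc.
        path = os.path.join(wal_dir, name)
        if name.endswith('.partial') or os.path.getsize(path) != SEG_SIZE:
            return name  # incomplete: resume inside this segment
    return None
```

With one tarball per segment the same scan works unchanged on `<segment>.tar` files; with one big tarball you would instead need the extra status file to record the resume offset.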

 
>> There are a couple of things that I have been considering as well for
>> pg_receivexlog. Though they are not directly tied to this thread,
>> here they are as I don't forget about them:
>> - Removal of oldest WAL segments on a partition. When writing WAL
>> segments to a dedicated partition, we could have an option that
>> automatically removes the oldest WAL segment if the partition is full.
>> This triggers once a segment is completed.
>> - Compression of fully-written segments. When a segment is finished
>> being written, pg_receivexlog could compress them further with gz for
>> example. With --format=t this leads to segnum.tar.gz being generated.
>> The advantage of doing those two things in pg_receivexlog is
>> monitoring. One process to handle them all, and there is no need of
>> cron jobs to handle any cleanup or compression.
>
> I was at one point thinking that would be a good idea as well, but recently
> I've more been thinking that what we should do is implement a
> "--post-segment-command", which would act similar to archive_command but
> started by pg_receivexlog. This could handle things like compression, and
> also integration with external backup tools like backrest or barman in a
> cleaner way. We could also spawn this without waiting for it to finish
> immediately, which would allow parallelization of the process. When doing
> the compression inline that rapidly becomes the bottleneck. Unlike a
> basebackup you're only dealing with the need to buffer 16Mb on disk before
> compressing it, so it should be fairly cheap.

I did not consider the case of barman and backrest, to be honest;
having the views of the 2ndQuadrant folks and David would be nice
here. Still, the main idea behind having pg_receivexlog's own process
handle those things is to avoid spawning a new process. I have a class
of users who care about things that could hang; they play a lot with
network-mounted disks... And VMs, of course.

I have been talking to David about it a couple of times, and he agreed that it'd be useful to have a post-segment command. We haven't discussed it in much detail though. I'll add him to direct-cc here to see if he has any further input :)

It could be that the best idea is to just notify some other process of what's happening. But making it an external command would give that a lot of flexibility. Of course, we need to be careful not to put ourselves back in the position we are in with archive_command, in that it's very difficult to write a good one.
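The "spawn without waiting" behaviour proposed for a post-segment command could look something like the sketch below. The `--post-segment-command` option is only a proposal in this thread, and the `%p` placeholder is an assumption borrowed by analogy with archive_command, not an existing pg_receivexlog feature:

```python
import shlex
import subprocess

def run_post_segment_command(template, segment_path):
    """Spawn a user-supplied command for a finished segment without
    waiting for it to finish, so compression or backup-tool integration
    runs in parallel with streaming the next segment.

    '%p' in the template is replaced by the segment path (hypothetical
    placeholder syntax, assumed here by analogy with archive_command).
    """
    cmd = [arg.replace('%p', segment_path) for arg in shlex.split(template)]
    return subprocess.Popen(cmd)  # caller reaps the child later
```

Because the parent does not wait, several compressions can overlap; the flip side, as with archive_command, is that failures have to be noticed and retried somehow, which is exactly the "hard to write a good one" trap mentioned above.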

I'm sure everybody cares about things that could hang. But everything can hang...

 
> Another thing I've been considering in the same area would be to add the
> ability to write the segments to a pipe instead of a directory. Then you
> could just pipe it into gzip without the need to buffer on disk. This would
> kill the ability to know at which point we'd sync()ed to disk, but in most
> cases so will doing direct gzip. Just means we couldn't support this in sync
> mode.

Users piping their data don't care about reliability anyway. So that
is not a problem.

Good point. Same would be true about people who gzip it, wouldn't it?

 
> I can see the point of being able to compress the individual segments
> directly in pg_receivexlog in smaller systems though, without the need to
> rely on an external compression program as well. But in that case, is there
> any reason we need to wrap it in a tarfile, and can't just write it to
> <segment>.gz natively?

You mean having a --compress=0|9 option that creates individual gz
files for each segment? Definitely we could just do that. It would be

Yes, that's what I meant.

 
a shame, though, not to use the WAL methods you have introduced in
src/bin/pg_basebackup, which cover the whole set: tar and tar.gz. A
quick hack in pg_receivexlog showed me that segments are saved in
a single tarball, which is not cool. My feeling is that using the
existing infrastructure, but making it pluggable for individual files
(in short, I think what is needed here is a way to tell the WAL
method to switch to a new file when a segment completes), would be
the simplest approach in terms of code lines and maintenance.

Much as I'd like to reuse that code, I don't think reuse in itself should be the driver for how this is decided. The end product should be.

To me it seems silly to create a directory full of tarfiles with a single file in each. I don't particularly care about the fact that we added 512 bytes of wasted space to each, but we just created something that's unnecessarily complicated for people to handle, didn't we? A plain directory of .gz files is a lot easier to work with.
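The "plain directory of .gz files" approach amounts to something like this per-segment step, here only as a hedged sketch of the behaviour a hypothetical `--compress=N` option might have, using Python's gzip rather than whatever pg_receivexlog would actually link against:

```python
import gzip
import os
import shutil

def compress_segment(path, level=9):
    """Compress a completed WAL segment to <segment>.gz and remove the
    original, so the archive directory ends up as plain .gz files
    rather than per-segment tarballs."""
    gz_path = path + '.gz'
    with open(path, 'rb') as src, \
         gzip.open(gz_path, 'wb', compresslevel=level) as dst:
        shutil.copyfileobj(src, dst)
    os.remove(path)
    return gz_path
```

A restore script then only needs `gunzip` on each file, with no tar layer to unwrap; and a mostly-zero trailing segment, the case raised at the top of the thread, compresses to almost nothing.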

//Magnus
