Re: Refactoring of compression options in pg_basebackup
От | Michael Paquier |
---|---|
Тема | Re: Refactoring of compression options in pg_basebackup |
Дата | |
Msg-id | YdZ4R7sNek759eZX@paquier.xyz обсуждение исходный текст |
Ответ на | Re: Refactoring of compression options in pg_basebackup (Robert Haas <robertmhaas@gmail.com>) |
Ответы |
Re: Refactoring of compression options in pg_basebackup
(Robert Haas <robertmhaas@gmail.com>)
|
Список | pgsql-hackers |
On Wed, Jan 05, 2022 at 10:33:38AM -0500, Robert Haas wrote: > I think we're going to want to offer both options. We can't know > whether the user prefers to consume CPU cycles on the server or on the > client. Compressing on the server has the advantage of potentially > saving transfer bandwidth, but the server is also often the busiest > part of the whole system, and users are often keen to offload as much > work as possible. Yeah. There are cases for both. I just got to wonder whether it makes sense to allow both server-side and client-side compression to be used at the same time. That would be a rather strange case, but well, with the correct set of options that could be possible. > Given that, I'd like us to be thinking about what the full set of > options looks like once we have (1) compression on either the server > or the client and (2) multiple compression algorithms and (3) multiple > compression levels. Personally, I don't really like the decision made > by this proposed patch. In this patch's view of the world, -Z is a way > of providing the compression level for whatever compression algorithm > you happen to have selected, but I think of -Z as being the upper-case > version of -z which I think of as selecting specifically gzip. It's > not particularly intuitive to me that in a command like pg_basebackup > --compress=<something>, <something> is a compression level rather than > an algorithm. So what I would propose is probably something like: > > pg_basebackup --compress=ALGORITHM [--compression-level=NUMBER] > pg_basebackup --server-compress=ALGORITHM [--compression-level=NUMBER] > > And then make -z short for --compress=gzip and -Z <n> short for > --compress=gzip --compression-level=<n>. That would be a > backward-incompatible change to the definition of --compress, but as > long as -Z <n> works the same as today, I don't think many people will > notice. If we like, we can notice if the argument to --compress is an > integer and suggest using either -Z or --compress=gzip > --compression-level=<n> instead. My view of things is slightly different, aka I'd rather keep --compress to mean a compression level with an integer option, but introduce a --compression-method={lz4,gzip,none}, with -Z being a synonym of --compression-method=gzip. That's at least the path we chose for pg_receivewal. I don't mind sticking with one way or another, as what you are proposing is basically the same thing I have in mind, but both tools ought to use the same set of options. Hmm. Perhaps at the end the problem is with --compress, where we don't know if it means a compression level or a compression method? For me, --compress means the former, and for you the latter. So a third way of seeing things is to drop completely --compress, but have one --compression-method and one --compression-level. That would bring a clear split. Or just one --compression-method for the client-side compression as you are proposing for the server-side compression, however I'd like to think that a split between the method and level is more intuitive. > In the proposed patch, you end up with pg_basebackup > --compression-method=lz4 -Z2 meaning compression with lz4 level 2. I > find that quite odd, though as with all such things, opinions may > vary. In my proposal, that would be an error, because it would be > equivalent to --compress=lz4 --compress=gzip --compression-level=2, > and would thus involve conflicting compression method specifications. It seems to me that you did not read the patch closely enough. The attached patch does not add support for LZ4 in pg_basebackup on the client-side yet. Once it gets added, though, the idea is that using --compress with LZ4 would result in an error. That's what happens with pg_receivewal on HEAD, for one. The patch just shapes things to plug LZ4 more easily in the existing code of pg_basebackup.c, and walmethods.c. So.. As of now, it is actually possible to cut the pie in three parts. There are no real objections to the cleanup of walmethods.c and the addition of some conditional TAP tests with pg_basebackup and client-side compression, as far as I can see, only to the option renaming part. Attached are two patches, then. 0001 is the cleanup of walmethods.c to rely the compression method, with more tests (tests that could be moved into their own patch, as well). 0002 is the addition of the options I suggested upthread, but we may change that depending on what gets used for the server-side compression for consistency so I am not suggesting to merge that until we agree on the full picture. The point of this thread was mostly about 0001, so I am fine to discard 0002. Thoughts? -- Michael
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Amit KapilaДата:
Сообщение: Re: Skipping logical replication transactions on subscriber side
Следующее
От: Amit LangoteДата:
Сообщение: Re: Delay the variable initialization in get_rel_sync_entry