Re: refactoring basebackup.c

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: refactoring basebackup.c
Дата
Msg-id CA+Tgmobm+k0VSeyCa9_TbB9FJfZM6evqfeB7_qC+fMrvUuVvCw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: refactoring basebackup.c  (Dipesh Pandit <dipesh.pandit@gmail.com>)
Список pgsql-hackers
On Wed, Jan 19, 2022 at 7:16 AM Dipesh Pandit <dipesh.pandit@gmail.com> wrote:
> I have added support for decompressing a gzip compressed tar file
> at client. pg_basebackup can enable server side compression for
> plain format backup with this change.
>
> Added a gzip extractor which decompresses the compressed archive
> and forwards it to the next streamer. I have done initial testing and
> working on updating the test coverage.

Cool. It's going to need some documentation changes, too.

I don't like the way you coded this in CreateBackupStreamer(). I would
like the decision about whether to use
bbstreamer_gzip_extractor_new(), and/or throw an error about not being
able to parse an archive, to based on the file type i.e. "did we get a
.tar.gz file?" rather than on whether we asked for server-side
compression. Notice that the existing logic checks whether we actually
got a .tar file from the server rather than assuming that's what must
have happened.

As a matter of style, I don't think it's good for the only thing
inside of an "if" statement to be another "if" statement. The two
could be merged, but we also don't want to have the "if" conditional
be too complex. I am imagining that this should end up saying
something like if (must_parse_archive && !is_tar && !is_tar_gz) {
pg_log_error(...

+     * "windowBits" must be greater than or equal to "windowBits" value
+     * provided to deflateInit2 while compressing.

It would be nice to clarify why we know the value we're using is safe.
Maybe we're using the maximum possible value, in which case you could
just add that to the end of the comment: "...so we use the maximum
possible value for safety."

+    /*
+     * End of the stream, if there is some pending data in output buffers then
+     * we must forward it to next streamer.
+     */
+    if (res == Z_STREAM_END) {
+        bbstreamer_content(mystreamer->base.bbs_next, member,
mystreamer->base.bbs_buffer.data,
+                mystreamer->bytes_written, context);
+    }

Uncuddle the brace.

It probably doesn't make much difference, but I would be inclined to
do the final flush in bbstreamer_gzip_extractor_finalize() rather than
here. That way we rely on our own notion of when there's no more input
data rather than zlib's notion. Probably terrible things are going to
happen if those two ideas don't match up .... but there might be some
other compression algorithm that doesn't return a distinguishing code
at end-of-stream. Such an algorithm would have to take care of any
leftover data in the finalize function, so I think we should do that
here too, so the code can be similar in all cases.

Perhaps we should move all the gzip stuff to a new file bbstreamer_gzip.c.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: Refactoring of compression options in pg_basebackup
Следующее
От: John Naylor
Дата:
Сообщение: Re: A qsort template