Обсуждение: [PATCH] Fix minor issues in astreamer_zstd.c

Поиск
Список
Период
Сортировка

[PATCH] Fix minor issues in astreamer_zstd.c

От
"zengman"
Дата:
Hi all,

While reviewing the zstd code, I noticed two minor issues in astreamer_zstd.c that appear to be errors:

First, the file header comment for astreamer_zstd_compressor incorrectly references "lz4 compression" instead of "zstd
compression"— this looks like a copy-paste error from astreamer_lz4.c. 
 
Second, in the astreamer_zstd_decompressor_finalize function, it seems there may be an error in parameter passing: the
functionuses the full buffer capacity (bbs_buffer.maxlen) when calling astreamer_content, rather than the actual length
ofdecompressed data (zstd_outBuf.pos).
 

I’ve attached two patches to address these potential issues:
0001: Fixes the incorrect comment in the astreamer_zstd.c header
0002: Corrects the parameter passed to astreamer_content in astreamer_zstd_decompressor_finalize 

--
Regards,
Man Zeng
www.openhalo.org
Вложения

Re: [PATCH] Fix minor issues in astreamer_zstd.c

От
Japin Li
Дата:
On Sat, 10 Jan 2026 at 12:59, "zengman" <zengman@halodbtech.com> wrote:
> Hi all,
>
> While reviewing the zstd code, I noticed two minor issues in astreamer_zstd.c that appear to be errors:
>
> First, the file header comment for astreamer_zstd_compressor incorrectly references "lz4 compression" instead of
"zstdcompression" — this looks like a copy-paste error from astreamer_lz4.c.  
> Second, in the astreamer_zstd_decompressor_finalize function, it seems
> there may be an error in parameter passing: the function uses the full
> buffer capacity (bbs_buffer.maxlen) when calling astreamer_content,
> rather than the actual length of decompressed data (zstd_outBuf.pos).
>
> I’ve attached two patches to address these potential issues:
> 0001: Fixes the incorrect comment in the astreamer_zstd.c header

+1

> 0002: Corrects the parameter passed to astreamer_content in astreamer_zstd_decompressor_finalize

Not sure about this — LZ4 and GZIP do the same thing.

Based on the code coverage reports in [0] and [1], it appears that this case
is already covered by existing tests for both LZ4 and GZIP.

[0] https://coverage.postgresql.org/src/fe_utils/astreamer_lz4.c.gcov.html
[1] https://coverage.postgresql.org/src/fe_utils/astreamer_gzip.c.gcov.html

--
Regards,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.



Re: [PATCH] Fix minor issues in astreamer_zstd.c

От
Japin Li
Дата:
On Sat, 10 Jan 2026 at 15:03, Japin Li <japinli@hotmail.com> wrote:
> On Sat, 10 Jan 2026 at 12:59, "zengman" <zengman@halodbtech.com> wrote:
>> Hi all,
>>
>> While reviewing the zstd code, I noticed two minor issues in astreamer_zstd.c that appear to be errors:
>>
>> First, the file header comment for astreamer_zstd_compressor incorrectly references "lz4 compression" instead of
"zstdcompression" — this looks like a copy-paste error from astreamer_lz4.c.  
>> Second, in the astreamer_zstd_decompressor_finalize function, it seems
>> there may be an error in parameter passing: the function uses the full
>> buffer capacity (bbs_buffer.maxlen) when calling astreamer_content,
>> rather than the actual length of decompressed data (zstd_outBuf.pos).
>>
>> I’ve attached two patches to address these potential issues:
>> 0001: Fixes the incorrect comment in the astreamer_zstd.c header
>
> +1
>
>> 0002: Corrects the parameter passed to astreamer_content in astreamer_zstd_decompressor_finalize
>
> Not sure about this — LZ4 and GZIP do the same thing.
>
> Based on the code coverage reports in [0] and [1], it appears that this case
> is already covered by existing tests for both LZ4 and GZIP.
>
> [0] https://coverage.postgresql.org/src/fe_utils/astreamer_lz4.c.gcov.html
> [1] https://coverage.postgresql.org/src/fe_utils/astreamer_gzip.c.gcov.html
>

After re-reading the LZ4 and GZIP code, I agree — the parameter is indeed wrong.
It looks like decompressor_finalize has the same issue in both implementations.
Would you mind taking a look?

diff --git a/src/fe_utils/astreamer_gzip.c b/src/fe_utils/astreamer_gzip.c
index e8d62f754ca..4ccec1a5a7f 100644
--- a/src/fe_utils/astreamer_gzip.c
+++ b/src/fe_utils/astreamer_gzip.c
@@ -349,7 +349,7 @@ astreamer_gzip_decompressor_finalize(astreamer *streamer)
      */
     astreamer_content(mystreamer->base.bbs_next, NULL,
                       mystreamer->base.bbs_buffer.data,
-                      mystreamer->base.bbs_buffer.maxlen,
+                      mystreamer->bytes_written,
                       ASTREAMER_UNKNOWN);

     astreamer_finalize(mystreamer->base.bbs_next);
diff --git a/src/fe_utils/astreamer_lz4.c b/src/fe_utils/astreamer_lz4.c
index bcee7e088de..570e5305874 100644
--- a/src/fe_utils/astreamer_lz4.c
+++ b/src/fe_utils/astreamer_lz4.c
@@ -401,7 +401,7 @@ astreamer_lz4_decompressor_finalize(astreamer *streamer)
      */
     astreamer_content(mystreamer->base.bbs_next, NULL,
                       mystreamer->base.bbs_buffer.data,
-                      mystreamer->base.bbs_buffer.maxlen,
+                      mystreamer->bytes_written,
                       ASTREAMER_UNKNOWN);

     astreamer_finalize(mystreamer->base.bbs_next);

--
Regards,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.



Re: [PATCH] Fix minor issues in astreamer_zstd.c

От
"zengman"
Дата:
> After re-reading the LZ4 and GZIP code, I agree — the parameter is indeed wrong.
> It looks like decompressor_finalize has the same issue in both implementations.
> Would you mind taking a look?
> 
> diff --git a/src/fe_utils/astreamer_gzip.c b/src/fe_utils/astreamer_gzip.c
> index e8d62f754ca..4ccec1a5a7f 100644
> --- a/src/fe_utils/astreamer_gzip.c
> +++ b/src/fe_utils/astreamer_gzip.c
> @@ -349,7 +349,7 @@ astreamer_gzip_decompressor_finalize(astreamer *streamer)
> */
> astreamer_content(mystreamer->base.bbs_next, NULL,
>   mystreamer->base.bbs_buffer.data,
> -   mystreamer->base.bbs_buffer.maxlen,
> +   mystreamer->bytes_written,
>   ASTREAMER_UNKNOWN);
> 
> astreamer_finalize(mystreamer->base.bbs_next);
> diff --git a/src/fe_utils/astreamer_lz4.c b/src/fe_utils/astreamer_lz4.c
> index bcee7e088de..570e5305874 100644
> --- a/src/fe_utils/astreamer_lz4.c
> +++ b/src/fe_utils/astreamer_lz4.c
> @@ -401,7 +401,7 @@ astreamer_lz4_decompressor_finalize(astreamer *streamer)
> */
> astreamer_content(mystreamer->base.bbs_next, NULL,
>   mystreamer->base.bbs_buffer.data,
> -   mystreamer->base.bbs_buffer.maxlen,
> +   mystreamer->bytes_written,
>   ASTREAMER_UNKNOWN);
> 
> astreamer_finalize(mystreamer->base.bbs_next);

Hi,

Thank you for your review. I've moved it to 0003, and the test passed without any issues.
However, I'm wondering why no one else reported this. Perhaps this part isn't actually worth changing.
I think we can invite Mr. Robert Haas to help me confirm this, so I've chosen to CC him.

--
Regards,
Man Zeng
www.openhalo.org
Вложения

Re: [PATCH] Fix minor issues in astreamer_zstd.c

От
Michael Paquier
Дата:
On Sun, Jan 11, 2026 at 12:24:22PM +0800, zengman wrote:
> Thank you for your review. I've moved it to 0003, and the test
> passed without any issues.
> However, I'm wondering why no one else reported this. Perhaps this
> part isn't actually worth changing.

Could you demonstrate one or more examples when using these APIs
proving that in some cases the current code can be a problem while the
"fixed" code improves the situation, then extract test cases to be
able to cover our future tracks?  This would take the shape of one or
more regression tests to demonstrate individual problems.  If the
three code paths touched here prove to be problematic, we would need
three cases in total.  One other possibility would be to use a set of
asserts to make sure that nobody uses these APIs in ways we don't
expect them to.

Note that for the astream code, the lz4, gzip and zstd code paths rely
on their own assumptions regarding how the input and output buffers
are processed, some of them related to the way the libraries we rely
on behave.  So it goes beyond the point of having a consistent
implementation across all the libraries when the contents are
finalized, for sometimes behaviors that need to work across all the
versions of these libraries supported.

The TAP tests of pg_verifybackup stress the cases where bytes_written
is 0 for lz4 and gzip, there is nothing for the 0-case for zstd,
though.

Saying all that, the comment is indeed wrong.
--
Michael

Вложения

Re: [PATCH] Fix minor issues in astreamer_zstd.c

От
Michael Paquier
Дата:
On Sun, Jan 11, 2026 at 04:10:36PM +0900, Michael Paquier wrote:
> Could you demonstrate one or more examples when using these APIs
> proving that in some cases the current code can be a problem while the
> "fixed" code improves the situation, then extract test cases to be
> able to cover our future tracks?  This would take the shape of one or
> more regression tests to demonstrate individual problems.  If the
> three code paths touched here prove to be problematic, we would need
> three cases in total.  One other possibility would be to use a set of
> asserts to make sure that nobody uses these APIs in ways we don't
> expect them to.

See for example 3369a3b49b0b as one reference, that has fixed a bug in
the same area of the code.
--
Michael

Вложения

Re: [PATCH] Fix minor issues in astreamer_zstd.c

От
"zengman"
Дата:
> > Could you demonstrate one or more examples when using these APIs
> > proving that in some cases the current code can be a problem while the
> > "fixed" code improves the situation, then extract test cases to be
> > able to cover our future tracks?  This would take the shape of one or
> > more regression tests to demonstrate individual problems.  If the
> > three code paths touched here prove to be problematic, we would need
> > three cases in total.  One other possibility would be to use a set of
> > asserts to make sure that nobody uses these APIs in ways we don't
> > expect them to.
> 
> See for example 3369a3b49b0b as one reference, that has fixed a bug in
> the same area of the code.

Hi Mr. Michael

Thank you for pointing this out. I'd like to admit my mistake first. I was actually just looking at the code in
`astreamer_zstd.c`and noticed that `astreamer_zstd_decompressor_finalize` checked `zstd_outBuf.pos` but didn't use it,
whichpuzzled me.
 
```
    if (mystreamer->zstd_outBuf.pos > 0)
        astreamer_content(mystreamer->base.bbs_next, NULL,
                          mystreamer->base.bbs_buffer.data,
                          mystreamer->base.bbs_buffer.maxlen,
                          ASTREAMER_UNKNOWN);
```
Then I referred to the rest of the code in that file and made the corresponding adjustments. Before that, when I tested
itmanually, there was probably an error in the operation, and the size difference was about 3kb, so I thought that was
theproblem.
 
However, after several more tests, I found no difference.

Before code modification:
```
postgres@zxm-VMware-Virtual-Platform:~/code/postgres$ pg_basebackup --pgdata ~/zstd_backup --format tar --compress
zstd:1--no-sync --manifest-checksums sha256
 
postgres@zxm-VMware-Virtual-Platform:~/code/postgres$ pg_verifybackup --exit-on-error ~/zstd_backup  -n
backup successfully verified
postgres@zxm-VMware-Virtual-Platform:~/code/postgres$ ls -al  ~/zstd_backup 
total 19500
drwx------  2 postgres postgres     4096 Jan 11 17:11 .
drwx------ 28 postgres root         4096 Jan 11 17:11 ..
-rw-------  1 postgres postgres   177649 Jan 11 17:11 backup_manifest
-rw-------  1 postgres postgres  2996548 Jan 11 17:11 base.tar.zst
-rw-------  1 postgres postgres 16778752 Jan 11 17:11 pg_wal.tar
```

After modifying the code:
```
postgres@zxm-VMware-Virtual-Platform:~/code/postgres$ pg_basebackup --pgdata ~/zstd_backup2 --format tar --compress
zstd:1--no-sync --manifest-checksums sha256
 
postgres@zxm-VMware-Virtual-Platform:~/code/postgres$ pg_verifybackup --exit-on-error ~/zstd_backup2  -n
backup successfully verified
postgres@zxm-VMware-Virtual-Platform:~/code/postgres$ ls ~/zstd_backup2 -al
total 19500
drwx------  2 postgres postgres     4096 Jan 11 17:13 .
drwx------ 29 postgres root         4096 Jan 11 17:13 ..
-rw-------  1 postgres postgres   177649 Jan 11 17:13 backup_manifest
-rw-------  1 postgres postgres  2996546 Jan 11 17:13 base.tar.zst
-rw-------  1 postgres postgres 16778752 Jan 11 17:13 pg_wal.tar
```
So from a code perspective, this might be worth modifying, but not modifying it doesn't seem to cause any real harm.
Sorryto bother you all.