Обсуждение: pg_basebackup: removed an unnecessary use of memset in FindStreamingStart

Поиск
Список
Период
Сортировка

pg_basebackup: removed an unnecessary use of memset in FindStreamingStart

От
"yangyz"
Дата:
Hi Hackers,

When I read the FindStreamingStart function in pg_receivewal.c, I discovered an unnecessary use of memset.So I removed it, optimizing the performance without affecting its functionality.

The following is the detailed analysis of the reasons:
1.LZ4F_decompress will fully overwrite the output buffer:
When out_size is passed as an input parameter, it denotes the size of the output buffer (outbuf). The decompression operation writes the decompressed data to outbuf. Upon function return, out_size is updated to reflect the actual number of bytes written. Notably, even in cases of partial decompression, data is written starting from the initial position of outbuf.
2.Performance Overhead
In each iteration, the entire buffer of size LZ4_CHUNK_SZ (potentially several megabytes) is zero-initialized. Since these memory blocks are immediately overwritten by decompressed data, this zeroing operation constitutes an unnecessary consumption of CPU resources.

Regards,

Yang Yuanzhuo




Вложения

Re: pg_basebackup: removed an unnecessary use of memset in FindStreamingStart

От
Chao Li
Дата:

> On Feb 25, 2026, at 14:31, yangyz <1197620467@qq.com> wrote:
>
> Hi Hackers,
>
> When I read the FindStreamingStart function in pg_receivewal.c, I discovered an unnecessary use of memset.So I
removedit, optimizing the performance without affecting its functionality. 
>
> The following is the detailed analysis of the reasons:
> 1.LZ4F_decompress will fully overwrite the output buffer:
> When out_size is passed as an input parameter, it denotes the size of the output buffer (outbuf). The decompression
operationwrites the decompressed data to outbuf. Upon function return, out_size is updated to reflect the actual number
ofbytes written. Notably, even in cases of partial decompression, data is written starting from the initial position of
outbuf.
> 2.Performance Overhead
> In each iteration, the entire buffer of size LZ4_CHUNK_SZ (potentially several megabytes) is zero-initialized. Since
thesememory blocks are immediately overwritten by decompressed data, this zeroing operation constitutes an unnecessary
consumptionof CPU resources. 
>
> Regards,
> Yang Yuanzhuo
>
>
>
> <v1-0001-Removed-an-unnecessary-use-of-memset-in-FindStrea.patch>

Looking at the code snippet:
```
    while (readp < readend)
    {
        size_t        out_size = LZ4_CHUNK_SZ;
        size_t        read_size = readend - readp;

        memset(outbuf, 0, LZ4_CHUNK_SZ);
        status = LZ4F_decompress(ctx, outbuf, &out_size,
                                 readp, &read_size, &dec_opt);
        if (LZ4F_isError(status))
            pg_fatal("could not decompress file \"%s\": %s",
                     fullpath,
                     LZ4F_getErrorName(status));

        readp += read_size;
        uncompressed_size += out_size;
    }
```
It’s trying to locate the start position, and the decoded bytes are not consumed (they’re effectively discarded). Given
thatLZ4F_decompress() reports the produced size via out_size, zeroing the whole output buffer beforehand doesn’t seem
necessaryhere. Since this happens inside the loop, the extra memset() just amplifies the overhead. 

Also, ReadDataFromArchiveLZ4() has a very similar loop that doesn’t zero the output buffer at all:
```
    while (readp < readend)
    {
        size_t        out_size = DEFAULT_IO_BUFFER_SIZE;
        size_t        read_size = readend - readp;

        status = LZ4F_decompress(ctx, outbuf, &out_size,
                                 readp, &read_size, &dec_opt);
        if (LZ4F_isError(status))
            pg_fatal("could not decompress: %s",
                     LZ4F_getErrorName(status));

        ahwrite(outbuf, 1, out_size, AH);
        readp += read_size;
    }
```

So +1 for removing the memset.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







Re: pg_basebackup: removed an unnecessary use of memset in FindStreamingStart

От
Daniel Gustafsson
Дата:
> On 25 Feb 2026, at 07:31, yangyz <1197620467@qq.com> wrote:

> 2.Performance Overhead
> In each iteration, the entire buffer of size LZ4_CHUNK_SZ (potentially several megabytes) is zero-initialized. Since
thesememory blocks are immediately overwritten by decompressed data, this zeroing operation constitutes an unnecessary
consumptionof CPU resources. 

When proposing a performance improvement it's important to provide some level
of benchmarks to show the improvement. Is removing this memset noticeable?

--
Daniel Gustafsson




Re: pg_basebackup: removed an unnecessary use of memset in FindStreamingStart

От
Chao Li
Дата:

> On Feb 25, 2026, at 18:21, Daniel Gustafsson <daniel@yesql.se> wrote:
>
>> On 25 Feb 2026, at 07:31, yangyz <1197620467@qq.com> wrote:
>
>> 2.Performance Overhead
>> In each iteration, the entire buffer of size LZ4_CHUNK_SZ (potentially several megabytes) is zero-initialized. Since
thesememory blocks are immediately overwritten by decompressed data, this zeroing operation constitutes an unnecessary
consumptionof CPU resources. 
>
> When proposing a performance improvement it's important to provide some level
> of benchmarks to show the improvement. Is removing this memset noticeable?
>
> --
> Daniel Gustafsson
>

I don’t think this patch is about performance. Although removing the memset might save a few CPU cycles, the real
benefitseems to be cleanup and consistency. The memset appears unnecessary, and similar functions don’t use it, so I
thinkthis change mainly improves maintainability. 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







Re: pg_basebackup: removed an unnecessary use of memset in FindStreamingStart

От
Daniel Gustafsson
Дата:
> On 25 Feb 2026, at 13:41, Chao Li <li.evan.chao@gmail.com> wrote:
>
>> On Feb 25, 2026, at 18:21, Daniel Gustafsson <daniel@yesql.se> wrote:
>>
>>> On 25 Feb 2026, at 07:31, yangyz <1197620467@qq.com> wrote:
>>
>>> 2.Performance Overhead
>>> In each iteration, the entire buffer of size LZ4_CHUNK_SZ (potentially several megabytes) is zero-initialized.
Sincethese memory blocks are immediately overwritten by decompressed data, this zeroing operation constitutes an
unnecessaryconsumption of CPU resources. 
>>
>> When proposing a performance improvement it's important to provide some level
>> of benchmarks to show the improvement. Is removing this memset noticeable?
>
> I don’t think this patch is about performance. Although removing the memset might save a few CPU cycles, the real
benefitseems to be cleanup and consistency. The memset appears unnecessary, and similar functions don’t use it, so I
thinkthis change mainly improves maintainability. 

I would argue the opposite, clearing a buffer before passing it to an external
library function writing to it seems the right thing to do unless it can be
proven to regress performance too much.  Also, "appears unnecessary" doesn't
instill enough confidence to perform a change IMO.

--
Daniel Gustafsson




Re: pg_basebackup: removed an unnecessary use of memset in FindStreamingStart

От
Chao Li
Дата:

> On Feb 25, 2026, at 21:10, Daniel Gustafsson <daniel@yesql.se> wrote:
>
>> On 25 Feb 2026, at 13:41, Chao Li <li.evan.chao@gmail.com> wrote:
>>
>>> On Feb 25, 2026, at 18:21, Daniel Gustafsson <daniel@yesql.se> wrote:
>>>
>>>> On 25 Feb 2026, at 07:31, yangyz <1197620467@qq.com> wrote:
>>>
>>>> 2.Performance Overhead
>>>> In each iteration, the entire buffer of size LZ4_CHUNK_SZ (potentially several megabytes) is zero-initialized.
Sincethese memory blocks are immediately overwritten by decompressed data, this zeroing operation constitutes an
unnecessaryconsumption of CPU resources. 
>>>
>>> When proposing a performance improvement it's important to provide some level
>>> of benchmarks to show the improvement. Is removing this memset noticeable?
>>
>> I don’t think this patch is about performance. Although removing the memset might save a few CPU cycles, the real
benefitseems to be cleanup and consistency. The memset appears unnecessary, and similar functions don’t use it, so I
thinkthis change mainly improves maintainability. 
>
> I would argue the opposite, clearing a buffer before passing it to an external
> library function writing to it seems the right thing to do unless it can be
> proven to regress performance too much.  Also, "appears unnecessary" doesn't
> instill enough confidence to perform a change IMO.
>
> --
> Daniel Gustafsson


As I pointed out earlier, ReadDataFromArchiveLZ4() has a very similar loop that doesn’t zero out  the output buffer:
```
while (readp < readend)
{
    size_t out_size = DEFAULT_IO_BUFFER_SIZE;
    size_t read_size = readend - readp;

    status = LZ4F_decompress(ctx, outbuf, &out_size,
                                                 readp, &read_size, &dec_opt);
    if (LZ4F_isError(status))
        pg_fatal("could not decompress: %s",
              LZ4F_getErrorName(status));

    ahwrite(outbuf, 1, out_size, AH);
    readp += read_size;
}
```

Do you think we should add a memset there? There are a couple of more callers of LZ4F_decompress that don’t zero out
theoutput buffer. 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/








> On Feb 25, 2026, at 21:54, Chao Li <li.evan.chao@gmail.com> wrote:
>
>
>
>> On Feb 25, 2026, at 21:10, Daniel Gustafsson <daniel@yesql.se> wrote:
>>
>>> On 25 Feb 2026, at 13:41, Chao Li <li.evan.chao@gmail.com> wrote:
>>>
>>>> On Feb 25, 2026, at 18:21, Daniel Gustafsson <daniel@yesql.se> wrote:
>>>>
>>>>> On 25 Feb 2026, at 07:31, yangyz <1197620467@qq.com> wrote:
>>>>
>>>>> 2.Performance Overhead
>>>>> In each iteration, the entire buffer of size LZ4_CHUNK_SZ (potentially several megabytes) is zero-initialized.
Sincethese memory blocks are immediately overwritten by decompressed data, this zeroing operation constitutes an
unnecessaryconsumption of CPU resources. 
>>>>
>>>> When proposing a performance improvement it's important to provide some level
>>>> of benchmarks to show the improvement. Is removing this memset noticeable?
>>>
>>> I don’t think this patch is about performance. Although removing the memset might save a few CPU cycles, the real
benefitseems to be cleanup and consistency. The memset appears unnecessary, and similar functions don’t use it, so I
thinkthis change mainly improves maintainability. 
>>
>> I would argue the opposite, clearing a buffer before passing it to an external
>> library function writing to it seems the right thing to do unless it can be
>> proven to regress performance too much.  Also, "appears unnecessary" doesn't
>> instill enough confidence to perform a change IMO.
>>
>> --
>> Daniel Gustafsson
>
>
> As I pointed out earlier, ReadDataFromArchiveLZ4() has a very similar loop that doesn’t zero out  the output buffer:
> ```
> while (readp < readend)
> {
>    size_t out_size = DEFAULT_IO_BUFFER_SIZE;
>    size_t read_size = readend - readp;
>
>    status = LZ4F_decompress(ctx, outbuf, &out_size,
>                                                 readp, &read_size, &dec_opt);
>    if (LZ4F_isError(status))
>        pg_fatal("could not decompress: %s",
>              LZ4F_getErrorName(status));
>
>    ahwrite(outbuf, 1, out_size, AH);
>    readp += read_size;
> }
> ```
>
> Do you think we should add a memset there? There are a couple of more callers of LZ4F_decompress that don’t zero out
theoutput buffer. 
>
> Best regards,
> --
> Chao Li (Evan)
> HighGo Software Co., Ltd.
> https://www.highgo.com/
>

Adding the original author to see if he still remember what was the intention of the memset.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/