Re: Decompression bug in astreamer_lz4
From: Michael Paquier
Subject: Re: Decompression bug in astreamer_lz4
Msg-id: aFyGYWedCAs_gr21@paquier.xyz
List: pgsql-hackers
On Wed, Jun 25, 2025 at 11:07:32PM +0300, Mikhail Gribkov wrote:
> The same situation can be reproduced using randomly filled files. Let's
> imagine the extension saves its own statistics in pg_stat:
> dd if=/dev/urandom of=$PGDATA/pg_stat/junk bs=1M count=16

Original, I like that. It's not hard to imagine some extension code that
decides to dump some file at the root of the data directory using one of
the shutdown callbacks. Not seeing the problem would then be a matter of
luck. So yes, that's a legit bug IMO.

> In the vast majority of cases the mystreamer->bytes_written value is zero
> at the moment of next_out/avail_out initialization, thus the problem does
> not appear. In fact we were totally unable to reproduce a non-zero initial
> value of mystreamer->bytes_written without poorly-compressible custom
> format files. I'm not deep enough in the compression algorithms to say
> exactly why that is, and I realize that very few users will really face
> the problem, but the problem is definitely there.

That's not surprising, noted.

> Attached is a simple patch implementing the above fix. The issue
> disappears with it.

It seems like you have spent some time digging into all that, thanks.

> What do you think?

I think that it would be nice to have some test coverage for such cases in
pg_verifybackup, for both the client side and the server side of compressed
backups, relying on the backend to generate some random data dumped into a
file at the root of a data folder. You could plug that into one of the
scripts before taking a backup, and check all the compression methods we
need to support. See for example brin.sql, which combines string_agg() and
fipshash(), as one reference; we could reuse something like that (rough
sketches below, after the reproduction).

I am seeing a threshold close to 512kB to reproduce the problem, so we
don't need a file as large as the one above to see the failure. That makes
such tests faster, with less data required, as long as the file is written
once into a data folder used by all the backups taken:

$ rm $PGDATA/junk && dd if=/dev/urandom of=$PGDATA/junk bs=1k count=512
$ rm -r /tmp/client-backup-lz4/*
$ pg_basebackup --format=tar --no-sync -cfast \
    -Xfetch --compress client-lz4 -D /tmp/client-backup-lz4
$ pg_verifybackup -n /tmp/client-backup-lz4/
pg_verifybackup: error: checksum mismatch for file "junk" in archive "base.tar.lz4"
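For the data generation, something like this could work, with md5() as a
stand-in for the fipshash() helper used by the regression tests (the row
count is sized to land a bit past the ~512kB threshold; whether hex output
is poorly-compressible enough for each method would need checking):

# ~531kB of hex noise generated by the backend, dumped at the root of
# the data folder (17000 rows x 32 hex characters).
$ psql -At \
    -c "SELECT string_agg(md5(i::text), '') FROM generate_series(1, 17000) i" \
    > $PGDATA/junk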
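To cover the whole matrix, the shape could be roughly the loop below,
assuming LZ4 and Zstandard support are compiled in; an actual TAP test
would express the same steps in Perl:

# One tar-format backup per (side, method) pair, all taken from the same
# data folder containing the junk file, each verified right away.
for side in client server; do
    for method in gzip lz4 zstd; do
        dir=/tmp/$side-backup-$method
        rm -rf $dir
        pg_basebackup --format=tar --no-sync -cfast \
            -Xfetch --compress $side-$method -D $dir
        pg_verifybackup -n $dir
    done
done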
--
Michael