Re: Add LZ4 compression in pg_dump

Поиск
Список
Период
Сортировка
От gkokolatos@pm.me
Тема Re: Add LZ4 compression in pg_dump
Дата
Msg-id P-Dl7-QRVPmjv35H2n-Wnj9amDG_MUEb0-jtlzs05n5z03lq0IZNTsI5h4DjTmLekfmKexgFBQhmVXtUO402vAIFx0e002lIzsb8O8Hi1wA=@pm.me
обсуждение исходный текст
Ответ на Re: Add LZ4 compression in pg_dump  (Andrew Dunstan <andrew@dunslane.net>)
Ответы Re: Add LZ4 compression in pg_dump  (Michael Paquier <michael@paquier.xyz>)
Re: Add LZ4 compression in pg_dump  (Michael Paquier <michael@paquier.xyz>)
Список pgsql-hackers


------- Original Message -------
On Friday, May 5th, 2023 at 3:23 PM, Andrew Dunstan <andrew@dunslane.net> wrote:


On 2023-05-05 Fr 06:02, gkokolatos@pm.me wrote:
------- Original Message -------
On Friday, May 5th, 2023 at 8:00 AM, Alexander Lakhin <exclusion@gmail.com> wrote:


23.03.2023 20:10, Tomas Vondra wrote:

So pushed all three parts, after updating the commit messages a bit.

This leaves the empty-data issue (which we have a fix for) and the
switch to LZ4F. And then the zstd part.
I'm sorry that I haven't noticed/checked that before, but when trying to
perform check-world with Valgrind I've discovered another issue presumably
related to LZ4File_gets().
When running under Valgrind:
PROVE_TESTS=t/002_pg_dump.pl make check -C src/bin/pg_dump/
I get:
...
07:07:11.683 ok 1939 - compression_lz4_dir: glob check for
.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir/*.dat.lz4
# Running: pg_restore --jobs=2 --file=.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir.sql
.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir

==00:00:00:00.579 2811926== Conditional jump or move depends on uninitialised value(s)
==00:00:00:00.579 2811926== at 0x4853376: rawmemchr (vg_replace_strmem.c:1548)
==00:00:00:00.579 2811926== by 0x4C96A67: _IO_str_init_static_internal (strops.c:41)
==00:00:00:00.579 2811926== by 0x4C693A2: _IO_strfile_read (strfile.h:95)
==00:00:00:00.579 2811926== by 0x4C693A2: __isoc99_sscanf (isoc99_sscanf.c:28)
==00:00:00:00.579 2811926== by 0x11DB6F: _LoadLOs (pg_backup_directory.c:458)
==00:00:00:00.579 2811926== by 0x11DD1E: _PrintTocData (pg_backup_directory.c:422)
==00:00:00:00.579 2811926== by 0x118484: restore_toc_entry (pg_backup_archiver.c:882)
==00:00:00:00.579 2811926== by 0x1190CC: RestoreArchive (pg_backup_archiver.c:699)
==00:00:00:00.579 2811926== by 0x10F25D: main (pg_restore.c:414)
==00:00:00:00.579 2811926==
...

It looks like the line variable returned by gets_func() here is not
null-terminated:
while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)

{
...
if (sscanf(line, "%u %" CppAsString2(MAXPGPATH) "s\n", &oid, lofname) != 2)
...
And Valgrind doesn't like it.

Valgrind is correct to not like it. LZ4Stream_gets() got modeled after
gets() when it should have been modeled after fgets().

Please find a patch attached to address it.



Isn't using memset here a bit wasteful? Why not just put a null at the end after calling LZ4Stream_read_internal(), which tells you how many bytes it has written?

Good point. I thought about it before submitting the patch. I concluded that given the complexity and operations involved in LZ4Stream_read_internal() and the rest of the pg_dump/pg_restore code, the memset() call will be negligible. However from the readability point of view, the function is a bit cleaner with the memset().

I will not object to any suggestion though, as this is a very trivial point. Please find attached a v2 of the patch following the suggested approach.

Cheers,

//Georgios



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Padmavathi G
Дата:
Сообщение: Re: Tables getting stuck at 's' state during logical replication
Следующее
От: "Drouvot, Bertrand"
Дата:
Сообщение: Re: Add two missing tests in 035_standby_logical_decoding.pl