Re: libpq compression

Поиск
Список
Период
Сортировка
От Konstantin Knizhnik
Тема Re: libpq compression
Дата
Msg-id 93027ee4-fefa-a8ca-2cc7-245c912bf824@postgrespro.ru
обсуждение исходный текст
Ответ на Re: libpq compression  (Denis Smirnov <sd@arenadata.io>)
Ответы Re: libpq compression  (Daniil Zakhlystov <usernamedt@yandex-team.ru>)
Re: libpq compression  (Justin Pryzby <pryzby@telsasoft.com>)
Список pgsql-hackers


On 17.12.2020 16:39, Denis Smirnov wrote:
Hello all,

I’ve finally read the whole thread (it was huge). It is extremely sad that this patch hang without progress for such a long time. It seems that the main problem in discussion is that everyone has its own view what problems should be solve with this patch. Here are some of positions (not all of them):

1. Add a compression for networks with a bad bandwidth (and make a patch as simple and maintainable as possible) - author’s position.
2. Don’t change current network protocol and related code much.
3. Refactor compression API (and network compression as well)
4. Solve cloud provider’s problems: on demand buy network bandwidth with CPU utilisation and vice versa.

All of these requirements have a different nature and sometimes conflict with each other. Without clearly formed requirements this patch would never be released.

Anyway, I have rebased it to the current master branch, applied pgindent, tested on MacOS and fixed a MacOS specific problem with strcpy in build_compressors_list(): it has an undefined behaviour when source and destination strings overlap.
-	*client_compressors = src = dst = strdup(value);
+	*client_compressors = src = strdup(value);
+	dst = strdup(value);

According to my very simple tests with randomly generated data, zstd gives about 3x compression (zlib has a little worse compression ratio and a little bigger CPU utilisation). It seems to be a normal ratio for any streaming data - Greenplum also uses zstd/zlib to compress append optimised tables and compression ratio is usually about 3-5x. Also according to my Greenplum experience, the most commonly used zstd ratio is 1, while for zlib it is usually in a range of 1-5. CPU and execution time were not affected much according to uncompressed data (but my tests were very simple and they should not be treated as reliable).



Best regards,
Denis Smirnov | Developer
sd@arenadata.io 
Arenadata | Godovikova 9-17, Moscow 129085 Russia


Thank you very much for reporting the problem.
Sorry, but you fix is entirely correct: it is necessary to assign dst to *client_compressors, not  *src. Also extra strdup requires correspondent memory deallocation.
I prepared new version of the patch with this fix + some other code refactoring requested by reviewers (like sending ack message in more standard way).

I am maintaining this code in git@github.com:postgrespro/libpq_compression.git repository.
I will be pleased if anybody, who wants to suggest any bug fixes/improvements of libpq compression, create pull requests: it will be much easier for me to merge them.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Seino Yuki
Дата:
Сообщение: Re: Feature improvement for pg_stat_statements
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: cutting down the TODO list thread