Discussion: libpq compression (part 2)

libpq compression (part 2)

From:
Daniil Zakhlystov
Date:
Hi!

I want to introduce an updated version of the libpq protocol compression patch, originally
introduced by Konstantin Knizhnik in this thread:
https://www.postgresql.org/message-id/aad16e41-b3f9-e89d-fa57-fb4c694bec25@postgrespro.ru

The original thread has become huge, which makes it hard for new people to catch up, so I think it is
better to open a new thread with a summary of the current state of the patch.

Compression of libpq traffic is useful in:
1. COPY
2. Replication
3. Queries returning large result sets (for example, JSON) over slow connections.

The patch introduces three new protocol messages: CompressionAck, CompressedData, and
SetCompressionMethod.

Here is a brief overview of the compression initialization process:

1. Compression can be requested by the client by including the "compression" option in its connection
string. This can be either a boolean value to enable or disable compression, or an explicit
comma-separated list of compression algorithms, each optionally with a compression level. The client
indicates the compression request by sending the _pq_.compression startup packet parameter with the
list of compression algorithms and an optional specification of the compression level. If the server
does not support compression, the backend ignores the _pq_.compression parameter and does not send
the CompressionAck message to the frontend.
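
For example, a client might request it like this (a sketch of the connection option described above):

    # boolean form: enable compression with the default algorithm list
    psql "dbname=postgres compression=on"
    # explicit form: request specific algorithms, in order of preference
    psql "dbname=postgres compression=zlib,zstd"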

2. The server receives the client's compression request and intersects the requested compression
algorithms with the allowed ones (controlled via the libpq_compression server config setting). If
the intersection is not empty, the server responds with CompressionAck containing the final list of
the compression algorithms that can be used for compressing libpq messages between the client
and server. If the intersection is empty (the server does not accept any of the requested algorithms),
it replies with CompressionAck containing an empty list, and it is up to the client whether to
continue without compression or to report an error.
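
For example (a sketch of the negotiation; exact syntax is in the patch docs):

    # postgresql.conf on the server
    libpq_compression = 'zlib,zstd'

    # client connects with compression=lz4,zstd -> intersection is "zstd",
    #   so CompressionAck carries "zstd"
    # client connects with compression=lz4      -> intersection is empty,
    #   so CompressionAck carries an empty list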

3. After sending the CompressionAck message, the server can send the SetCompressionMethod message to
set the current compression algorithm for server-to-client traffic compression. Same for the client,
after receiving the CompressionAck message, the client can send the SetCompressionMethod message to set the current
compression algorithm for client-to-server traffic compression. Client-to-server and
server-to-client compression are independent of each other.
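
Schematically, a successful negotiation can look like this (not the exact wire format):

    Client -> Server: StartupMessage with _pq_.compression = "zlib,zstd"
    Server -> Client: CompressionAck ("zlib,zstd" - the negotiated list)
    Server -> Client: SetCompressionMethod (algorithm for server-to-client traffic)
    Client -> Server: SetCompressionMethod (algorithm for client-to-server traffic)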

To compress messages, streaming compression is used. Compressed bytes are wrapped into
CompressedData protocol messages. One CompressedData message may contain multiple regular protocol
messages. CompressedData messages can be mixed with regular uncompressed messages.

The compression context is retained between multiple CompressedData messages. Currently, only
CopyData, DataRow, and Query messages longer than 60 bytes are compressed.
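
In effect, the filter amounts to something like this (a simplified sketch; the patch routes this
decision through its own zpq_* functions):

    /* Sketch: compress only bulk-carrying message types above a threshold */
    static bool
    should_compress(char msg_type, uint32 msg_len)
    {
        if (msg_len <= 60)          /* threshold mentioned above */
            return false;
        return msg_type == 'd'      /* CopyData */
            || msg_type == 'D'      /* DataRow */
            || msg_type == 'Q';     /* Query */
    }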

If the client (or server) wants to switch the current compression method, it sends the
SetCompressionMethod message so that the receiving side can switch its decompressor accordingly.

I've separated the patch into two parts: the first contains the main patch with ZLIB and LZ4
compression algorithm support, the second adds ZSTD support.

Thanks,

Daniil Zakhlystov


Attachments

Re: libpq compression (part 2)

From:
Justin Pryzby
Date:
Thanks for working on this.  The patch looks to be in good shape - I hope more
people will help to review and test it.  I took the liberty of creating a new
CF entry. http://cfbot.cputube.org/daniil-zakhlystov.html

+zpq_should_compress(ZpqStream * zpq, char msg_type, uint32 msg_len)
+{
+   return zpq_choose_compressor(zpq, msg_type, msg_len) == -1;

I think this is backwards, and should say != -1 ?

As written, the server GUC libpq_compression defaults to "on", and the client
doesn't request compression.  I think the server GUC should default to off.
I failed to convince Konstantin about this last year.  The reason is that 1)
it's a new feature; 2) with security implications.  An admin should need to
"opt in" to this.  I still wonder if this should be controlled by a new "TYPE"
in pg_hba (rather than a GUC); that would make it exclusive of SSL.

However, I also think that in the development patches both client and server
should enable compression by default, to allow it to be exercised by cfbot.
For the other compression patches I worked on, I had an 0001 commit where the
feature default was off, plus following patches which enabled the feature by
default - only for cfbot and not intended for commit.

+/* GUC variable containing the allowed compression algorithms list (separated by comma) */
+char      *libpq_compress_algorithms;

=> The global variable is conventionally initialized to the same default as
guc.c (even though the GUC default value will be applied at startup).

+                * - NegotiateProtocolVersion in cases when server does not support protocol compression
+                * Anything else probably means it's not Postgres on the other end at all.
                 */
-               if (!(beresp == 'R' || beresp == 'E'))
+               if (!(beresp == 'R' || beresp == 'E' || beresp == 'z' || beresp == 'v'))

=> I think NegotiateProtocolVersion and 'v' are from an old version of the
patch and no longer used ?

There's a handful of compiler warnings:
| z_stream.c:311:5: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]

+       ereport(LOG, (errmsg("failed to parse configured compression setting: %s", libpq_compress_algorithms)));
+       return 0;

=> Since March, errmsg doesn't need extra parentheses around it (e3a87b4).
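
i.e. it can now be spelled:

    ereport(LOG, errmsg("failed to parse configured compression setting: %s", libpq_compress_algorithms));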

+ * Report current transaction start timestamp as the specified value.
+ * Zero means there is no active transaction.

=> The comment doesn't correspond to the function (copy+paste).

+       return id >= 0 && id < (sizeof(zs_algorithms) / sizeof(*zs_algorithms));
+       size_t          n_algorithms = sizeof(zs_algorithms) / sizeof(*zs_algorithms);

=> You can use lengthof() for these.
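
i.e.:

    return id >= 0 && id < lengthof(zs_algorithms);
    size_t          n_algorithms = lengthof(zs_algorithms);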

Maybe pg_stat_network_traffic should show the compression?
It'd have to show the current client-server and server-client values.
Or maybe that's not useful since it can change dynamically (unless it were
reset when the compression method was changed).

I think the psql conninfo Compressor/Decompressor line may be confusing.
Maybe it should talk about Client->Server and Server->Client.
Maybe it should be displayed on a single line.  

Actually, right now the \conninfo part is hardly useful, since the compressor
is allocated lazily/"on demand", so it shows the compression of some previous
command, but maybe not the last command, and maybe not the next command...

I'm not sure it's needed to allow the compression to be changed dynamically,
and the generality to support a different compression method for each libpq
message type seems excessive.  Maybe it's enough to check that the message type
is one of VALID_LONG_MESSAGE_TYPE and its length is long enough.

I wonder whether the asymmetric compression idea is useful.  The only
application I can see for this is that a server might allow to *decompress*
data from a client, but may not want to *compress* data to a client.

What happens if the compression doesn't decrease the message size ?  I don't
see anything that allows sending the original, raw data.  The advantage would
be that the remote side doesn't incur the overhead of decompression.

I hit this assertion with PGCOMPRESSION=lz4 (but not zlib), and haven't yet
found the cause.

TRAP: FailedAssertion("decBytes > 0", File: "z_stream.c", Line: 315, PID: 21180)
postgres: pryzbyj postgres 127.0.0.1(46154) idle(ExceptionalCondition+0x99)[0x5642137806d9]
postgres: pryzbyj postgres 127.0.0.1(46154) idle(+0x632bfe)[0x5642137d4bfe]
postgres: pryzbyj postgres 127.0.0.1(46154) idle(zs_read+0x2b)[0x5642137d4ebb]
postgres: pryzbyj postgres 127.0.0.1(46154) idle(zpq_read+0x192)[0x5642137d1172]
postgres: pryzbyj postgres 127.0.0.1(46154) idle(+0x378b8a)[0x56421351ab8a]
postgres: pryzbyj postgres 127.0.0.1(46154) idle(pq_getbyte+0x1f)[0x56421351c12f]
postgres: pryzbyj postgres 127.0.0.1(46154) idle(PostgresMain+0xf96)[0x56421365fcc6]
postgres: pryzbyj postgres 127.0.0.1(46154) idle(+0x428b69)[0x5642135cab69]
postgres: pryzbyj postgres 127.0.0.1(46154) idle(PostmasterMain+0xd1c)[0x5642135cbbac]
postgres: pryzbyj postgres 127.0.0.1(46154) idle(main+0x220)[0x5642132f9a40]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7fb70ef45b97]

Maybe you should reset the streams between each compression message (even if
it's using the same compression algorithm).  This might allow better
compression.  You could either unconditionally call zs_compressor_free()/
zs_create_compressor().  Or, add a new, optional interface for resetting the
stream (and if that isn't set for a given compression method, then free+create
the stream).  For LZ4, that'd be LZ4_resetStream_fast(), but only for
v1.9.0+...
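
A sketch of what that optional interface could look like (the struct members here are hypothetical,
not from the patch; LZ4_resetStream_fast() is real in lz4 >= 1.9.0, and
zs_compressor_free()/zs_create_compressor() are the existing functions mentioned above):

    /* Sketch: reset a compressor between CompressedData messages. */
    static ZStream *
    zs_reset_compressor(ZStream *zs)
    {
        unsigned int impl_id = zs->impl_id;   /* hypothetical fields */
        int          level = zs->level;

        if (zs->algorithm->reset != NULL)
        {
            /* cheap in-place reset, e.g. LZ4_resetStream_fast() */
            zs->algorithm->reset(zs->stream);
            return zs;
        }
        /* no reset method for this codec: fall back to free + create */
        zs_compressor_free(zs);
        return zs_create_compressor(impl_id, level);
    }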

Some minor formatting issues:
 - There are spaces rather than tabs in a few files; you can maybe fix it by
   piping the file through unexpand -t4.
 - pointers in function declarations should have no space after the stars:
   +zs_buffered(ZStream * zs)
 - There's a few extraneous whitespace changes:
   ConfigureNamesEnum, pq_buffer_has_data, libpq-int.h.
 - Some lines are too long:
   git log -2 -U1 '*.[ch]' |grep -E '.{80}|^diff' |less
 - Some 1-line "if" statements would be easier to read without braces {}.
 - Some multi-line comments should have "*/" on a separate line:
   git log -3 -p |grep -E '^\+ \*.*\*\/'
   git log -3 -p '*.c' |grep -E '^\+[^/]*\*.*\*\/'

-- 
Justin

Attachments

Re: libpq compression (part 2)

From:
Andrey Borodin
Date:
> Maybe you should reset the streams between each compression message (even if
> it's using the same compression algorithm). This might allow better
> compression.

AFAIK it's the opposite: a longer data sequence usually compresses better. The codec can use
knowledge about prior data to better compress the current bytes.
 

Best regards, Andrey Borodin.



Re: libpq compression (part 2)

From:
Andrey Borodin
Date:

> On 8 Jan 2022, at 01:56, Stephen Frost <sfrost@snowman.net> wrote:
>>
>> It's discussed in last year's thread.  The thinking is that there tends to be
>> *fewer* exploitable opportunities between application->DB than between
>> browser->app.
>
> Yes, this was discussed previously and addressed.

What else do we need to decide architecturally to make protocol compression happen in 15? As far as
I can see, only the HBA/GUC part.

Best regards, Andrey Borodin.




Re: libpq compression (part 2)

From:
Justin Pryzby
Date:
zlib still causes check-world to get stuck.  I first mentioned this last March:
20210319062800.GI11765@telsasoft.com

Actually, all the compression methods seem to get stuck with
time make check -C src/bin/pg_rewind
time make check -C src/test/isolation

For CI purposes, there should be an 0003 patch which enables compression by
default, for all message types and maybe all lengths.

I removed the thread from the CFBOT until that's resolved.

-- 
Justin



Re: libpq compression (part 2)

From:
Daniil Zakhlystov
Date:
Hi, Justin!

First of all, thanks for the detailed review. I’ve applied your patches to the current version.


> On 1 Jan 2022, at 22:25, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> I wonder whether the asymmetric compression idea is useful.  The only
> application I can see for this is that a server might allow to *decompress*
> data from a client, but may not want to *compress* data to a client.

In the current patch state, there is no option to forbid the server-to-client or client-to-server
compression only.  The decision of whether to compress or not is based only on the message length.  In
the future, we can implement more precise control without the need for any protocol changes.

> On 1 Jan 2022, at 22:25, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> I'm not sure it's needed to allow the compression to be changed dynamically,
> and the generality to support a different compression method for each libpq
> message type seems excessive.  Maybe it's enough to check that the message type
> is one of VALID_LONG_MESSAGE_TYPE and its length is long enough.


Yes, as stated above, in the current patch version the protocol message type is ignored but could
potentially be taken into account.  All messages will be compressed using the first available
compressor.  For example, if postgresql.conf contains "libpq_compression = zlib,zstd" and the client
has "compression = zlib,zstd" in its connection string, zlib will be used to compress all messages
longer than the threshold.


> On 1 Jan 2022, at 22:25, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> What happens if the compression doesn't decrease the message size ?  I don't
> see anything that allows sending the original, raw data.  The advantage would
> be that the remote side doesn't incur the overhead of decompression.

Because of the streaming compression, we can’t just throw away some compressed data because future
compressed messages might back-reference it.  There are two ways to solve this:
1.  Do not use the streaming compression
2.  Somehow reset the compression context to the state it had before the message was compressed

I’ll look into it.  Also, if anyone has any thoughts on this, I’d appreciate hearing your opinions.


> On 1 Jan 2022, at 22:25, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> I hit this assertion with PGCOMPRESSION=lz4 (but not zlib), and haven't yet
> found the cause.

Thanks for reporting this.  Looks like the LZ4 implementation is not polished yet and needs some
additional effort.  I'll look into it too.

> On 12 Jan 2022, at 19:15, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> zlib still causes check-world to get stuck.  I first mentioned this last March:
> 20210319062800.GI11765@telsasoft.com
>
> Actually, all the compression methods seem to get stuck with
> time make check -C src/bin/pg_rewind
> time make check -C src/test/isolation
>
> For CI purposes, there should be an 0003 patch which enables compression by
> default, for all message types and maybe all lengths.
>
> I removed the thread from the CFBOT until that's resolved.

I’ve fixed the failing tests, now they should pass.

Thanks,

Daniil Zakhlystov




Attachments

Re: libpq compression (part 2)

From:
Justin Pryzby
Date:
On Fri, Jan 14, 2022 at 02:12:17AM +0500, Daniil Zakhlystov wrote:
> Hi, Justin!
> 
> First of all, thanks for the detailed review. I’ve applied your patches to the current version.

Note that my message had other comments that weren't addressed in this patch.

Your 0003 patch has a couple "noise" hunks that get rid of ^M characters added
in previous patches.  The ^M shouldn't be added in the first place.  Did you
apply my fixes using git-am or something else ?

On Fri, Jan 14, 2022 at 02:12:17AM +0500, Daniil Zakhlystov wrote:
> > On 12 Jan 2022, at 19:15, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > 
> > zlib still causes check-world to get stuck.  I first mentioned this last March:
> > 20210319062800.GI11765@telsasoft.com
> > 
...
> > I removed the thread from the CFBOT until that's resolved.
> 
> I’ve fixed the failing tests, now they should pass.

macos: passed
linux: timed out after 1hr
freebsd: failed in pg_rewind: ... <= replay_lsn AND state = 'streaming' FROM ...
windows: "Failed test 'data_checksums=on is reported on an offline cluster stdout /(?^:^on$)/'" / WARNING:  01000:
algorithmzlib is not supported
 

Note that it's possible and easy to kick off a CI run using any github account:
see ./src/tools/ci/README

For me, it's faster than running check-world -j4 locally, and runs tests on 4
OSes.

I re-ran your branch under my own account and linux didn't get stuck (and the
compiler warnings tests passed).  But on a third attempt, macos failed the
pg_rewind test, and bsd failed the subscription test:
| SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('s', 'r');

-- 
Justin



Re: libpq compression (part 2)

From:
Daniil Zakhlystov
Date:
Hi!


> On 1 Jan 2022, at 22:25, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> +/* GUC variable containing the allowed compression algorithms list (separated by comma) */
> +char      *libpq_compress_algorithms;
>
> => The global variable is conventionally initialized to the same default as
> guc.c (even though the GUC default value will be applied at startup).

I’ve updated libpq_compress_algorithms to match the corresponding guc.c setting value.


> On 1 Jan 2022, at 22:25, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> +                * - NegotiateProtocolVersion in cases when server does not support protocol compression
> +                * Anything else probably means it's not Postgres on the other end at all.
>                 */
> -               if (!(beresp == 'R' || beresp == 'E'))
> +               if (!(beresp == 'R' || beresp == 'E' || beresp == 'z' || beresp == 'v'))
>
> => I think NegotiateProtocolVersion and 'v' are from an old version of the
> patch and no longer used ?

No, it is still relevant in the current patch version.  I’ve added more comments regarding this case
and also made the compression negotiation behavior more transparent.

If the client requested compression, but the server does not support the libpq compression feature
because of an old Postgres version, the client will report an error and exit.  If the server rejected
the client's compression request (for example, if libpq_compression is set to ‘off’ or no matching
algorithms were found), the client will also report an error and exit.

> On 14 Jan 2022, at 04:58, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> Your 0003 patch has a couple "noise" hunks that get rid of ^M characters added
> in previous patches.  The ^M shouldn't be added in the first place.  Did you
> apply my fixes using git-am or something else ?

I’ve fixed the line endings and ran pgindent.  Now each patch should be OK.

> On 1 Jan 2022, at 22:25, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> => Since March, errmsg doesn't need extra parentheses around it (e3a87b4).


> On 1 Jan 2022, at 22:25, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> + * Report current transaction start timestamp as the specified value.
> + * Zero means there is no active transaction.
>
> => The comment doesn't correspond to the function


> On 1 Jan 2022, at 22:25, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> +       return id >= 0 && id < (sizeof(zs_algorithms) / sizeof(*zs_algorithms));
> +       size_t          n_algorithms = sizeof(zs_algorithms) / sizeof(*zs_algorithms);
>
> => You can use lengthof() for these


Thanks, I’ve updated the patch and corrected the highlighted issues.

> On 1 Jan 2022, at 22:25, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> I think the psql conninfo Compressor/Decompressor line may be confusing.
> Maybe it should talk about Client->Server and Server->Client.
> Maybe it should be displayed on a single line.
>
> Actually, right now the \conninfo part is hardly useful, since the compressor
> is allocated lazily/"on demand", so it shows the compression of some previous
> command, but maybe not the last command, and maybe not the next command…

I’ve updated the \conninfo part, now it shows the list of the negotiated compression algorithms.

Also, I’ve added the check for the compression option to the frontend side.

> On 14 Jan 2022, at 04:58, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> macos: passed
> linux: timed out after 1hr
> freebsd: failed in pg_rewind: ... <= replay_lsn AND state = 'streaming' FROM ...
> windows: "Failed test 'data_checksums=on is reported on an offline cluster stdout /(?^:^on$)/'" / WARNING:  01000:
> algorithm zlib is not supported
>
> Note that it's possible and easy to kick off a CI run using any github account:
> see ./src/tools/ci/README
>
> For me, it's faster than running check-world -j4 locally, and runs tests on 4
> OSes.


I’ve resolved the stuck tests and added zlib support for CI Windows builds to patch 0003-*.  Thanks
for the suggestion, all tests seem to be OK now, except the macOS one.  It won't schedule in Cirrus
CI for some reason, but I guess it happens because of my GitHub account limitation.

--
Thanks,

Daniil Zakhlystov




Attachments

Re: libpq compression (part 2)

From:
Justin Pryzby
Date:
On Tue, Jan 18, 2022 at 02:06:32AM +0500, Daniil Zakhlystov wrote:
> > => Since March, errmsg doesn't need extra parentheses around it (e3a87b4).

> I’ve resolved the stuck tests and added zlib support for CI Windows builds to patch 0003-*.  Thanks
> for the suggestion, all tests seem to be OK now, except the macOS one.  It won't schedule in Cirrus
> CI for some reason, but I guess it happens because of my GitHub account limitation.

I don't know about your github account, but it works for cfbot, which is now
green.

Thanks for implementing zlib for windows.  Did you try this with default
compressions set to lz4 and zstd ?

The thread from 2019 is very long, and starts off with the guidance that
compression had been implemented at the wrong layer.  It looks like this hasn't
changed since then.  secure_read/write are passed as function pointers to the
ZPQ interface, which then calls back to them to read and flush its compression
buffers.  As I understand, the suggestion was to leave the socket reads and
writes alone.  And then conditionally de/compress buffers after reading /
before writing from the socket if compression was negotiated.

It's currently done like this
pq_recvbuf() => secure_read() - when compression is disabled 
pq_recvbuf() => ZPQ => secure_read() - when compression is enabled 

Dmitri sent a partial, POC patch which changes the de/compression to happen in
secure_read/write, which is changed to call ZPQ:  
https://www.postgresql.org/message-id/CA+q6zcUPrssNaRS+FyoBsD-F2stK1Roa-4sAhFOfAjOWLziM4g@mail.gmail.com
pq_recvbuf() => secure_read() => ZPQ

The same thing is true of the frontend: function pointers to
pqsecure_read/write are being passed to zpq_create, and then the ZPQ interface
called instead of the original functions.  Those are the functions which read
using SSL, so they should also handle compression.

That's where SSL is handled, and it seems like the right place to handle
compression.  Have you evaluated that way to do things ?

Konstantin said he put ZPQ at that layer to 1) avoid code duplication between
client/server; and 2) allow compression to happen before SSL, to allow both
(if the admin decides it's okay).  But I don't see why compression can't
happen before sending to SSL, or after reading from it?

-- 
Justin



Re: libpq compression (part 2)

From:
Justin Pryzby
Date:
If there's no objection, I'd like to move this to the next CF for consideration
in PG16.

On Mon, Jan 17, 2022 at 10:39:19PM -0600, Justin Pryzby wrote:
> On Tue, Jan 18, 2022 at 02:06:32AM +0500, Daniil Zakhlystov wrote:
> > > => Since March, errmsg doesn't need extra parentheses around it (e3a87b4).
> 
> > I’ve resolved the stuck tests and added zlib support for CI Windows builds to patch 0003-*.  Thanks
> > for the suggestion, all tests seem to be OK now, except the macOS one.  It won't schedule in Cirrus
> > CI for some reason, but I guess it happens because of my GitHub account limitation.
> 
> I don't know about your github account, but it works for cfbot, which is now
> green.
> 
> Thanks for implementing zlib for windows.  Did you try this with default
> compressions set to lz4 and zstd ?
> 
> The thread from 2019 is very long, and starts off with the guidance that
> compression had been implemented at the wrong layer.  It looks like this hasn't
> changed since then.  secure_read/write are passed as function pointers to the
> ZPQ interface, which then calls back to them to read and flush its compression
> buffers.  As I understand, the suggestion was to leave the socket reads and
> writes alone.  And then conditionally de/compress buffers after reading /
> before writing from the socket if compression was negotiated.
> 
> It's currently done like this
> pq_recvbuf() => secure_read() - when compression is disabled 
> pq_recvbuf() => ZPQ => secure_read() - when compression is enabled 
> 
> Dmitri sent a partial, POC patch which changes the de/compression to happen in
> secure_read/write, which is changed to call ZPQ:  
> https://www.postgresql.org/message-id/CA+q6zcUPrssNaRS+FyoBsD-F2stK1Roa-4sAhFOfAjOWLziM4g@mail.gmail.com
> pq_recvbuf() => secure_read() => ZPQ
> 
> The same thing is true of the frontend: function pointers to
> pqsecure_read/write are being passed to zpq_create, and then the ZPQ interface
> called instead of the original functions.  Those are the functions which read
> using SSL, so they should also handle compression.
> 
> That's where SSL is handled, and it seems like the right place to handle
> compression.  Have you evaluated that way to do things ?
> 
> Konstantin said he put ZPQ at that layer to 1) avoid code duplication between
> client/server; and 2) allow compression to happen before SSL, to allow both
> (if the admin decides it's okay).  But I don't see why compression can't
> happen before sending to SSL, or after reading from it?



Re: libpq compression (part 2)

From:
Daniil Zakhlystov
Date:
Ok, thanks


> On 3 Mar 2022, at 02:33, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> If there's no objection, I'd like to move this to the next CF for consideration
> in PG16.
>
> On Mon, Jan 17, 2022 at 10:39:19PM -0600, Justin Pryzby wrote:
>> On Tue, Jan 18, 2022 at 02:06:32AM +0500, Daniil Zakhlystov wrote:
>>>> => Since March, errmsg doesn't need extra parentheses around it (e3a87b4).
>>
>>> I’ve resolved the stuck tests and added zlib support for CI Windows builds to patch 0003-*.  Thanks
>>> for the suggestion, all tests seem to be OK now, except the macOS one.  It won't schedule in Cirrus
>>> CI for some reason, but I guess it happens because of my GitHub account limitation.
>>
>> I don't know about your github account, but it works for cfbot, which is now
>> green.
>>
>> Thanks for implementing zlib for windows.  Did you try this with default
>> compressions set to lz4 and zstd ?
>>
>> The thread from 2019 is very long, and starts off with the guidance that
>> compression had been implemented at the wrong layer.  It looks like this hasn't
>> changed since then.  secure_read/write are passed as function pointers to the
>> ZPQ interface, which then calls back to them to read and flush its compression
>> buffers.  As I understand, the suggestion was to leave the socket reads and
>> writes alone.  And then conditionally de/compress buffers after reading /
>> before writing from the socket if compression was negotiated.
>>
>> It's currently done like this
>> pq_recvbuf() => secure_read() - when compression is disabled
>> pq_recvbuf() => ZPQ => secure_read() - when compression is enabled
>>
>> Dmitri sent a partial, POC patch which changes the de/compression to happen in
>> secure_read/write, which is changed to call ZPQ:
>> https://www.postgresql.org/message-id/CA+q6zcUPrssNaRS+FyoBsD-F2stK1Roa-4sAhFOfAjOWLziM4g@mail.gmail.com
>> pq_recvbuf() => secure_read() => ZPQ
>>
>> The same thing is true of the frontend: function pointers to
>> pqsecure_read/write are being passed to zpq_create, and then the ZPQ interface
>> called instead of the original functions.  Those are the functions which read
>> using SSL, so they should also handle compression.
>>
>> That's where SSL is handled, and it seems like the right place to handle
>> compression.  Have you evaluated that way to do things ?
>>
>> Konstantin said he put ZPQ at that layer to 1) avoid code duplication between
>> client/server; and 2) allow compression to happen before SSL, to allow both
>> (if the admin decides it's okay).  But I don't see why compression can't
>> happen before sending to SSL, or after reading from it?




Re: libpq compression (part 2)

From:
Jacob Champion
Date:
This entry has been waiting on author input for a while (our current
threshold is roughly two weeks), so I've marked it Returned with
Feedback.

Once you think the patchset is ready for review again, you (or any
interested party) can resurrect the patch entry by visiting

    https://commitfest.postgresql.org/38/3499/

and changing the status to "Needs Review", and then changing the
status again to "Move to next CF". (Don't forget the second step;
hopefully we will have streamlined this in the near future!)

Thanks,
--Jacob



Re: libpq compression (part 2)

From:
Andrey Borodin
Date:
On Tue, Aug 2, 2022 at 1:53 PM Jacob Champion <jchampion@timescale.com> wrote:
>
> and changing the status to "Needs Review"

I've tried the patch, it works as advertised. The code generally is
OK, maybe some functions require comments (because at least their
neighbours have some).
Some linkers complain about zs_is_valid_impl_id() being inline while
used in other modules.

Some architectural notes:
1. Currently, when the user requests compression from libpq but the server
does not support any of the codecs the client has, the connection is rejected
by the client. I think this should be configurable akin to SSL: on, try, off.
2. On the zpq_stream level of abstraction we parse a stream of bytes
to look for individual message headers. I think this is OK, just a
little bit awkward.
3. CompressionAck message can be completely replaced by ParameterStatus message.
4. Instead of sending a separate SetCompressionMethod, the CompressedData
message could carry its own header with the index of the used method and a
flag indicating the necessity to restart the compression context (see the
sketch after this list).
5. Portions of pg_stat_network_traffic can be extracted to a separate patch
step to ease the review. And, actually, the scope of this view is slightly
beyond compression anyway.

What do you think?

Also, compression is a very cool and awaited feature, hope to see it
committed one day, thank you for working on this!


Best regards, Andrey Borodin.



Re: libpq compression (part 2)

From:
Andrey Borodin
Date:
On Sat, Nov 12, 2022 at 1:47 PM Andrey Borodin <amborodin86@gmail.com> wrote:
>
> I've tried the patch, it works as advertised.

While testing the patch some more, I observed unpleasant segfaults:

#26 0x00007fecafa1e058 in __memcpy_ssse3_back () from target:/lib64/libc.so.6
#27 0x000000000b08fda2 in lz4_decompress (d_stream=0x18cf82a0,
src=0x7feae4fa505d, src_size=92,
    src_processed=0x7ffff9f4fdf8, dst=0x18b01f80, dst_size=8192,
dst_processed=0x7ffff9f4fe60)
#28 0x000000000b090624 in zs_read (zs=0x18cdfbf0, src=0x7feae4fa505d,
src_size=92, src_processed=0x7ffff9f4fdf8,
    dst=0x18b01f80, dst_size=8192, dst_processed=0x7ffff9f4fe60)
#29 0x000000000b08eb8f in zpq_read_compressed_message
(zpq=0x7feae4fa5010, dst=0x18b01f80 "Q", dst_len=8192,
    dst_processed=0x7ffff9f4fe60)
#30 0x000000000b08f1a9 in zpq_read (zpq=0x7feae4fa5010,
dst=0x18b01f80, dst_size=8192, noblock=false)

(gdb) select-frame 27
(gdb) info locals
ds = 0x18cf82a0
decPtr = 0x18cf8aec ""
decBytes = -87

This is a buffer overrun caused by decompression. I think the receive
buffer must be twice as big as the send buffer to accommodate such
messages.
Also, this portion of lz4_decompress():
    Assert(decBytes > 0);
must actually be a real check with elog(ERROR), because clients can
intentionally compose CompressedData to blow up a server.
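
Something along these lines instead of the Assert (a sketch):

    if (decBytes <= 0)
        elog(ERROR, "failed to decompress lz4 data: invalid CompressedData payload");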

Best regards, Andrey Borodin.



Re: libpq compression (part 2)

From:
Andrey Borodin
Date:
On Sat, Nov 12, 2022 at 8:04 PM Andrey Borodin <amborodin86@gmail.com> wrote:
>
> While testing patch some more I observe unpleasant segfaults:
>
> #27 0x000000000b08fda2 in lz4_decompress (d_stream=0x18cf82a0,
> src=0x7feae4fa505d, src_size=92,
> (gdb) select-frame 27
> (gdb) info locals
> ds = 0x18cf82a0
> decPtr = 0x18cf8aec ""
> decBytes = -87
>

I've debugged it, and the problem turned out to be wrong message limits
for LZ4 compression:
+#define MESSAGE_MAX_BYTES 819200
instead of
+#define MESSAGE_MAX_BYTES 8192

Other codecs can utilize continuation of the decompression stream using
ZS_DATA_PENDING, while LZ4 cannot do this. I was going to produce a quick
fix for all my LZ4 findings, but it occurred to me that the patchset needs
a heavy rebase. If no one shows up to fix it, I'll do that on one of the
coming weekends. Either way, here's a reproducer for the coredump:
psql 'compression=lz4' -f boom.sql

Thanks!

Best regards, Andrey Borodin.

Attachments

Re: libpq compression (part 2)

From:
Justin Pryzby
Date:
On Mon, Nov 14, 2022 at 07:44:24PM -0800, Andrey Borodin wrote:
> patchset needs a heavy rebase. If no one shows up to fix it I'll do

Despite what its git timestamp says, this is based on the most recent
patch from January, which I've had floating around since then.  It
needed to be rebased over at least:

  - guc_tables patch;
  - build and test with meson;
  - doc/

Some of my changes are separate so you can see what I've done.
check_libpq_compression() is in the wrong place, but I couldn't
immediately see where else to put it, since src/common can't include the
backend's guc headers.

Some of the makefile changes seem unnecessary (now?), and my meson
changes don't seem quite right, either.

There's no reason for Zstd to be a separate patch anymore.

It should be updated to parse compression level and options using the
infrastructure introduced for basebackup.

And address the architectural issue from 2 years ago:
https://www.postgresql.org/message-id/20220118043919.GA23027%40telsasoft.com

The global variable PqStream should be moved into some libpq structure
(Port?) and handled within secure_read().  And pqsecure_read shouldn't
be passed as a function pointer/callback.  And verify it passes tests
with all supported compression algorithms and connects to old servers.

-- 
Justin

Attachments

Re: libpq compression (part 2)

From:
Andrey Borodin
Date:
On Tue, Nov 15, 2022 at 7:17 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
>

Also, I've found one more TODO item for the patch. Currently, in
fe-connect.c the patch does a buffer reset:
conn->inStart = conn->inCursor = conn->inEnd = 0;
This effectively consumes bytes up to the current cursor, yet packet
inspection continues. The patch works only because in most cases the
following code will issue a re-read of the message. Coincidentally.
/* Get the type of request. */
if (pqGetInt((int *) &areq, 4, conn))
{
/* We'll come back when there are more data */
return PGRES_POLLING_READING;
}

But I think we need a proper
goto keep_going;
to start from the beginning of the message.


Thank you!

Best regards, Andrey Borodin.



Re: libpq compression (part 2)

From:
Peter Eisentraut
Date:
On 17.11.22 01:27, Andrey Borodin wrote:
> Also, I've found one more TODO item for the patch. Currently, in
> fe-connect.c the patch does a buffer reset:
> conn->inStart = conn->inCursor = conn->inEnd = 0;
> This effectively consumes bytes up to the current cursor, yet packet
> inspection continues. The patch works only because in most cases the
> following code will issue a re-read of the message. Coincidentally.
> /* Get the type of request. */
> if (pqGetInt((int *) &areq, 4, conn))
> {
> /* We'll come back when there are more data */
> return PGRES_POLLING_READING;
> }

Note that the above code was just changed in dce92e59b1.  I don't know 
how that affects this patch set.




Re: libpq compression (part 2)

From:
Andrey Borodin
Date:
On Thu, Nov 17, 2022 at 7:09 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> Note that the above code was just changed in dce92e59b1.
Thanks!

> I don't know
> how that affects this patch set.
With dce92e59b1 it would be much easier to find a bug in the compression patch.

Some more notes about the patch. (sorry for posting review notes in so
many different messages)

1. zs_is_valid_impl_id(unsigned int id)
{
return id >= 0 && id < lengthof(zs_algorithms);
}

id is unsigned, so there is no need to check that it's non-negative (see the
sketch after this list).

2. This literal
{no_compression_name}
should be replaced by explicit form
{no_compression_name, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL}

3. Comments like "Return true if should, false if should not." are useless.
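
For point 1, the check would then reduce to (a sketch):

    static bool
    zs_is_valid_impl_id(unsigned int id)
    {
        return id < lengthof(zs_algorithms);
    }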

Thanks!

Best regards, Andrey Borodin.



Re: libpq compression (part 2)

From:
Peter Eisentraut
Date:
On 18.11.22 02:07, Andrey Borodin wrote:
> 2. This literal
> {no_compression_name}
> should be replaced by explicit form
> {no_compression_name, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL}

That doesn't seem better.



Re: libpq compression (part 2)

From:
"Jonah H. Harris"
Date:
Pinging to see if anyone has continued to work on this behind-the-scenes or whether this is the latest patch set there is.

--
Jonah H. Harris

Re: libpq compression (part 2)

From:
"Andrey M. Borodin"
Date:

> On 10 Aug 2023, at 19:47, Jonah H. Harris <jonah.harris@gmail.com> wrote:
>
> Pinging to see if anyone has continued to work on this behind-the-scenes or whether this is the latest patch set
> there is.

It's still on my TODO list, but I haven't done many review cycles yet. And the patch series already needs a heavy rebase.

Thanks for your interest in the topic.


Best regards, Andrey Borodin.