Thread: PostgreSQL in-transit compression for a client connection

PostgreSQL in-transit compression for a client connection

From
Tushar Takate
Date:
Hi Team,

Does PostgreSQL support in-transit compression for a client connection?

If yes, then please help me with the following:
  1. What are the different methods?
  2. How to enable/use it?


Thanks & Regards,
Tushar K Takate.

Re: PostgreSQL in-transit compression for a client connection

From
Daniel Gustafsson
Date:
> On 27 Apr 2023, at 11:18, Tushar Takate <tushar11.takate@gmail.com> wrote:

> Does PostgreSQL support in-transit compression for a client connection?

No.  Earlier versions supported SSL compression for encrypted connections, but
that rarely worked as it was disabled in the vast majority of OpenSSL
installations.  There have been patches proposed to add compression to libpq, but
nothing has been added as of yet.

--
Daniel Gustafsson




Re: PostgreSQL in-transit compression for a client connection

From
Laurenz Albe
Date:
On Thu, 2023-04-27 at 14:48 +0530, Tushar Takate wrote:
> Does PostgreSQL support in-transit compression for a client connection?

No, not any more.  There used to be compression via SSL, but that was
removed for security reasons, and because most binary distributions of
OpenSSL didn't support it anyway.

Yours,
Laurenz Albe



Re: PostgreSQL in-transit compression for a client connection

From
Dominique Devienne
Date:
On Thu, Apr 27, 2023 at 11:24 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Thu, 2023-04-27 at 14:48 +0530, Tushar Takate wrote:
> > Does PostgreSQL support in-transit compression for a client connection?
>
> No, not any more.

On a related but different subject, as someone who must store ZLIB (from ZIP files)
and sometimes LZ4 compressed `bytea` values, I often find it's a shame that I have
to decompress them, send them over the wire uncompressed, to have the PostgreSQL
backend recompress them when TOAST'ed. That's a waste of CPU and IO bandwidth...

I wish there was a way to tell the backend via libpq and the v3 (or later) protocol:
Here's the XYZ compressed value, with this uncompressed size and checksum
(depending on the format used / expected), and skip the decompression/re-compression
and fatter bandwidth, to store them as-is (in the usual 2K TOAST chunks).

I know this is unlikely to happen, for several reasons. Still, I thought I'd throw it out there.

PS: BTW, in my testing, on-the-wire compression is rarely beneficial IMHO. I tested the
break-even bandwidth point in the (industry-specific) client-server protocol I worked on,
which optionally supports compression, and those bandwidths were quite low. The CPU cost of
ZLIB (~4x compression) or even the faster LZ4 (~2x compression), plus decompression
at the other end, is high enough that you need quite low bandwidth to recoup it on IO. FWIW.
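
The break-even point can be estimated with a simple model. The sketch below is only an illustration: it assumes compression, transfer, and decompression run one after another (no pipelining), with all throughputs given in MB/s of uncompressed data. The function name and the throughput figures in the usage lines are placeholders, not measurements from the protocol mentioned above.

```python
def break_even_bandwidth(ratio: float, compress_mb_s: float, decompress_mb_s: float) -> float:
    """Return the link bandwidth (MB/s) below which compression saves time.

    Model (an assumption, serial stages, per MB of uncompressed data):
      uncompressed send:  1 / B
      compressed send:    1 / compress + 1 / (ratio * B) + 1 / decompress
    Setting both equal and solving for B gives:
      B = (1 - 1 / ratio) / (1 / compress + 1 / decompress)
    """
    if ratio <= 1.0:
        # Nothing is saved on the wire, so compression never pays off.
        return 0.0
    cpu_seconds_per_mb = 1.0 / compress_mb_s + 1.0 / decompress_mb_s
    saved_fraction = 1.0 - 1.0 / ratio
    return saved_fraction / cpu_seconds_per_mb


if __name__ == "__main__":
    # Placeholder throughputs; measure your own codec and hardware.
    for name, ratio, comp, decomp in [("zlib-like", 4.0, 50.0, 300.0),
                                      ("lz4-like", 2.0, 500.0, 2000.0)]:
        b = break_even_bandwidth(ratio, comp, decomp)
        print(f"{name}: compression helps only below ~{b:.1f} MB/s")
```

On links faster than the returned value, the CPU time spent compressing and decompressing exceeds the transfer time saved, under the stated model.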

Re: PostgreSQL in-transit compression for a client connection

From
Laurenz Albe
Date:
On Thu, 2023-04-27 at 11:44 +0200, Dominique Devienne wrote:
> as someone who must store ZLIB (from ZIP files)
> and sometimes LZ4 compressed `bytea` values, I often find it's a shame that I have
> to decompress them, send them over the wire uncompressed, to have the PostgreSQL
> backend recompress them when TOAST'ed. That's a waste of CPU and IO bandwidth...

That's not what you were looking for, but why not store the compressed data
in the database (after SET STORAGE EXTERNAL on the column) and uncompress
them after you have received them on the client side?
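
A minimal sketch of that approach, for illustration only. The table and column names (`docs`, `payload`) and the helper names are hypothetical, and zlib merely stands in for whatever format the values are already compressed with. The SQL is shown as a string and would be run through whichever driver you use.

```python
import zlib

# Hypothetical table and column names, for illustration only.
DDL = """
CREATE TABLE docs (id bigint PRIMARY KEY, payload bytea);
-- EXTERNAL: out-of-line TOAST storage without server-side compression
ALTER TABLE docs ALTER COLUMN payload SET STORAGE EXTERNAL;
"""


def to_stored(raw: bytes) -> bytes:
    """Compress on the client; these bytes are sent and stored as-is."""
    return zlib.compress(raw)


def from_stored(stored: bytes) -> bytes:
    """Decompress on the client after fetching the bytea value."""
    return zlib.decompress(stored)


if __name__ == "__main__":
    xml = b"<doc>" + b"<row>value</row>" * 1000 + b"</doc>"
    stored = to_stored(xml)
    assert from_stored(stored) == xml
    print(len(xml), "bytes raw,", len(stored), "bytes stored and sent over the wire")
```

With this layout the server never sees the uncompressed value, so any server-side processing of the content is not possible without extra work.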

Yours,
Laurenz Albe



Re: PostgreSQL in-transit compression for a client connection

From
Magnus Hagander
Date:
On Thu, Apr 27, 2023 at 11:55 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> On Thu, 2023-04-27 at 11:44 +0200, Dominique Devienne wrote:
> > as someone who must store ZLIB (from ZIP files)
> > and sometimes LZ4 compressed `bytea` values, I often find it's a shame that I have
> > to decompress them, send them over the wire uncompressed, to have the PostgreSQL
> > backend recompress them when TOAST'ed. That's a waste of CPU and IO bandwidth...
>
> That's not what you were looking for, but why not store the compressed data
> in the database (after SET STORAGE EXTERNAL on the column) and uncompress
> them after you have received them on the client side?

That assumes you only have one client. You may want to use the
transparent compression/decompression from some clients and not for
others.

I think it'd be a useful feature to have, but it's not something that
we have today or that I'm aware of being on anybody's radar. So most
likely, for now you're stuck with either what you're doing today, or
as Laurenz suggests handle it completely in the application. You can't
do the mix.

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: PostgreSQL in-transit compression for a client connection

From
Dominique Devienne
Date:
On Fri, Apr 28, 2023 at 9:03 AM Magnus Hagander <magnus@hagander.net> wrote:
> On Thu, Apr 27, 2023 at 11:55 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> >
> > On Thu, 2023-04-27 at 11:44 +0200, Dominique Devienne wrote:
> > > as someone who must store ZLIB (from ZIP files)
> > > and sometimes LZ4 compressed `bytea` values, I often find it's a shame that I have
> > > to decompress them, send them over the wire uncompressed, to have the PostgreSQL
> > > backend recompress them when TOAST'ed. That's a waste of CPU and IO bandwidth...
> >
> > That's not what you were looking for, but why not store the compressed data
> > in the database (after SET STORAGE EXTERNAL on the column) and uncompress
> > them after you have received them on the client side?

Laurenz is right, of course. But then, as Magnus says, I lose transparent decompression
on read, and also for server-side processing. Some of those compressed values are actually
XML, and sometimes it can be useful to process them (usually to extract a subset) server-side.

Unless there are ways to uncompress values explicitly in SQL or PL/pgSQL?
 
> That assumes you only have one client. You may want to use the
> transparent compression/decompression from some clients and not for
> others.

Which brings up something I forgot to mention earlier, when I concentrated on the write side:
clients would also ideally need a way to fetch the values still compressed, when
explicitly requesting it, while others implicitly get the transparent decompression.

BTW, such a mechanism would open the door for libpq doing that itself transparently too, I guess.
That would allow network-transport of the compressed values, and client-side decompression.
Of course, when getting the whole value only, not when getting a subset. And possibly opt-in only.
 
> I think it'd be a useful feature to have, but it's not something that
> we have today or that I'm aware of being on anybody's radar. So most
> likely, for now you're stuck with either what you're doing today, or
> as Laurenz suggests handle it completely in the application. You can't
> do the mix.

It's a tricky feature, because the client and server have to cooperate and agree on the exact
format used for compressed values, including meta-data (uncompressed size, and checksum,
and compression algo+setting and checksum type; e.g. CRC32 vs XXHASH). That's why I
suspect this won't happen anytime soon or ever. It's just a pie-in-the-sky brainstorming exercise.
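
To make the metadata concrete, here is a rough sketch of what such a self-describing compressed value could look like. The header layout, the constants, and the function names are entirely hypothetical; nothing like this exists in libpq or the PostgreSQL wire protocol. It only illustrates what both sides would need to agree on: algorithm, checksum type, uncompressed size, and checksum.

```python
import struct
import zlib

# Hypothetical identifiers; not part of any PostgreSQL or libpq format.
ALGO_ZLIB = 1
CHECKSUM_CRC32 = 1

# Little-endian header: algo (1 byte), checksum type (1 byte),
# uncompressed size (8 bytes), checksum (4 bytes).
HEADER = struct.Struct("<BBQI")


def encode_value(raw: bytes, level: int = 6) -> bytes:
    """Prefix the zlib-compressed payload with the agreed metadata."""
    header = HEADER.pack(ALGO_ZLIB, CHECKSUM_CRC32, len(raw), zlib.crc32(raw))
    return header + zlib.compress(raw, level)


def decode_value(blob: bytes) -> bytes:
    """Validate the metadata, decompress, and verify size and checksum."""
    algo, checksum_type, size, checksum = HEADER.unpack_from(blob)
    if algo != ALGO_ZLIB or checksum_type != CHECKSUM_CRC32:
        raise ValueError("unsupported compression or checksum type")
    raw = zlib.decompress(blob[HEADER.size:])
    if len(raw) != size or zlib.crc32(raw) != checksum:
        raise ValueError("size or checksum mismatch")
    return raw


if __name__ == "__main__":
    value = b"<doc>" + b"<row>value</row>" * 500 + b"</doc>"
    blob = encode_value(value)
    assert decode_value(blob) == value
    print("header bytes:", HEADER.size, "total bytes:", len(blob))
```

A receiver that does not recognise the algorithm or checksum type has to reject the value, which is exactly the kind of negotiation that makes the feature tricky.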