Thread: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data

BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data

From: PG Bug reporting form
The following bug has been logged on the website:

Bug reference:      16476
Logged by:          Frank Gagnepain
Email address:      frank.gagnepain@intm.fr
PostgreSQL version: 10.13
Operating system:   Debian 10
Description:

Hello to the support team,

I already sent a bug report for this issue, but the PostgreSQL version was
9.4.21, which isn't supported anymore.
So we tested this bug with PostgreSQL 10.13 this time and got
the exact same issue.

I get "ERROR:  Wrong key or corrupt data" when calling
pgp_sym_encrypt_bytea and then pgp_sym_decrypt_bytea on some (not all) bytea
data in the database with these options:
compress-algo=1 (ZIP algo)
cipher-algo=aes256
compress-level=6 (which is the default compress-level)
With any other value of compress-level (0,1,2,3,4,5,7,8,9) for
pgp_sym_encrypt_bytea, I get no error from pgp_sym_decrypt_bytea.


Here is what I do to reproduce this error:

create or replace function bytea_import(p_path text, p_result out bytea)
language plpgsql as $$
declare
    l_oid oid;
    r record;
begin
    p_result := '';
    select lo_import(p_path) into l_oid;
    for r in ( select data
               from pg_largeobject
               where loid = l_oid
               order by pageno ) loop
        p_result = p_result || r.data;
    end loop;
    perform lo_unlink(l_oid);
end;$$;


select pgp_sym_decrypt_bytea(
           pgp_sym_encrypt_bytea(bytea_import(DATA), 'password',
               'compress-algo=1, cipher-algo=aes256, compress-level=6'),
           'password', 'compress-algo=1, cipher-algo=aes256');


ERROR:  Wrong key or corrupt data


Unfortunately I can't post any example of DATA, since those are supposed to be
sensitive data.
Nevertheless, does this kind of error ring a bell for anyone?


Re: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data

From: Jeff Janes
 
> select
> pgp_sym_decrypt_bytea(pgp_sym_encrypt_bytea(bytea_import(DATA),'password','compress-algo=1,
> cipher-algo=aes256, compress-level=6'),'password','compress-algo=1,
> cipher-algo=aes256');

Decryption reads the settings from the encrypted message header, so there is no need to specify them again.

I can reproduce this at any compression level if the data is random (not compressible) and exactly 16365 bytes long.  If the data is compressible, then you need a longer length of message to reproduce it and it depends on the random content and compression level.

I'm attaching the reproducer as a Perl script.  I have not investigated the C code of pgcrypto itself.

Cheers,

Jeff
Attachments

RE: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data

From: Frank Gagnepain
Hello again,

Thank you for this script.

We managed to get you an example of data that triggers the error message (with compress-level=6); it is attached.
You will have to unzip it first and then test it (I mean it hasn't been zipped by pgcrypto).

Cheers,

Frank GAGNEPAIN





Attachments

Re: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data

From: Kyotaro Horiguchi
At Wed, 3 Jun 2020 08:35:02 -0400, Jeff Janes <jeff.janes@gmail.com> wrote in 
> I can reproduce this at any compression level if the data is random (not
> compressible) and exactly 16365 bytes long.  If the data is compressible,
> then you need a longer length of message to reproduce it and it depends on
> the random content and compression level.
> 
> I'm attaching the reproducer as a Perl script.  I have not investigated the
> C code of pgcrypto itself.

Thanks for the reproducer.

A compressed stream must end with a normal packet. If the stream ends
with a complete stream packet, the deflator adds a zero-length normal
packet at the end.  decompress_read forgets to read such a terminating
packet when EOF comes exactly at the end of the last stream packet. An
extra call to pullf_read at EOF correctly consumes that extra packet.
The extra call does no harm if the stream ends with a partial normal
packet.

With the attached patch, the reproducer no longer fails.
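
For readers following along, the framing rule described above can be sketched outside pgcrypto. This is a minimal illustration in Python using the standard zlib module, not pgcrypto's code: the names frame and packets_left_unread are hypothetical, and the chunk size is a parameter rather than a value taken from pgcrypto. It shows that a reader which stops as soon as zlib reports end-of-stream leaves the empty terminating packet unread when the compressed data exactly fills the last stream packet.

```python
import zlib


def frame(payload: bytes, chunk: int):
    """Split payload into full 'stream' packets plus one final 'normal' packet.

    The final normal packet is empty when len(payload) is an exact
    multiple of chunk, which is the case discussed in this thread.
    """
    packets = []
    pos = 0
    while len(payload) - pos >= chunk:
        packets.append(("stream", payload[pos:pos + chunk]))
        pos += chunk
    packets.append(("normal", payload[pos:]))
    return packets


def packets_left_unread(compressed: bytes, chunk: int) -> int:
    """Read packets only until zlib reports end-of-stream; count the rest."""
    packets = frame(compressed, chunk)
    inflater = zlib.decompressobj()
    consumed = 0
    for _kind, data in packets:
        if inflater.eof:
            break  # a naive reader stops here and never sees later packets
        inflater.decompress(data)
        consumed += 1
    return len(packets) - consumed


if __name__ == "__main__":
    compressed = zlib.compress(b"example payload " * 100)
    # Chunk size equal to the compressed length: one full stream packet
    # followed by an empty normal packet that the naive reader skips.
    print(packets_left_unread(compressed, len(compressed)))
    # One byte larger: everything fits in a single normal packet.
    print(packets_left_unread(compressed, len(compressed) + 1))
```

Run as a script, this prints 1 and then 0: one packet is left behind only when the compressed length is an exact multiple of the chunk size.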

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From f4a8997ec08c0aaec8326b0f0dda9f3a001d5865 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Thu, 11 Jun 2020 20:29:23 +0900
Subject: [PATCH] Make sure to consume stream-terminating packet

When compressed stream is ended with a full stream packet, it must be
followed by a terminating normal packet with zero-length.  Make sure
consume such terminating packet at the end of stream.
---
 contrib/pgcrypto/pgp-compress.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/contrib/pgcrypto/pgp-compress.c b/contrib/pgcrypto/pgp-compress.c
index 0505bdee92..69a8d6c577 100644
--- a/contrib/pgcrypto/pgp-compress.c
+++ b/contrib/pgcrypto/pgp-compress.c
@@ -286,7 +286,20 @@ restart:
 
     dec->buf_data = dec->buf_len - dec->stream.avail_out;
     if (res == Z_STREAM_END)
+    {
+        uint8 *tmp;
+
+        /*
+         * If source stream ends with a full stream packet, it is followed by
+         * an extra normal zero-length packet, which should be consumed before
+         * reading further. If we have already seen a terminating packet,
+         * nothing happen by this call.
+         */
+        res = pullf_read(src, 1, &tmp);
+        Assert(res == 0);
+        
         dec->eof = 1;
+    }
     goto restart;
 }
 
-- 
2.18.2


Re: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data

From: Kyotaro Horiguchi
> The reproducer becomes not to fail with the attached patch.

I put an assertion in the patch, but that is not appropriate. It should be an ereport instead. I'll fix that later.

regards.
-- 
Kyotaro Horiguchi

Re: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data

From: Kyotaro Horiguchi
At Thu, 11 Jun 2020 22:17:26 +0900, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> >
> > The reproducer becomes not to fail with the attached patch.
> 
> 
> I put an assertion in the patch, but that is not appropriate. It should be
> an ereport instead. I'll fix that later.

Fixed.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 1f5003c164cf529a79d1f56e4c43d5867c3a345e Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Thu, 11 Jun 2020 20:29:23 +0900
Subject: [PATCH v2] Make sure to consume stream-terminating packet

When a compressed stream ends with a full packet, it must be
terminated by a normal empty packet.  Make sure to consume such
packets.
---
 contrib/pgcrypto/pgp-compress.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/contrib/pgcrypto/pgp-compress.c b/contrib/pgcrypto/pgp-compress.c
index 0505bdee92..296afb3324 100644
--- a/contrib/pgcrypto/pgp-compress.c
+++ b/contrib/pgcrypto/pgp-compress.c
@@ -286,7 +286,29 @@ restart:
 
     dec->buf_data = dec->buf_len - dec->stream.avail_out;
     if (res == Z_STREAM_END)
+    {
+        uint8 *tmp;
+
+        /*
+         * A stream must be terminated by a normal packet. If the last stream
+         * packet in the source stream is a full packet, a normal empty packet
+         * must follow. Since the underlying packet reader doesn't know that
+         * the compressed stream has been ended, we need to to consume the
+         * terminating packet here. This read doesn't harm even if the stream
+         * has already ended.
+         */
+        res = pullf_read(src, 1, &tmp);
+
+        if (res < 0)
+            return res;
+        else if (res > 0)
+        {
+            px_debug("decompress_read: extra bytes after end of stream");
+            return PXE_PGP_CORRUPT_DATA;
+        }
+        
         dec->eof = 1;
+    }
     goto restart;
 }
 
-- 
2.18.2


Re: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data

From: Tom Lane
Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
>> I put an assertion in the patch, but that is not appropriate. It should be
>> an ereport instead. I'll fix that later.
> Fixed.

Hm, should we add a test case exercising this code?

            regards, tom lane



Re: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data

From: Kyotaro Horiguchi
At Thu, 11 Jun 2020 21:57:25 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in
> Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> >> I put an assertion in the patch, but that is not appropriate. It should be
> >> an ereport instead. I'll fix that later.
> > Fixed.
>
> Hm, should we add a test case exercising this code?

Agreed.  As far as I found, there is another point where a stream ends
with an empty packet, at 16318 bytes, but that stream is not a
compressed data stream.

The attached patch generates the source bytes using random(). To keep
the result stable, it calls setseed(0) just before.  One concern with
the test is that there is no means of detecting whether it has gone
stale.  Concretely, if the data length that triggers the issue somehow
moved, the check would silently stop being effective.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
From bfbe165af79dbe90e6152f738921311a483ff663 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Thu, 11 Jun 2020 20:29:23 +0900
Subject: [PATCH v3] Make sure to consume stream-terminating packet

A compressed stream may end with an empty packet.  In this case
decompression finishes before reading the empty packet and the
remaining stream packet causes a failure in reading the following
data. Make sure to consume such packets after compression finishes.
---
 contrib/pgcrypto/expected/pgp-compression.out | 21 ++++++++++++++++++
 contrib/pgcrypto/pgp-compress.c               | 22 +++++++++++++++++++
 contrib/pgcrypto/sql/pgp-compression.sql      | 12 ++++++++++
 3 files changed, 55 insertions(+)

diff --git a/contrib/pgcrypto/expected/pgp-compression.out b/contrib/pgcrypto/expected/pgp-compression.out
index 32b350b8fe..84d0956070 100644
--- a/contrib/pgcrypto/expected/pgp-compression.out
+++ b/contrib/pgcrypto/expected/pgp-compression.out
@@ -48,3 +48,24 @@ select pgp_sym_decrypt(
  Secret message
 (1 row)
 
+-- check if trailing empty-packet of compressed stream is correctly read
+select setseed(0);
+ setseed 
+---------
+ 
+(1 row)
+
+select bytes =
+        pgp_sym_decrypt_bytea(
+            pgp_sym_encrypt_bytea(bytes, E'\\x12345678',
+                                    'compress-algo=1,compress-level=1'),
+            E'\\x12345678') from
+        (select -- generate incompressible source data with 16385 bytes 
+         string_agg(decode(lpad(to_hex((random()*256)::int),2,'0'), 'hex'), '')
+         as bytes
+        from generate_series(0, 16365)) t;
+ ?column? 
+----------
+ t
+(1 row)
+
diff --git a/contrib/pgcrypto/pgp-compress.c b/contrib/pgcrypto/pgp-compress.c
index 0505bdee92..296afb3324 100644
--- a/contrib/pgcrypto/pgp-compress.c
+++ b/contrib/pgcrypto/pgp-compress.c
@@ -286,7 +286,29 @@ restart:
 
     dec->buf_data = dec->buf_len - dec->stream.avail_out;
     if (res == Z_STREAM_END)
+    {
+        uint8 *tmp;
+
+        /*
+         * A stream must be terminated by a normal packet. If the last stream
+         * packet in the source stream is a full packet, a normal empty packet
+         * must follow. Since the underlying packet reader doesn't know that
+         * the compressed stream has been ended, we need to to consume the
+         * terminating packet here. This read doesn't harm even if the stream
+         * has already ended.
+         */
+        res = pullf_read(src, 1, &tmp);
+
+        if (res < 0)
+            return res;
+        else if (res > 0)
+        {
+            px_debug("decompress_read: extra bytes after end of stream");
+            return PXE_PGP_CORRUPT_DATA;
+        }
+        
         dec->eof = 1;
+    }
     goto restart;
 }
 
diff --git a/contrib/pgcrypto/sql/pgp-compression.sql b/contrib/pgcrypto/sql/pgp-compression.sql
index ca9ee1fc00..7eb65048dd 100644
--- a/contrib/pgcrypto/sql/pgp-compression.sql
+++ b/contrib/pgcrypto/sql/pgp-compression.sql
@@ -28,3 +28,15 @@ select pgp_sym_decrypt(
     pgp_sym_encrypt('Secret message', 'key',
             'compress-algo=2, compress-level=0'),
     'key', 'expect-compress-algo=0');
+
+-- check if trailing empty-packet of compressed stream is correctly read
+select setseed(0);
+select bytes =
+        pgp_sym_decrypt_bytea(
+            pgp_sym_encrypt_bytea(bytes, E'\\x12345678',
+                                    'compress-algo=1,compress-level=1'),
+            E'\\x12345678') from
+        (select -- generate incompressible source data with 16385 bytes 
+         string_agg(decode(lpad(to_hex((random()*256)::int),2,'0'), 'hex'), '')
+         as bytes
+        from generate_series(0, 16365)) t;
-- 
2.18.2


Re: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data

From: Michael Paquier
On Fri, Jun 12, 2020 at 02:54:12PM +0900, Kyotaro Horiguchi wrote:
> Agreed.  As far as I found, there is another point where a stream ends
> with an empty-packet at 16318 bytes, but the stream is not a compressed
> data stream.
>
> In the attached, generates a source bytes using random(). To make sure
> to stabilize the result, it does setseed(0) just before.  One concern
> of the test is there's not means to find if the test gets stale.
> Concretely if somehow the data length to cause the issue moved, the
> check would silently get no longer effective.

FYI, I have begun looking at this report, and I am reviewing the
proposed patch.
--
Michael

Attachments

Re: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data

From: Michael Paquier
On Wed, Jun 24, 2020 at 04:59:21PM +0900, Michael Paquier wrote:
> FYI, I have begun looking at this report, and I am reviewing the
> proposed patch.

Okay.  First I found the test case you added a bit hard to parse, so I
have refactored it with a CTE in charge of building the random string,
with the main query doing the compression/decompression and a check
making sure that the original string and the result match.  I quite
liked the logic with lpad() to append zeros if the computation with
random()*256 returned a result less than 16, as well as the use of
string_agg() for that purpose to build the string.  I have also
switched the second arguments of the functions to just use 'key', for
readability.  Using a random string sounds good to me here.  It is
always possible that we end up with something less random, causing
it to become compressible, but I'll believe in the rule of chaos
here.

Then, the fix you are proposing is to simply make sure that all the
input from the source stream is properly consumed even after the zlib
stream has ended in this corner case thanks to pullf_read(), and that
sounds good to me.  However, I had an idea slightly different than
yours, consisting of simply reading the contents of the source before
checking if there is any available in the decompressed buffer (the
check on buf_data before the goto restart step).  That makes the fix a
bit simpler, without changing the logic.

I am also attaching an extra script I used to validate this stuff
based on the regression test of the patch, that has allowed me to
check the logic for random strings up to 33kB, like that for example
(this one took 8 mins on my laptop):
select count(test_crypto(len)) from generate_series(0, 33000) as len;

The inputs and outputs perfectly matched in all my tests, with the
16kB string being the only one failing in the range I have tested on
HEAD, test passing with the patch of course.  One or more extra pairs
of eyes is welcome, so please feel free to look at the version
attached.

Thanks,
--
Michael

Attachments

Re: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data

From: Kyotaro Horiguchi
At Wed, 22 Jul 2020 15:51:12 +0900, Michael Paquier <michael@paquier.xyz> wrote in 
> On Tue, Jul 07, 2020 at 03:07:22PM +0900, Michael Paquier wrote:
> > Horiguchi-san, as you looked at this thread, would you like to look at
> > what I sent previously?  There is still time until the next minor
> > release, but this thread has been idle for close to two weeks now.
> 
> Hearing nothing, I have studied more this stuff today and applied the

Sorry, I overlooked it. It's a bit too late, but it looks good to me.
Thanks for looking at this and committing it.

> patch down to 9.5.  Based on the first reports from the buildfarm,
> things are getting interesting with steamerduck complaining that the
> input and the output do not match in the test:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=steamerduck&dt=2020-07-22%2006%3A01%3A24
> 
> The package of libz used there seems to be 1.2.11, so that's up to
> date:
> https://software.opensuse.org/package/libz1
> 
> I am not exactly sure what this environment does differently.  There
> are other machines using s390x, so I'll wait a bit more before doing
> something.

Might it help if the test printed the original and the decrypted bytes?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data

From: Tom Lane
Michael Paquier <michael@paquier.xyz> writes:
> On Wed, Jul 22, 2020 at 05:45:54PM +0900, Kyotaro Horiguchi wrote:
>> Might be help if the test prints the original and the decrypted bytes?

> And turning the buildfarm completely red?  No, thanks.  Those random
> strings are also too long to be analyzed easily.

Is there a reason why the test string has to be non-constant?
The current form of the test would be fine if things were working,
but it offers absolutely zero chance of diagnosing problems.
For starters, it'd be nice to know whether the encrypt or the
decrypt side is going wrong.

            regards, tom lane



Re: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data

From: Tom Lane
I've spent quite some time digging around in pgcrypto, and I can't
find anything that looks clearly machine-dependent in the code paths
involved here.  The only suspicious thing I've found is that (at least
for most of the PullFilters) pullf_read will not initialize the
output data pointer if no bytes are returned.  So on the last call
where we expect to get an EOF result from the subsidiary filter,
this:

        uint8       *tmp;

        res = pullf_read(src, 8192, &tmp);
        if (res < 0)
            return res;
        dec->stream.next_in = tmp;
        dec->stream.avail_in = res;

is setting next_in to garbage.  However, if we've already set
eof to 1, which we have, then we won't call inflate() again
so the garbage pointer should not matter.  (Besides which,
zlib really shouldn't dereference that pointer if avail_in is 0.)
I'm still baffled as to why only the SLES s390x animals are failing,
but it's beginning to seem like it might be due to them using a
different zlib version.

Having said that, though, I do not like the committed patch one bit.
It's got two big problems:

1. Once EOF has been detected, we'll still call the subsidiary filter
again on each subsequent call to decompress_read, if it takes several
calls to empty out the final load of decompressed data.  This is only
safe if all the pgcrypto filter types are okay with being called again
after they report EOF.  I'm not sure that is true --- mdc_read looks
like a counterexample --- and even if it is true, it seems inefficient.

2. There is no check to make sure that we got an EOF indication from
the subsidiary filter.  If it returned some bytes, those will just be
absorbed into the decompression input buffer, but we'll never try to
decompress them; they're just lost.  This is the inverse of the bug
allegedly being fixed here: instead of reading too few bytes and not
caring, we read too many and don't care.  Either way we lose sync
with the incoming data stream, causing problems at higher filter
levels.

Horiguchi-san's v3 patch at
<20200612.145412.475791851624925277.horikyota.ntt@gmail.com>
doesn't have either of these problems, so it seems much superior
to me than what you actually did.  I don't have a lot of hope that
changing to that one would fix the buildfarm problem --- but maybe it
would, if the machine-dependent behavior is somehow hiding in the
repeat-call-after-EOF code path.
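
The second point above can be illustrated with Python's standard zlib module, which exposes any bytes that arrive after the end of a compressed stream as unused_data. This is only an analogy for the kind of check the v3 patch performs, not pgcrypto code; decompress_strict is a hypothetical helper name.

```python
import zlib


def decompress_strict(blob: bytes) -> bytes:
    """Inflate blob and refuse input that continues past the end of the stream."""
    inflater = zlib.decompressobj()
    out = inflater.decompress(blob)
    if not inflater.eof:
        raise ValueError("compressed stream is truncated")
    if inflater.unused_data:
        # Silently dropping these bytes would desynchronize whatever
        # reads the enclosing data next, so treat them as corruption.
        raise ValueError("extra bytes after end of stream")
    return out


if __name__ == "__main__":
    good = zlib.compress(b"hello")
    print(decompress_strict(good))
    try:
        decompress_strict(good + b"X")
    except ValueError as exc:
        print("rejected:", exc)
```

A reader that ignored unused_data would accept the second input and lose the trailing byte without any indication, which is the "read too many and don't care" case.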

            regards, tom lane



Re: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data

From: Michael Paquier
On Wed, Jul 22, 2020 at 06:38:50PM -0400, Tom Lane wrote:
> is setting next_in to garbage.  However, if we've already set
> eof to 1, which we have, then we won't call inflate() again
> so the garbage pointer should not matter.  (Besides which,
> zlib really shouldn't dereference that pointer if avail_in is 0.)

Yeah, I looked at zlib quite a bit to reach this same conclusion when
studying this problem.

> I'm still baffled as to why only the SLES s390x animals are failing,
> but it's beginning to seem like it might be due to them using a
> different zlib version.

My main suspicion is that they are using hardware-specific calls in a
proprietary fork of libz.  They have a lot of changes in their
release notes that are hardware-specific:
https://www.suse.com/releasenotes/x86_64/SUSE-SLES/15-SP1/

I could not find the RPMs though, and perhaps it is based on zlib-ng
or similar but that's hard to say.

> Horiguchi-san's v3 patch at
> <20200612.145412.475791851624925277.horikyota.ntt@gmail.com>
> doesn't have either of these problems, so it seems much superior
> to me than what you actually did.  I don't have a lot of hope that
> changing to that one would fix the buildfarm problem --- but maybe it
> would, if the machine-dependent behavior is somehow hiding in the
> repeat-call-after-EOF code path.

Perhaps.  It could be possible to give it a try on HEAD, though I
doubt that it would solve the problem :/

For now, more investigation is needed for this SLES behavior though.
So, to put the buildfarm back to green, I am reverting the patch.
--
Michael

Attachments

Re: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data

From: Tom Lane
I dug into this problem with access kindly provided by Mark Wong, and
verified that indeed the zlib on that machine acts a bit differently
from stock zlib.  The problem we're facing turns out to be entirely
unrelated to the patch at hand; it's the *compression* side that is
misbehaving.  After some digging in the code and reading the zlib.h
API spec carefully, the answer is that compress_process() completely
mishandles the situation where deflate() stops short of consuming all
the input that's supplied.  It resets the next_in pointer so that
old data is reprocessed, rather than allowing the remaining unprocessed
data to be processed.  We need to do this:

--- a/contrib/pgcrypto/pgp-compress.c
+++ b/contrib/pgcrypto/pgp-compress.c
@@ -114,10 +114,10 @@ compress_process(PushFilter *next, void *priv, const uint8 *data, int len)
     /*
      * process data
      */
-    while (len > 0)
+    st->stream.next_in = unconstify(uint8 *, data);
+    st->stream.avail_in = len;
+    while (st->stream.avail_in > 0)
     {
-        st->stream.next_in = unconstify(uint8 *, data);
-        st->stream.avail_in = len;
         st->stream.next_out = st->buf;
         st->stream.avail_out = st->buf_len;
         res = deflate(&st->stream, 0);
@@ -131,7 +131,6 @@ compress_process(PushFilter *next, void *priv, const uint8 *data, int len)
             if (res < 0)
                 return res;
         }
-        len = st->stream.avail_in;
     }

     return 0;

I suppose this has been broken since day one; it's a bit astonishing (and
disheartening) that nobody's reported a problem here before.
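
To see why resetting next_in matters, here is a small Python model of the two loops. It is a sketch under stated assumptions: make_short_consumer is a hypothetical stand-in for a deflate() that accepts only a few bytes per call, and feed_buggy and feed_fixed mirror the shape of the loop before and after the diff above rather than reproducing pgcrypto's code.

```python
def make_short_consumer(limit: int):
    """Build a stand-in for deflate() that takes at most `limit` bytes per call.

    Returns (sink, consume): sink collects every byte the consumer accepted.
    `limit` must be positive, otherwise the feed loops never finish.
    """
    sink = bytearray()

    def consume(data: bytes, next_in: int, avail_in: int):
        taken = min(limit, avail_in)
        sink.extend(data[next_in:next_in + taken])
        return next_in + taken, avail_in - taken

    return sink, consume


def feed_buggy(data: bytes, consume) -> None:
    """Old loop shape: next_in is reset to the start of data on every pass."""
    length = len(data)
    while length > 0:
        next_in, avail_in = 0, length
        next_in, avail_in = consume(data, next_in, avail_in)
        length = avail_in


def feed_fixed(data: bytes, consume) -> None:
    """Fixed loop shape: set the input once and let the consumer advance it."""
    next_in, avail_in = 0, len(data)
    while avail_in > 0:
        next_in, avail_in = consume(data, next_in, avail_in)


if __name__ == "__main__":
    sink, consume = make_short_consumer(4)
    feed_buggy(b"abcdefghij", consume)
    print(bytes(sink))  # b'abcdabcdab'

    sink, consume = make_short_consumer(4)
    feed_fixed(b"abcdefghij", consume)
    print(bytes(sink))  # b'abcdefghij'
```

In this model the buggy loop feeds the right number of bytes but the wrong ones. With a consumer that always accepts the whole input in one call, both loops behave identically, which may be why the problem stayed hidden for so long.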

Anyway, with that corrected, the SLES zlib still produces different
output from stock zlib, and indeed seemingly is worse: the result
is noticeably larger than what stock zlib produces.  It does decompress
back to the same thing, though, so this is a performance problem not
a data corruption issue.  That does mean that the proposed test case
fails to exercise our empty-ending-block scenario with this version
of zlib.  I don't think we really care about that, though.

I will go apply this fix, and then you can put back the fix for
the originally-reported problem.  I still like Horiguchi-san's
fix better than what was committed, though.

            regards, tom lane



Re: BUG #16476: pgp_sym_encrypt_bytea with compress-level=6 : Wrong key or corrupt data

From: Michael Paquier
On Thu, Jul 23, 2020 at 03:38:43PM -0400, Tom Lane wrote:
> I dug into this problem with access kindly provided by Mark Wong, and
> verified that indeed the zlib on that machine acts a bit differently
> from stock zlib.  The problem we're facing turns out to be entirely
> unrelated to the patch at hand; it's the *compression* side that is
> misbehaving.  After some digging in the code and reading the zlib.h
> API spec carefully, the answer is that compress_process() completely
> mishandles the situation where deflate() stops short of consuming all
> the input that's supplied.  It resets the next_in pointer so that
> old data is reprocessed, rather than allowing the remaining unprocessed
> data to be processed.

Good catch.  Thanks.

> Anyway, with that corrected, the SLES zlib still produces different
> output from stock zlib, and indeed seemingly is worse: the result
> is noticeably larger than what stock zlib produces.  It does decompress
> back to the same thing, though, so this is a performance problem not
> a data corruption issue.  That does mean that the proposed test case
> fails to exercise our empty-ending-block scenario with this version
> of zlib.  I don't think we really care about that, though.

One simple method to test this code path would be to just decompress a
hardcoded bytea value that got weirdly compressed.  I am not sure
either if that's worth the cycles, and I think that it would bloat the
test files a bit.

> I will go apply this fix, and then you can put back the fix for
> the originally-reported problem.  I still like Horiguchi-san's
> fix better than what was committed, though.

Back in business on this issue.  I have been able to work more
on a SLES15 box thanks to Mark, confirming that b9b6105 got rid of the
compression issue with the test of 9e10898, and that the previous
versions of the patches would now take care of the issue on the
decompression side if compressed data was incorrectly shaped.
--
Michael

Attachments