Обсуждение: SSL tests fail on OpenSSL v3.2.0
Hi,
SSL tests fail on OpenSSL v3.2.0. I tested both on macOS (CI) and
debian (my local) and both failed with the same errors. To trigger
these errors on CI, you may need to clear the repository cache;
otherwise macOS won't install the v3.2.0 of the OpenSSL.
001_ssltests:
psql exited with signal 6 (core dumped): 'psql: error: connection to
server at "127.0.0.1", port 56718 failed: server closed the connection
unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
SSL SYSCALL error: Connection reset by peer' while running 'psql -XAtq
-d sslkey=invalid sslcert=invalid sslrootcert=invalid sslcrl=invalid
sslcrldir=invalid user=ssltestuser dbname=trustdb hostaddr=127.0.0.1
host=common-name.pg-ssltest.test sslrootcert=invalid sslmode=require
-f - -w' at /Users/admin/pgsql/src/test/perl/PostgreSQL/Test/Cluster.pm
002_scram:
psql exited with signal 6 (core dumped): 'psql: error: connection to
server at "127.0.0.1", port 54531 failed: server closed the connection
unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
SSL SYSCALL error: Connection reset by peer' while running 'psql -XAtq
-d dbname=trustdb sslmode=require sslcert=invalid sslrootcert=invalid
hostaddr=127.0.0.1 host=localhost user=ssltestuser -f - -w' at
/Users/admin/pgsql/src/test/perl/PostgreSQL/Test/Cluster.pm line 1997.
003_sslinfo:
psql exited with signal 6 (core dumped): 'psql: error: connection to
server at "127.0.0.1", port 59337 failed: server closed the connection
unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
SSL SYSCALL error: Connection reset by peer' while running 'psql -XAtq
-d sslkey=invalid sslcert=invalid sslrootcert=invalid sslcrl=invalid
sslcrldir=invalid sslrootcert=ssl/root+server_ca.crt sslmode=require
dbname=certdb hostaddr=127.0.0.1 host=localhost user=ssltestuser
sslcert=ssl/client_ext.crt
sslkey=/Users/admin/pgsql/build/testrun/ssl/003_sslinfo/data/tmp_test_q11O/client_ext.key
-f - -w' at /Users/admin/pgsql/src/test/perl/PostgreSQL/Test/Cluster.pm
line 1997.
macOS CI run: https://cirrus-ci.com/task/5128008789393408
I couldn't find the cause yet but just wanted to inform you.
-- 
Regards,
Nazir Bilal Yavuz
Microsoft
			
		Nazir, Thanks for opening a thread. Was just about to start one, here what we came up with so far. Homebrew users discovered a regression[0] when using Postgres compiled and linked against OpenSSL version 3.2. $ psql "postgresql://$DB?sslmode=require" psql: error: connection to server at "redacted" (redacted), port 5432 failed: ERROR: Parameter 'user' is missing in startuppacket. double free or corruption (out) Aborted (core dumped) Analyzing the backtrace, OpenSSL was overwriting heap-allocated data in our PGconn struct because it thought BIO::ptr was a struct bss_sock_st *. OpenSSL then called a memset() on a member of that struct, and we zeroed out data in our PGconn struct. BIO_get_data(3) says the following: > These functions are mainly useful when implementing a custom BIO. > > The BIO_set_data() function associates the custom data pointed to by ptr > with the BIO a. This data can subsequently be retrieved via a call to > BIO_get_data(). This can be used by custom BIOs for storing > implementation specific information. If you take a look at my_BIO_s_socket(), we create a partially custom BIO, but for the most part are defaulting to the methods defined by BIO_s_socket(). We need to set application-specific data and not BIO private data, so that the BIO implementation we rely on, can properly assert that its private data is what it expects. The ssl test suite continues to pass with this patch. This patch should be backported to every supported Postgres version most likely. [0]: https://github.com/Homebrew/homebrew-core/issues/155651 -- Tristan Partin Neon (https://neon.tech)
Вложения
Here is a v2 which adds back a comment that was not meant to be removed. -- Tristan Partin Neon (https://neon.tech)
Вложения
On Mon, Nov 27, 2023 at 12:33:49PM -0600, Tristan Partin wrote: > - res = secure_raw_read(((Port *) BIO_get_data(h)), buf, size); > + res = secure_raw_read(((Port *) BIO_get_app_data(h)), buf, size); > BIO_clear_retry_flags(h); > if (res <= 0) Interesting. I have yet to look at that in details, but BIO_get_app_data() exists down to 0.9.8, which is the oldest version we need to support for stable branches. So that looks like a safe bet. > -#ifndef HAVE_BIO_GET_DATA > -#define BIO_get_data(bio) (bio->ptr) > -#define BIO_set_data(bio, data) (bio->ptr = data) > -#endif Shouldn't this patch do a refresh of configure.ac and remove the check on BIO_get_data() if HAVE_BIO_GET_DATA is gone? -- Michael
Вложения
On Mon Nov 27, 2023 at 5:53 PM CST, Michael Paquier wrote: > On Mon, Nov 27, 2023 at 12:33:49PM -0600, Tristan Partin wrote: > > -#ifndef HAVE_BIO_GET_DATA > > -#define BIO_get_data(bio) (bio->ptr) > > -#define BIO_set_data(bio, data) (bio->ptr = data) > > -#endif > > Shouldn't this patch do a refresh of configure.ac and remove the check > on BIO_get_data() if HAVE_BIO_GET_DATA is gone? See the attached v3. I am unfamiliar with autotools, so I just hand edited the configure.ac script instead of whatever "refresh" means. -- Tristan Partin Neon (https://neon.tech)
Вложения
Michael Paquier <michael@paquier.xyz> writes:
> Interesting.  I have yet to look at that in details, but
> BIO_get_app_data() exists down to 0.9.8, which is the oldest version
> we need to support for stable branches.  So that looks like a safe
> bet.
What about LibreSSL?  In general, I'm not too pleased with just assuming
that BIO_get_app_data exists.  If we can do that, we can probably remove
most of the OpenSSL function probes that configure.ac has today.  Even
if that's a good idea in HEAD, I doubt we want to do it all the way back.
I'd be inclined to form the patch more along the lines of
s/BIO_get_data/BIO_get_app_data/g, with a configure check for
BIO_get_app_data and falling back to the existing direct use of
bio->ptr if it's not there.
            regards, tom lane
			
		It was first added in SSLeay 0.8.1 which predates OpenSSL let alone the LibreSSL fork. It probably doesn’t exist in BoringSSL but neither does a lot of things. > On 28 Nov 2023, at 00:21, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Michael Paquier <michael@paquier.xyz> writes: >> Interesting. I have yet to look at that in details, but >> BIO_get_app_data() exists down to 0.9.8, which is the oldest version >> we need to support for stable branches. So that looks like a safe >> bet. > > What about LibreSSL? In general, I'm not too pleased with just assuming > that BIO_get_app_data exists. If we can do that, we can probably remove > most of the OpenSSL function probes that configure.ac has today. Even > if that's a good idea in HEAD, I doubt we want to do it all the way back. > > I'd be inclined to form the patch more along the lines of > s/BIO_get_data/BIO_get_app_data/g, with a configure check for > BIO_get_app_data and falling back to the existing direct use of > bio->ptr if it's not there. > > regards, tom lane
On Mon Nov 27, 2023 at 6:21 PM CST, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
> > Interesting.  I have yet to look at that in details, but
> > BIO_get_app_data() exists down to 0.9.8, which is the oldest version
> > we need to support for stable branches.  So that looks like a safe
> > bet.
>
> What about LibreSSL?  In general, I'm not too pleased with just assuming
> that BIO_get_app_data exists.  If we can do that, we can probably remove
> most of the OpenSSL function probes that configure.ac has today.  Even
> if that's a good idea in HEAD, I doubt we want to do it all the way back.
As Bo said, this has been available since before LibreSSL forked off of
OpenSSL.
> I'd be inclined to form the patch more along the lines of
> s/BIO_get_data/BIO_get_app_data/g, with a configure check for
> BIO_get_app_data and falling back to the existing direct use of
> bio->ptr if it's not there.
Falling back to what existed before is invalid. BIO::ptr is private data
for the BIO implementation. BIO_{get,set}_app_data() does
something completely different than setting BIO::ptr. In Postgres we
call BIO_meth_set_create() with BIO_meth_get_create() from
BIO_s_socket(). The create function we pass allocates bi->ptr to
a struct bss_sock_st * as previously stated, and that's been the case
since March 10, 2022[0]. Essentially Postgres only worked because the
BIO implementation didn't use the private data section until the linked
commit. I don't see any reason to keep compatibility with what only
worked by accident.
[0]: https://github.com/openssl/openssl/commit/a3e53d56831adb60d6875297b3339a4251f735d2
--
Tristan Partin
Neon (https://neon.tech)
			
		"Tristan Partin" <tristan@neon.tech> writes:
> On Mon Nov 27, 2023 at 6:21 PM CST, Tom Lane wrote:
>> What about LibreSSL?  In general, I'm not too pleased with just assuming
>> that BIO_get_app_data exists.
> Falling back to what existed before is invalid.
Well, sure it only worked by accident, but it did work with older
OpenSSL versions.  If we assume that BIO_get_app_data exists, and
somebody tries to use it with a version that hasn't got that,
it won't work.
Having said that, my concern was mainly driven by the comments in
configure.ac claiming that this was an OpenSSL 1.1.0 addition.
Looking at the relevant commits, 593d4e47d and 5c6df67e0, it seems
that that was less about "the function doesn't exist before 1.1.0"
and more about "in 1.1.0 we have to use the function because we
can no longer directly access the ptr field".  If the function
does exist in 0.9.8 then I concur that we don't need to test.
            regards, tom lane
			
		On Mon Nov 27, 2023 at 7:14 PM CST, Tom Lane wrote: > "Tristan Partin" <tristan@neon.tech> writes: > > On Mon Nov 27, 2023 at 6:21 PM CST, Tom Lane wrote: > >> What about LibreSSL? In general, I'm not too pleased with just assuming > >> that BIO_get_app_data exists. > > > Falling back to what existed before is invalid. > > Well, sure it only worked by accident, but it did work with older > OpenSSL versions. If we assume that BIO_get_app_data exists, and > somebody tries to use it with a version that hasn't got that, > it won't work. > > Having said that, my concern was mainly driven by the comments in > configure.ac claiming that this was an OpenSSL 1.1.0 addition. > Looking at the relevant commits, 593d4e47d and 5c6df67e0, it seems > that that was less about "the function doesn't exist before 1.1.0" > and more about "in 1.1.0 we have to use the function because we > can no longer directly access the ptr field". If the function > does exist in 0.9.8 then I concur that we don't need to test. I have gone back all the way to 1.0.0 and confirmed that the function exists. Didn't choose to go further than that since Postgres doesn't support it. -- Tristan Partin Neon (https://neon.tech)
"Tristan Partin" <tristan@neon.tech> writes:
> On Mon Nov 27, 2023 at 7:14 PM CST, Tom Lane wrote:
>> ... If the function
>> does exist in 0.9.8 then I concur that we don't need to test.
> I have gone back all the way to 1.0.0 and confirmed that the function 
> exists. Didn't choose to go further than that since Postgres doesn't 
> support it.
Since this is something we'd need to back-patch, OpenSSL 0.9.8
and later are relevant: the v12 branch still supports those.
It's moot given Bo's claim about the origin of the function,
though.
            regards, tom lane
			
		I can confirm that we also fail when using up-to-date MacPorts, which
seems to have started shipping 3.2.0 last week or so.  I tried the v3
patch, and while that stops the crash, it looks like 3.2.0 has also
made some random changes in error messages:
# +++ tap check in src/test/ssl +++
t/001_ssltests.pl .. 163/?
#   Failed test 'certificate authorization fails with revoked client cert: matches'
#   at t/001_ssltests.pl line 775.
#                   'psql: error: connection to server at "127.0.0.1", port 58332 failed: SSL error: ssl/tls alert
certificaterevoked' 
#     doesn't match '(?^:SSL error: sslv3 alert certificate revoked)'
#   Failed test 'certificate authorization fails with revoked client cert with server-side CRL directory: matches'
#   at t/001_ssltests.pl line 880.
#                   'psql: error: connection to server at "127.0.0.1", port 58332 failed: SSL error: ssl/tls alert
certificaterevoked' 
#     doesn't match '(?^:SSL error: sslv3 alert certificate revoked)'
#   Failed test 'certificate authorization fails with revoked UTF-8 client cert with server-side CRL directory:
matches'
#   at t/001_ssltests.pl line 893.
#                   'psql: error: connection to server at "127.0.0.1", port 58332 failed: SSL error: ssl/tls alert
certificaterevoked' 
#     doesn't match '(?^:SSL error: sslv3 alert certificate revoked)'
# Looks like you failed 3 tests of 205.
t/001_ssltests.pl .. Dubious, test returned 3 (wstat 768, 0x300)
Failed 3/205 subtests
t/002_scram.pl ..... ok
t/003_sslinfo.pl ... ok
Guess we'll need to adjust the test script a bit too.
            regards, tom lane
			
		On Mon, Nov 27, 2023 at 08:32:28PM -0500, Tom Lane wrote: > Since this is something we'd need to back-patch, OpenSSL 0.9.8 > and later are relevant: the v12 branch still supports those. > It's moot given Bo's claim about the origin of the function, > though. Yep, unfortunately this needs to be checked down to 0.9.8. I've just done this exercise yesterday for another backpatch.. -- Michael
Вложения
On Mon, Nov 27, 2023 at 09:04:23PM -0500, Tom Lane wrote: > I can confirm that we also fail when using up-to-date MacPorts, which > seems to have started shipping 3.2.0 last week or so. I tried the v3 > patch, and while that stops the crash, it looks like 3.2.0 has also > made some random changes in error messages: > > Failed 3/205 subtests > t/002_scram.pl ..... ok > t/003_sslinfo.pl ... ok > > Guess we'll need to adjust the test script a bit too. Sigh. We could use an extra check_pg_config() with a routine new in 3.2.0. Looking at CHANGES.md, SSL_get0_group_name() seems to be one generic choice here. -- Michael
Вложения
On Tue, Nov 28, 2023 at 12:55:37PM +0900, Michael Paquier wrote: > Sigh. We could use an extra check_pg_config() with a routine new in > 3.2.0. Looking at CHANGES.md, SSL_get0_group_name() seems to be one > generic choice here. Or even simpler: plant a (ssl\/tls|sslv3) in these strings. -- Michael
Вложения
Michael Paquier <michael@paquier.xyz> writes:
> Or even simpler: plant a (ssl\/tls|sslv3) in these strings.
Yeah, weakening the pattern match was what I had in mind.
I was thinking of something like "ssl[a-z0-9/]*" but your
proposal works too.
            regards, tom lane
			
		> On 28 Nov 2023, at 01:29, Bo Anderson <mail@boanderson.me> wrote: > It probably doesn’t exist in BoringSSL but neither does a lot of things. Thats not an issue, we don't support building with BoringSSL. -- Daniel Gustafsson
Daniel Gustafsson <daniel@yesql.se> writes:
> Thats not an issue, we don't support building with BoringSSL.
Right.  I'll work on getting this pushed, unless someone else
is already on it?
            regards, tom lane
			
		On Tue Nov 28, 2023 at 9:00 AM CST, Tom Lane wrote: > Daniel Gustafsson <daniel@yesql.se> writes: > > Thats not an issue, we don't support building with BoringSSL. > > Right. I'll work on getting this pushed, unless someone else > is already on it? When you say "this" are you referring to the patch I sent or adding support for BoringSSL? -- Tristan Partin Neon (https://neon.tech)
How are you guys running the tests? I have PG_TEST_EXTRA=ssl and everything passes for me. Granted, I am using the Meson build. -- Tristan Partin Neon (https://neon.tech)
"Tristan Partin" <tristan@neon.tech> writes:
> How are you guys running the tests? I have PG_TEST_EXTRA=ssl and 
> everything passes for me. Granted, I am using the Meson build.
I'm doing what it says in test/ssl/README:
    make check PG_TEST_EXTRA=ssl
I don't know whether the meson build has support for running these
extra "unsafe" tests or not.
            regards, tom lane
			
		On Tue Nov 28, 2023 at 9:31 AM CST, Tom Lane wrote: > "Tristan Partin" <tristan@neon.tech> writes: > > How are you guys running the tests? I have PG_TEST_EXTRA=ssl and > > everything passes for me. Granted, I am using the Meson build. > > I'm doing what it says in test/ssl/README: > > make check PG_TEST_EXTRA=ssl > > I don't know whether the meson build has support for running these > extra "unsafe" tests or not. Thanks Tom. I'll check again. Maybe I didn't set LD_LIBRARY_PATH when running the tests. I have openssl installing to a non-default prefix. -- Tristan Partin Neon (https://neon.tech)
"Tristan Partin" <tristan@neon.tech> writes:
> When you say "this" are you referring to the patch I sent or adding 
> support for BoringSSL?
I have no interest in supporting BoringSSL.  I just replied to
Daniel's comment because it seemed to resolve the last concern
about whether your patch is OK.
            regards, tom lane
			
		On Tue Nov 28, 2023 at 9:42 AM CST, Tom Lane wrote: > "Tristan Partin" <tristan@neon.tech> writes: > > When you say "this" are you referring to the patch I sent or adding > > support for BoringSSL? > > I have no interest in supporting BoringSSL. I just replied to > Daniel's comment because it seemed to resolve the last concern > about whether your patch is OK. If you haven't started fixing the tests, then I'll get to work on it and send a new revision before the end of the day. Thanks! -- Tristan Partin Neon (https://neon.tech)
"Tristan Partin" <tristan@neon.tech> writes:
> On Tue Nov 28, 2023 at 9:42 AM CST, Tom Lane wrote:
>> I have no interest in supporting BoringSSL.  I just replied to
>> Daniel's comment because it seemed to resolve the last concern
>> about whether your patch is OK.
> If you haven't started fixing the tests, then I'll get to work on it and 
> send a new revision before the end of the day. Thanks!
No need, I can finish it up from here.
            regards, tom lane
			
		On Tue Nov 28, 2023 at 10:06 AM CST, Tom Lane wrote: > "Tristan Partin" <tristan@neon.tech> writes: > > On Tue Nov 28, 2023 at 9:42 AM CST, Tom Lane wrote: > >> I have no interest in supporting BoringSSL. I just replied to > >> Daniel's comment because it seemed to resolve the last concern > >> about whether your patch is OK. > > > If you haven't started fixing the tests, then I'll get to work on it and > > send a new revision before the end of the day. Thanks! > > No need, I can finish it up from here. Sweet. I appreciate your help. -- Tristan Partin Neon (https://neon.tech)
FTR, I've pushed this and the buildfarm seems happy.  In particular,
I just updated indri to the latest MacPorts packages including
OpenSSL 3.2.0, so we'll have coverage of that going forward.
            regards, tom lane
			
		On Tue Nov 28, 2023 at 9:42 AM CST, Tom Lane wrote:
> "Tristan Partin" <tristan@neon.tech> writes:
> > When you say "this" are you referring to the patch I sent or adding
> > support for BoringSSL?
>
> I have no interest in supporting BoringSSL.
Funnily enough, here[0] is BoringSSL adding the BIO_{get,set}_app_data()
APIs.
[0]: https://github.com/google/boringssl/commit/2139aba2e3e28cd1cdefbd9b48e2c31a75441203
--
Tristan Partin
Neon (https://neon.tech)
			
		> On 29 Nov 2023, at 16:21, Tristan Partin <tristan@neon.tech> wrote:
>
> On Tue Nov 28, 2023 at 9:42 AM CST, Tom Lane wrote:
>> "Tristan Partin" <tristan@neon.tech> writes:
>> > When you say "this" are you referring to the patch I sent or adding > support for BoringSSL?
>>
>> I have no interest in supporting BoringSSL.
>
> Funnily enough, here[0] is BoringSSL adding the BIO_{get,set}_app_data() APIs.
Still doesn't seem like a good candidate for a postgres TLS library since they
themselves claim:
   "Although BoringSSL is an open source project, it is not intended for
    general use, as OpenSSL is.  We don't recommend that third parties depend
    upon it.  Doing so is likely to be frustrating because there are no
    guarantees of API or ABI stability."
--
Daniel Gustafsson
			
		Daniel Gustafsson <daniel@yesql.se> writes:
> On 29 Nov 2023, at 16:21, Tristan Partin <tristan@neon.tech> wrote:
>> Funnily enough, here[0] is BoringSSL adding the BIO_{get,set}_app_data() APIs.
> Still doesn't seem like a good candidate for a postgres TLS library since they
> themselves claim:
>    "Although BoringSSL is an open source project, it is not intended for
>     general use, as OpenSSL is.  We don't recommend that third parties depend
>     upon it.  Doing so is likely to be frustrating because there are no
>     guarantees of API or ABI stability."
Kind of odd that, with that mission statement, they are adding
BIO_{get,set}_app_data on the justification that OpenSSL has it
and Postgres is starting to use it.  Nonetheless, that commit
also seems to prove the point about lack of API/ABI stability.
I'm content to take their advice and not try to support BoringSSL.
It's not clear what benefit to us there would be, and we already
have our hands full coping with all the different OpenSSL and LibreSSL
versions.
            regards, tom lane
			
		On Wed Nov 29, 2023 at 10:32 AM CST, Tom Lane wrote:
> Daniel Gustafsson <daniel@yesql.se> writes:
> > On 29 Nov 2023, at 16:21, Tristan Partin <tristan@neon.tech> wrote:
> >> Funnily enough, here[0] is BoringSSL adding the BIO_{get,set}_app_data() APIs.
>
> > Still doesn't seem like a good candidate for a postgres TLS library since they
> > themselves claim:
> >    "Although BoringSSL is an open source project, it is not intended for
> >     general use, as OpenSSL is.  We don't recommend that third parties depend
> >     upon it.  Doing so is likely to be frustrating because there are no
> >     guarantees of API or ABI stability."
>
> Kind of odd that, with that mission statement, they are adding
> BIO_{get,set}_app_data on the justification that OpenSSL has it
> and Postgres is starting to use it.  Nonetheless, that commit
> also seems to prove the point about lack of API/ABI stability.
>
> I'm content to take their advice and not try to support BoringSSL.
> It's not clear what benefit to us there would be, and we already
> have our hands full coping with all the different OpenSSL and LibreSSL
> versions.
Yep, I just wanted to point it out in the interest of relevancy to our
conversation yesterday :).
--
Tristan Partin
Neon (https://neon.tech)
			
		On 2023-Nov-29, Tom Lane wrote:
> Kind of odd that, with that mission statement, they are adding
> BIO_{get,set}_app_data on the justification that OpenSSL has it
> and Postgres is starting to use it.  Nonetheless, that commit
> also seems to prove the point about lack of API/ABI stability.
As I understand it, this simply means that Google is already building
their own fork of Postgres, patching it to use BoringSSL.  (This makes
sense, since they offer Postgres databases in their cloud offerings.)
They don't need PGDG to support BoringSSL, but they do need to make sure
that BoringSSL is able to support being used by Postgres.
> I'm content to take their advice and not try to support BoringSSL.
That seems the right reaction.  It is not our problem.
-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Having your biases confirmed independently is how scientific progress is
made, and hence made our great society what it is today" (Mary Gardiner)
			
		I ran into an SSL issue when using the MSYS2/MINGW build of Postgres for the PgBouncer test suite. Postgres crashed whenever you tried to open an ssl connection to it. https://github.com/msys2/MINGW-packages/issues/19851 I'm wondering if the issue described in this thread could be related to the issue I ran into. Afaict the merged patch has not been released yet.
On Wed Jan 24, 2024 at 9:58 AM CST, Jelte Fennema-Nio wrote: > I ran into an SSL issue when using the MSYS2/MINGW build of Postgres > for the PgBouncer test suite. Postgres crashed whenever you tried to > open an ssl connection to it. > https://github.com/msys2/MINGW-packages/issues/19851 > > I'm wondering if the issue described in this thread could be related > to the issue I ran into. Afaict the merged patch has not been released > yet. Do you have a backtrace? Given that the version is 3.2.0, seems likely. -- Tristan Partin Neon (https://neon.tech)