Обсуждение: [MASSMAIL]pgsql: Refactor libpq state machine for negotiating encryption
[MASSMAIL]pgsql: Refactor libpq state machine for negotiating encryption
От
Heikki Linnakangas
Дата:
Refactor libpq state machine for negotiating encryption This fixes the few corner cases noted in commit 705843d294, as shown by the changes in the test. Author: Heikki Linnakangas, Matthias van de Meent Reviewed-by: Jacob Champion Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/05fd30c0e730bd5238f62d2fdfdcfaf28b16b225 Modified Files -------------- src/interfaces/libpq/fe-connect.c | 427 ++++++++++++--------- src/interfaces/libpq/libpq-int.h | 14 +- .../libpq_encryption/t/001_negotiate_encryption.pl | 26 +- 3 files changed, 265 insertions(+), 202 deletions(-)
On Mon, Apr 8, 2024 at 1:25 PM Heikki Linnakangas <heikki.linnakangas@iki.fi> wrote: > Refactor libpq state machine for negotiating encryption > > This fixes the few corner cases noted in commit 705843d294, as shown > by the changes in the test. > > Author: Heikki Linnakangas, Matthias van de Meent > Reviewed-by: Jacob Champion If -Dssl=none and -Dgssapi=disabled, compilation of fe-connect.c fails: call to undeclared function 'encryption_negotiation_failed'. I didn't look too hard, but maybe ENABLE_GSS and USE_GSS are confused?
Thomas Munro <thomas.munro@gmail.com> writes: > If -Dssl=none and -Dgssapi=disabled, compilation of fe-connect.c > fails: call to undeclared function 'encryption_negotiation_failed'. I > didn't look too hard, but maybe ENABLE_GSS and USE_GSS are confused? For me, configure --with-gssapi fails like that, but the other three combinations of --with-openssl and --with-gssapi compile OK. I don't find it terribly surprising that the buildfarm isn't covering that combination ... regards, tom lane
On Thu, Apr 11, 2024 at 11:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Thomas Munro <thomas.munro@gmail.com> writes: > > If -Dssl=none and -Dgssapi=disabled, compilation of fe-connect.c > > fails: call to undeclared function 'encryption_negotiation_failed'. I > > didn't look too hard, but maybe ENABLE_GSS and USE_GSS are confused? > > For me, configure --with-gssapi fails like that, but the other three > combinations of --with-openssl and --with-gssapi compile OK. I don't > find it terribly surprising that the buildfarm isn't covering that > combination ... Oops, right, correction to my report: it is indeed -Dssl=none -Dgssapi=enabled that is broken, not the other combinations.
Thanks for the report! On 11/04/2024 02:33, Thomas Munro wrote: > On Thu, Apr 11, 2024 at 11:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Thomas Munro <thomas.munro@gmail.com> writes: >>> If -Dssl=none and -Dgssapi=disabled, compilation of fe-connect.c >>> fails: call to undeclared function 'encryption_negotiation_failed'. I >>> didn't look too hard, but maybe ENABLE_GSS and USE_GSS are confused? >> >> For me, configure --with-gssapi fails like that, but the other three >> combinations of --with-openssl and --with-gssapi compile OK. I don't >> find it terribly surprising that the buildfarm isn't covering that >> combination ... > > Oops, right, correction to my report: it is indeed -Dssl=none > -Dgssapi=enabled that is broken, not the other combinations. Yes, I misspelled ENABLE_GSS as USE_GSS. After fixing that, the new tests are failing; the expected output for many of the cases is different when GSSAPI support is not compiled in. I think the test tables need to be rearranged some more to take that into account, or we will end up with a ridiculous amount of different expected outputs. I will take a closer look at that tomorrow. As a bandaid fix, we could temporarily disable the new tests with that combination of configure options, it's still better test coverage than not having the tests at all. But given that no buildfarm members are testing that combination I think it can wait a day. -- Heikki Linnakangas Neon (https://neon.tech)
On 11/04/2024 20:07, Heikki Linnakangas wrote: > On 11/04/2024 02:33, Thomas Munro wrote: >> On Thu, Apr 11, 2024 at 11:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Thomas Munro <thomas.munro@gmail.com> writes: >>>> If -Dssl=none and -Dgssapi=disabled, compilation of fe-connect.c >>>> fails: call to undeclared function 'encryption_negotiation_failed'. I >>>> didn't look too hard, but maybe ENABLE_GSS and USE_GSS are confused? >>> >>> For me, configure --with-gssapi fails like that, but the other three >>> combinations of --with-openssl and --with-gssapi compile OK. I don't >>> find it terribly surprising that the buildfarm isn't covering that >>> combination ... >> >> Oops, right, correction to my report: it is indeed -Dssl=none >> -Dgssapi=enabled that is broken, not the other combinations. > > Yes, I misspelled ENABLE_GSS as USE_GSS. > > After fixing that, the new tests are failing; the expected output for > many of the cases is different when GSSAPI support is not compiled in. I > think the test tables need to be rearranged some more to take that into > account, or we will end up with a ridiculous amount of different > expected outputs. > > I will take a closer look at that tomorrow. As a bandaid fix, we could > temporarily disable the new tests with that combination of configure > options, it's still better test coverage than not having the tests at > all. But given that no buildfarm members are testing that combination I > think it can wait a day. Fixed the compilation with that combination, and the expected test output. Thanks for the report! -- Heikki Linnakangas Neon (https://neon.tech)
On 08.04.24 03:25, Heikki Linnakangas wrote: > Refactor libpq state machine for negotiating encryption > > This fixes the few corner cases noted in commit 705843d294, as shown > by the changes in the test. Either this or something nearby appears to have broken the error reporting from psql or libpq when failing to get an SSL connection: PG16: psql 'sslmode=require host=localhost' psql: error: connection to server at "localhost" (::1), port 65432 failed: server does not support SSL, but SSL was required master: psql 'sslmode=require host=localhost' psql: error: connection to server at "localhost" (::1), port 65432 failed: (sic, the output ends after the colon). This commit removed that detail error message string from the code, but I don't see any similar message that would take its place.
On 24/04/2024 15:57, Peter Eisentraut wrote: > On 08.04.24 03:25, Heikki Linnakangas wrote: >> Refactor libpq state machine for negotiating encryption >> >> This fixes the few corner cases noted in commit 705843d294, as shown >> by the changes in the test. > > Either this or something nearby appears to have broken the error > reporting from psql or libpq when failing to get an SSL connection: > > PG16: > > psql 'sslmode=require host=localhost' > psql: error: connection to server at "localhost" (::1), port 65432 > failed: server does not support SSL, but SSL was required > > master: > > psql 'sslmode=require host=localhost' > psql: error: connection to server at "localhost" (::1), port 65432 failed: > > (sic, the output ends after the colon). > > This commit removed that detail error message string from the code, but > I don't see any similar message that would take its place. Thanks, attached patch puts back those messages. This makes one change in the error message behavior compared to v16, in the case that the server responds to GSSRequest with an error instead of rejecting it with 'N'. Previously, libpq would hide the error that the server sent in that case, assuming that it was because the server is an old pre-v12 version that doesn't support GSS and doesn't understand the GSSRequest message. A v11 server responds with a "FATAL: unsupported frontend protocol 1234.5680: server supports 2.0 to 3.0" error if you try to connect to it with GSS. That was a reasonable assumption when the feature was introduced, but v12 was released a long time ago and I don't think it's the most probable cause anymore. The attached patch changes things so that libpq prints the error message that the server sent in that case, making the "server responds with error to GSSRequest" case behave the same as the "server responds with error to SSLRequest" case. We could keep the old behavior if we wanted to, but this is the most convenient way to handle this in the new libpq code, and makes sense anyway IMHO. While testing this some more, I noticed this existing case with stable versions: psql "host=enc-test-localhost.postgresql.example.com hostaddr=127.0.0.1 sslmode=disable gssencmode=require" psql: error: connection to server at "127.0.0.1", port 5432 failed: In the server log: 024-04-25 16:38:59.372 EEST [3939163] LOG: could not accept GSSAPI security context 2024-04-25 16:38:59.372 EEST [3939163] DETAIL: No credentials were supplied, or the credentials were unavailable or inaccessible: Key table entry not found I didn't have everything configured correctly, which is why it failed. That's fine. But the psql error message is missing here too. The attached patch does not fix that case. I think the correct fix would go somewhere in pqsecure_open_gss(). Whenever pqsecure_open_gss() returns PGRES_POLLING_FAILED, it should also put an error in the error buffer with libpq_append_conn_error(). The corresponding pgtls_open_client() function for TLS does that, and so does pqsecure_open_gss() for many other cases, but apparently not that one. -- Heikki Linnakangas Neon (https://neon.tech)