Обсуждение: psql client does not handle WSAEWOULDBLOCK on Windows
Description:
The connection fails with a non-blocking socket error when using psql on
Windows to connect to a PostgreSQL server with GSSAPI enabled. The error is
because the socket error code is obtained by WSAGetLastError() instead of
errno. This causes the value of errno to be incorrect when handling a
non-blocking socket error.
Steps to Reproduce:
1. Compile PostgreSQL client (psql) on Windows.
a. Make sure MIT Kerberos is installed. I use the latest version MIT Kerberos
Version 4.1.
b. Make sure GSSAPI is enabled
2. Attempt to connect to a PostgreSQL server using psql.
a. Set up the Kerberos server and configure the PostgreSQL server by referring
to https://github.com/50wu/kerberos-docker/blob/main/POSTGRES.README.md
b. change the entry to hostgssenc on PostgreSQL server pg_hba.conf and restart
hostgssenc all all 0.0.0.0/0 gss include_realm=0 krb_realm=GPDB.KRB
c. Use the following command to connect to the database server
psql -h <hostname> -U "postgres/krb5-service-example-com.example.com" -d postgres
3. The connection fails with a non-blocking socket error. The error is something like:
psql: error: connection to server at "xxx", port 5432 failed:
Environment:
PostgreSQL version: 16.3
Operating System: Windows 11
Fix Steps:
In the gss_read function of src/interfaces/libpq/fe-secure-gssapi.c, change the
check of the error code to use the SOCK_ERRNO to make sure that EAGAIN,
EWOULDBLOCK and EINTR can be properly handled on Windows and other platforms.
The patch file is attached to this email, please review and consider merging it to
the main code library.
Thanks,
Ning Wu
The connection fails with a non-blocking socket error when using psql on
Windows to connect to a PostgreSQL server with GSSAPI enabled. The error is
because the socket error code is obtained by WSAGetLastError() instead of
errno. This causes the value of errno to be incorrect when handling a
non-blocking socket error.
Steps to Reproduce:
1. Compile PostgreSQL client (psql) on Windows.
a. Make sure MIT Kerberos is installed. I use the latest version MIT Kerberos
Version 4.1.
b. Make sure GSSAPI is enabled
2. Attempt to connect to a PostgreSQL server using psql.
a. Set up the Kerberos server and configure the PostgreSQL server by referring
to https://github.com/50wu/kerberos-docker/blob/main/POSTGRES.README.md
b. change the entry to hostgssenc on PostgreSQL server pg_hba.conf and restart
hostgssenc all all 0.0.0.0/0 gss include_realm=0 krb_realm=GPDB.KRB
c. Use the following command to connect to the database server
psql -h <hostname> -U "postgres/krb5-service-example-com.example.com" -d postgres
3. The connection fails with a non-blocking socket error. The error is something like:
psql: error: connection to server at "xxx", port 5432 failed:
Environment:
PostgreSQL version: 16.3
Operating System: Windows 11
Fix Steps:
In the gss_read function of src/interfaces/libpq/fe-secure-gssapi.c, change the
check of the error code to use the SOCK_ERRNO to make sure that EAGAIN,
EWOULDBLOCK and EINTR can be properly handled on Windows and other platforms.
The patch file is attached to this email, please review and consider merging it to
the main code library.
Thanks,
Ning Wu
Вложения
I have not reproduce your test scenario, looking at code please see following comments: If you check the function definition of pqsecure_raw_read() it actually do set errno like bellow SOCK_ERRNO_SET(result_errno); where result_errno = SOCK_ERRNO Means anybody using those function pqsecure_raw_read/write, does not need to take care of portable ERRNO. Regards Umar Hayat The new status of this patch is: Waiting on Author
Hi Umar,
In the function of gss_read() if print the value of errno and SOCK_ERRNO
separately, I found the values are different:
*ret = pqsecure_raw_read(conn, recv_buffer, length);
if (*ret < 0)
{
printf("errno: %d\n", errno);
printf("result_errno: %d\n", SOCK_ERRNO);
...
errno: 0
result_errno: 10035
Also refer to the
https://learn.microsoft.com/en-us/windows/win32/api/winsock/nf-winsock-wsagetlasterror,
It shows that the Windows Sockets function does not set errno, but uses
WSAGetLastError to report errors. And refer to the
https://learn.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-recv,
If the function fails, it should call the WSAGetLastError to get the expansion
error message. This further shows that the socket operation error will not be
reported through the errno.
So changing the error code check to use the SOCK_ERRNO instead of errno can be
properly handled on both Windows and other platforms.
separately, I found the values are different:
*ret = pqsecure_raw_read(conn, recv_buffer, length);
if (*ret < 0)
{
printf("errno: %d\n", errno);
printf("result_errno: %d\n", SOCK_ERRNO);
...
errno: 0
result_errno: 10035
Also refer to the
https://learn.microsoft.com/en-us/windows/win32/api/winsock/nf-winsock-wsagetlasterror,
It shows that the Windows Sockets function does not set errno, but uses
WSAGetLastError to report errors. And refer to the
https://learn.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-recv,
If the function fails, it should call the WSAGetLastError to get the expansion
error message. This further shows that the socket operation error will not be
reported through the errno.
So changing the error code check to use the SOCK_ERRNO instead of errno can be
properly handled on both Windows and other platforms.
To reproduce the issue, I used the following version of Postgres and MIT Kerberos:
PostgreSQL version: 16.3
MIT Kerberos Version 4.1
Operating System: Windows 11
Visual Studio 2022
Operating System: Windows 11
Visual Studio 2022
On Tue, Jul 30, 2024 at 4:47 PM Ning <ning94803@gmail.com> wrote:
Description:
The connection fails with a non-blocking socket error when using psql on
Windows to connect to a PostgreSQL server with GSSAPI enabled. The error is
because the socket error code is obtained by WSAGetLastError() instead of
errno. This causes the value of errno to be incorrect when handling a
non-blocking socket error.
Steps to Reproduce:
1. Compile PostgreSQL client (psql) on Windows.
a. Make sure MIT Kerberos is installed. I use the latest version MIT Kerberos
Version 4.1.
b. Make sure GSSAPI is enabled
2. Attempt to connect to a PostgreSQL server using psql.
a. Set up the Kerberos server and configure the PostgreSQL server by referring
to https://github.com/50wu/kerberos-docker/blob/main/POSTGRES.README.md
b. change the entry to hostgssenc on PostgreSQL server pg_hba.conf and restart
hostgssenc all all 0.0.0.0/0 gss include_realm=0 krb_realm=GPDB.KRB
c. Use the following command to connect to the database server
psql -h <hostname> -U "postgres/krb5-service-example-com.example.com" -d postgres
3. The connection fails with a non-blocking socket error. The error is something like:
psql: error: connection to server at "xxx", port 5432 failed:
Environment:
PostgreSQL version: 16.3
Operating System: Windows 11
Fix Steps:
In the gss_read function of src/interfaces/libpq/fe-secure-gssapi.c, change the
check of the error code to use the SOCK_ERRNO to make sure that EAGAIN,
EWOULDBLOCK and EINTR can be properly handled on Windows and other platforms.
The patch file is attached to this email, please review and consider merging it to
the main code library.
Thanks,
Ning Wu
Michael Paquier <michael@paquier.xyz> writes: > This is going to require some platform-specific check that I don't > have with me, though I am ready to accept that what you are telling > here is true and that we should apply this macro. Could somebody > check that a bit more in depth? Andrew D. perhaps? > One thing that I don't understand is why does this only apply after > the first call of pqsecure_raw_read() in gss_read()? There is a > second call of pqsecure_raw_read() you are not covering but it would > surely need the same treatment, no? > Also, what about pqsecure_raw_write() in pqsecure_open_gss()? > Shouldn't the same check apply? Yeah, I think we pretty much need to use SOCK_ERRNO, SOCK_ERRNO_SET, and SOCK_STRERROR (if relevant) throughout fe-secure-gssapi.c. Directly using errno is correct for syscalls related to the file system, but I think everything in this file is dealing with socket-related errors. Certainly the underlying pqsecure_raw_read and pqsecure_raw_write layer expects those macros to be used. Like you, I'm not really in a position to test this on Windows ... regards, tom lane
I wrote: > Michael Paquier <michael@paquier.xyz> writes: >> Also, what about pqsecure_raw_write() in pqsecure_open_gss()? >> Shouldn't the same check apply? > Yeah, I think we pretty much need to use SOCK_ERRNO, SOCK_ERRNO_SET, > and SOCK_STRERROR (if relevant) throughout fe-secure-gssapi.c. Since nothing seems to be happening here, I took another look and decided that the required changes are really pretty straightforward. fe-secure-gssapi.c doesn't contain any strerror calls, and its touches of errno all appear to relate to socket errors, so we can just change them all. As attached. > Like you, I'm not really in a position to test this on Windows ... I'm still not, but it's straightforward enough that I'm willing to push on my own authority. I'll just put this up for long enough to make sure that the cfbot is happy with it --- though I think it's not building with GSSAPI on Windows, so that may prove little. regards, tom lane From 36f3ee942ef12801e52ce3fc82204455345939a6 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sun, 5 Oct 2025 15:05:38 -0400 Subject: [PATCH v2] Use SOCK_ERRNO[_SET] in fe-secure-gssapi.c. On Windows, this code did not handle error conditions correctly at all, since it looked at "errno" which is not used for socket-related errors on that platform. This resulted, for example, in failure to connect to a PostgreSQL server with GSSAPI enabled. We have a convention for dealing with this within libpq, which is to use SOCK_ERRNO and SOCK_ERRNO_SET rather than touching errno directly; but the GSSAPI code is a relative latecomer and did not get that memo. (The equivalent backend code continues to use errno, because the backend does this differently. Maybe libpq's approach should be rethought someday.) Author: Ning Wu <ning94803@gmail.com> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAFGqpvg-pRw=cdsUpKYfwY6D3d-m9tw8WMcAEE7HHWfm-oYWvw@mail.gmail.com Backpatch-through: 13 --- src/interfaces/libpq/fe-secure-gssapi.c | 27 ++++++++++++++----------- 1 file changed, 15 insertions(+), 12 deletions(-) diff --git a/src/interfaces/libpq/fe-secure-gssapi.c b/src/interfaces/libpq/fe-secure-gssapi.c index bc9e1ce06fa..843b31e175f 100644 --- a/src/interfaces/libpq/fe-secure-gssapi.c +++ b/src/interfaces/libpq/fe-secure-gssapi.c @@ -121,7 +121,7 @@ pg_GSS_write(PGconn *conn, const void *ptr, size_t len) { appendPQExpBufferStr(&conn->errorMessage, "GSSAPI caller failed to retransmit all data needing to be retried\n"); - errno = EINVAL; + SOCK_ERRNO_SET(EINVAL); return -1; } @@ -199,14 +199,14 @@ pg_GSS_write(PGconn *conn, const void *ptr, size_t len) if (major != GSS_S_COMPLETE) { pg_GSS_error(libpq_gettext("GSSAPI wrap error"), conn, major, minor); - errno = EIO; /* for lack of a better idea */ + SOCK_ERRNO_SET(EIO); /* for lack of a better idea */ goto cleanup; } if (conf_state == 0) { libpq_append_conn_error(conn, "outgoing GSSAPI message would not use confidentiality"); - errno = EIO; /* for lack of a better idea */ + SOCK_ERRNO_SET(EIO); /* for lack of a better idea */ goto cleanup; } @@ -215,7 +215,7 @@ pg_GSS_write(PGconn *conn, const void *ptr, size_t len) libpq_append_conn_error(conn, "client tried to send oversize GSSAPI packet (%zu > %zu)", (size_t) output.length, PQ_GSS_MAX_PACKET_SIZE - sizeof(uint32)); - errno = EIO; /* for lack of a better idea */ + SOCK_ERRNO_SET(EIO); /* for lack of a better idea */ goto cleanup; } @@ -341,7 +341,7 @@ pg_GSS_read(PGconn *conn, void *ptr, size_t len) /* If we still haven't got the length, return to the caller */ if (PqGSSRecvLength < sizeof(uint32)) { - errno = EWOULDBLOCK; + SOCK_ERRNO_SET(EWOULDBLOCK); return -1; } } @@ -354,7 +354,7 @@ pg_GSS_read(PGconn *conn, void *ptr, size_t len) libpq_append_conn_error(conn, "oversize GSSAPI packet sent by the server (%zu > %zu)", (size_t) input.length, PQ_GSS_MAX_PACKET_SIZE - sizeof(uint32)); - errno = EIO; /* for lack of a better idea */ + SOCK_ERRNO_SET(EIO); /* for lack of a better idea */ return -1; } @@ -373,7 +373,7 @@ pg_GSS_read(PGconn *conn, void *ptr, size_t len) /* If we don't yet have the whole packet, return to the caller */ if (PqGSSRecvLength - sizeof(uint32) < input.length) { - errno = EWOULDBLOCK; + SOCK_ERRNO_SET(EWOULDBLOCK); return -1; } @@ -393,7 +393,7 @@ pg_GSS_read(PGconn *conn, void *ptr, size_t len) pg_GSS_error(libpq_gettext("GSSAPI unwrap error"), conn, major, minor); ret = -1; - errno = EIO; /* for lack of a better idea */ + SOCK_ERRNO_SET(EIO); /* for lack of a better idea */ goto cleanup; } @@ -401,7 +401,7 @@ pg_GSS_read(PGconn *conn, void *ptr, size_t len) { libpq_append_conn_error(conn, "incoming GSSAPI message did not use confidentiality"); ret = -1; - errno = EIO; /* for lack of a better idea */ + SOCK_ERRNO_SET(EIO); /* for lack of a better idea */ goto cleanup; } @@ -437,7 +437,8 @@ gss_read(PGconn *conn, void *recv_buffer, size_t length, ssize_t *ret) *ret = pqsecure_raw_read(conn, recv_buffer, length); if (*ret < 0) { - if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR) + if (SOCK_ERRNO == EAGAIN || SOCK_ERRNO == EWOULDBLOCK || + SOCK_ERRNO == EINTR) return PGRES_POLLING_READING; else return PGRES_POLLING_FAILED; @@ -457,7 +458,8 @@ gss_read(PGconn *conn, void *recv_buffer, size_t length, ssize_t *ret) *ret = pqsecure_raw_read(conn, recv_buffer, length); if (*ret < 0) { - if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR) + if (SOCK_ERRNO == EAGAIN || SOCK_ERRNO == EWOULDBLOCK || + SOCK_ERRNO == EINTR) return PGRES_POLLING_READING; else return PGRES_POLLING_FAILED; @@ -520,7 +522,8 @@ pqsecure_open_gss(PGconn *conn) ret = pqsecure_raw_write(conn, PqGSSSendBuffer + PqGSSSendNext, amount); if (ret < 0) { - if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR) + if (SOCK_ERRNO == EAGAIN || SOCK_ERRNO == EWOULDBLOCK || + SOCK_ERRNO == EINTR) return PGRES_POLLING_WRITING; else return PGRES_POLLING_FAILED; -- 2.43.7