Обсуждение: [PATCH] Fix socket handle inheritance on Windows

Поиск
Список
Период
Сортировка

[PATCH] Fix socket handle inheritance on Windows

От
Bryan Green
Дата:
Greetings,

I've discovered that PostgreSQL on Windows has a handle inheritance
problem that prevents clean restarts after the postmaster is killed
while child processes are running.

The issue is that Windows handles (files, sockets, pipes, shared memory)
are inheritable by default. When backends spawn child processes
archive_command, COPY TO PROGRAM, etc.—those children inherit all the
backend's handles. Windows uses reference counting, so inherited handles
keep resources alive even after the owning process exits.

I reproduced this with sockets:

1. Started PostgreSQL on port 6565
2. Connected with psql and ran:
   \copy (select 1) to program 'powershell -Command "Start-Sleep 300"'
3. Used Sysinternals handle64.exe to examine handles:
   - PowerShell had inherited socket handles (\Device\Afd)
   - Same handle values in both processes (proving inheritance, not
separate opens)
4. Killed the postmaster
5. netstat showed port 6565 still LISTENING on the dead postmaster PID
6. Restart failed: "Address already in use"
7. Port only freed after killing PowerShell

The socket fix adds WSA_FLAG_NO_HANDLE_INHERIT to WSASocket() in
pgwin32_socket(), and calls SetHandleInformation() in
BackendInitialize() to mark the inherited client socket non-inheritable.
The latter is needed because handles passed to children become
inheritable again on Windows.

TAP test included that verifies the port is freed immediately after
postmaster exit rather than remaining in a zombie state.

The problem affects multiple handle types:

Files: https://commitfest.postgresql.org/patch/6197/
Sockets: Fixed by attached patch
Pipes: Not yet addressed
Shared memory: Not yet addressed (causes "pre-existing shared memory
block" errors)

Patches for pipes and shared memory will follow over the next couple of
days.

-- 
Bryan Green
EDB: https://www.enterprisedb.com

Вложения

Re: [PATCH] Fix socket handle inheritance on Windows

От
Bryan Green
Дата:
On 11/5/2025 11:06 PM, Bryan Green wrote:
> Greetings,
> 
> I've discovered that PostgreSQL on Windows has a handle inheritance
> problem that prevents clean restarts after the postmaster is killed
> while child processes are running.
> 
> The issue is that Windows handles (files, sockets, pipes, shared memory)
> are inheritable by default. When backends spawn child processes
> archive_command, COPY TO PROGRAM, etc.—those children inherit all the
> backend's handles. Windows uses reference counting, so inherited handles
> keep resources alive even after the owning process exits.
> 
> I reproduced this with sockets:
> 
> 1. Started PostgreSQL on port 6565
> 2. Connected with psql and ran:
>    \copy (select 1) to program 'powershell -Command "Start-Sleep 300"'
> 3. Used Sysinternals handle64.exe to examine handles:
>    - PowerShell had inherited socket handles (\Device\Afd)
>    - Same handle values in both processes (proving inheritance, not
> separate opens)
> 4. Killed the postmaster
> 5. netstat showed port 6565 still LISTENING on the dead postmaster PID
> 6. Restart failed: "Address already in use"
> 7. Port only freed after killing PowerShell
> 
> The socket fix adds WSA_FLAG_NO_HANDLE_INHERIT to WSASocket() in
> pgwin32_socket(), and calls SetHandleInformation() in
> BackendInitialize() to mark the inherited client socket non-inheritable.
> The latter is needed because handles passed to children become
> inheritable again on Windows.
> 
> TAP test included that verifies the port is freed immediately after
> postmaster exit rather than remaining in a zombie state.
> 
> The problem affects multiple handle types:
> 
> Files: https://commitfest.postgresql.org/patch/6197/
> Sockets: Fixed by attached patch
> Pipes: Not yet addressed
> Shared memory: Not yet addressed (causes "pre-existing shared memory
> block" errors)
> 
> Patches for pipes and shared memory will follow over the next couple of
> days.
> 
Incorrect extension on the patch.  Attached is the correct patch.

-- 
Bryan Green
EDB: https://www.enterprisedb.com
Вложения

Re: [PATCH] Fix socket handle inheritance on Windows

От
Thomas Munro
Дата:
On Fri, Nov 7, 2025 at 7:53 AM Bryan Green <dbryan.green@gmail.com> wrote:
> > The socket fix adds WSA_FLAG_NO_HANDLE_INHERIT to WSASocket() in
> > pgwin32_socket(), and calls SetHandleInformation() in
> > BackendInitialize() to mark the inherited client socket non-inheritable.
> > The latter is needed because handles passed to children become
> > inheritable again on Windows.

Right, this is a problem too and your analysis makes sense.

-    s = WSASocket(af, type, protocol, NULL, 0, WSA_FLAG_OVERLAPPED);
+    s = WSASocket(af, type, protocol, NULL, 0,
+                  WSA_FLAG_OVERLAPPED | WSA_FLAG_NO_HANDLE_INHERIT);

I wonder if we should control this with a new be-more-like-Unix
SOCK_CLOEXEC flag.  In practice we might never want sockets created
this way to be inherited across CreateProcess, so we might always pass
SOCK_CLOEXEC in places like libpq (we already do that).  But in theory
at least, someone might have a reason to want an inheritable socket
(no doubt with some complications), and it seems attractive to work
the same way cross-platform and also mirror the O_CLOEXEC behaviour
your other thread fixes, and I'm also looking further ahead to the
case of accepted sockets that have some additional needs (see below),
so it might be good to be explicit and consistent about the flags.

+#ifdef WIN32
+    /*
+     * On Windows, the client socket inherited from the postmaster becomes
+     * inheritable again in this process. Prevent child processes spawned
+     * by this backend from inheriting it.
+     */
+    if (!SetHandleInformation((HANDLE) port->sock, HANDLE_FLAG_INHERIT, 0))
+        ereport(WARNING,
+                (errmsg_internal("could not disable socket handle
inheritance: error code %lu",
+                                 GetLastError())));
+#endif

On Unix we also need the equivalent, but it's in pq_init():

#ifndef WIN32

    /* Don't give the socket to any subprograms we execute. */
    if (fcntl(port->sock, F_SETFD, FD_CLOEXEC) < 0)
        elog(FATAL, "fcntl(F_SETFD) failed on socket: %m");
#endif

Would it make sense to have a socket_set_cloexec() function, following
the example of socket_set_nonblocking(), so we could harmonise the
calling code?

I think we'll also need an accept4() function for Windows that accepts
SOCK_CLOEXEC and converts it to WSA_FLAG_NO_HANDLE_INHERIT, now that
you've pointed this out.  It's not needed for your problem report
here, and doesn't even make sense for EXEC_BACKEND by definition, but
I think we'll probably want it for multithreaded PostgreSQL on
Windows.  With backends as threads, at least in one experimental
architecture with listener in the main/only sub-postmaster process,
there is a race window between ye olde accept() and
socket_set_cloexec() that leaks handles if a subprocess is created
concurrently by any thread.  Adopting accept4() is pretty trivial
(something like v5-0001[1]), it just wasn't quite compelling enough to
commit before multithreading started heating up as a topic.  I hadn't
previously grokked that Windows will need to be able to reach its
equivalent flag too for the reason you've pointed out.

I mention that as context for my suggestion that we might want an
explicit SOCK_CLOEXEC flag, because it finishes up being conditional
in the accept4() case assuming that design: multi-process mode needs
it except in EXEC_BACKEND builds which have to call
socket_set_cloexit() in the exec'd child by definition, while
in-development multi-threaded mode needs it on all platforms.  (The
fact that macOS alone has so far refused to implement it is a bit
annoying, and potential workarounds are expensive, but that's a topic
for another day.)

[1]
https://www.postgresql.org/message-id/flat/CA%2BhUKGKPNFcfBQduqof4-7C%3DavjcSfdkKBGvQoRuAvfocnvY0A%40mail.gmail.com#abe1bf9def93e897827e878aac8a6945



Re: [PATCH] Fix socket handle inheritance on Windows

От
Bryan Green
Дата:
On 11/6/25 19:03, Thomas Munro wrote:
> On Fri, Nov 7, 2025 at 7:53 AM Bryan Green <dbryan.green@gmail.com> wrote:
>>> The socket fix adds WSA_FLAG_NO_HANDLE_INHERIT to WSASocket() in
>>> pgwin32_socket(), and calls SetHandleInformation() in
>>> BackendInitialize() to mark the inherited client socket non-inheritable.
>>> The latter is needed because handles passed to children become
>>> inheritable again on Windows.
> 
> Right, this is a problem too and your analysis makes sense.
> 
> -    s = WSASocket(af, type, protocol, NULL, 0, WSA_FLAG_OVERLAPPED);
> +    s = WSASocket(af, type, protocol, NULL, 0,
> +                  WSA_FLAG_OVERLAPPED | WSA_FLAG_NO_HANDLE_INHERIT);
> 
> I wonder if we should control this with a new be-more-like-Unix
> SOCK_CLOEXEC flag.  In practice we might never want sockets created
> this way to be inherited across CreateProcess, so we might always pass
> SOCK_CLOEXEC in places like libpq (we already do that).  But in theory
> at least, someone might have a reason to want an inheritable socket
> (no doubt with some complications), and it seems attractive to work
> the same way cross-platform and also mirror the O_CLOEXEC behaviour
> your other thread fixes, and I'm also looking further ahead to the
> case of accepted sockets that have some additional needs (see below),
> so it might be good to be explicit and consistent about the flags.
> 
> +#ifdef WIN32
> +    /*
> +     * On Windows, the client socket inherited from the postmaster becomes
> +     * inheritable again in this process. Prevent child processes spawned
> +     * by this backend from inheriting it.
> +     */
> +    if (!SetHandleInformation((HANDLE) port->sock, HANDLE_FLAG_INHERIT, 0))
> +        ereport(WARNING,
> +                (errmsg_internal("could not disable socket handle
> inheritance: error code %lu",
> +                                 GetLastError())));
> +#endif
> 
> On Unix we also need the equivalent, but it's in pq_init():
> 
> #ifndef WIN32
> 
>      /* Don't give the socket to any subprograms we execute. */
>      if (fcntl(port->sock, F_SETFD, FD_CLOEXEC) < 0)
>          elog(FATAL, "fcntl(F_SETFD) failed on socket: %m");
> #endif
> 
> Would it make sense to have a socket_set_cloexec() function, following
> the example of socket_set_nonblocking(), so we could harmonise the
> calling code?
> 

Yes, I think it would.

> I think we'll also need an accept4() function for Windows that accepts
> SOCK_CLOEXEC and converts it to WSA_FLAG_NO_HANDLE_INHERIT, now that
> you've pointed this out.  It's not needed for your problem report
> here, and doesn't even make sense for EXEC_BACKEND by definition, but
> I think we'll probably want it for multithreaded PostgreSQL on
> Windows.  With backends as threads, at least in one experimental
> architecture with listener in the main/only sub-postmaster process,
> there is a race window between ye olde accept() and
> socket_set_cloexec() that leaks handles if a subprocess is created
> concurrently by any thread.  Adopting accept4() is pretty trivial
> (something like v5-0001[1]), it just wasn't quite compelling enough to
> commit before multithreading started heating up as a topic.  I hadn't
> previously grokked that Windows will need to be able to reach its
> equivalent flag too for the reason you've pointed out.
> 

I will look over [1].

> I mention that as context for my suggestion that we might want an
> explicit SOCK_CLOEXEC flag, because it finishes up being conditional
> in the accept4() case assuming that design: multi-process mode needs
> it except in EXEC_BACKEND builds which have to call
> socket_set_cloexit() in the exec'd child by definition, while
> in-development multi-threaded mode needs it on all platforms.  (The
> fact that macOS alone has so far refused to implement it is a bit
> annoying, and potential workarounds are expensive, but that's a topic
> for another day.)
> 

If you agree, and I think you do, I will implement the SOCK_CLOEXEC 
abstraction and the socket_set_cloexec helper for this.  My bug fix will 
be the first user.

I also need to handle the handles for shared memory and pipes.  I will 
probably just write those up tonight and share for review and discussion.

> [1]
https://www.postgresql.org/message-id/flat/CA%2BhUKGKPNFcfBQduqof4-7C%3DavjcSfdkKBGvQoRuAvfocnvY0A%40mail.gmail.com#abe1bf9def93e897827e878aac8a6945


Thanks for your excellent review and suggestions.

--
Bryan Green
EDB: https://www.enterprisedb.com





Re: [PATCH] Fix socket handle inheritance on Windows

От
Thomas Munro
Дата:
On Fri, Nov 7, 2025 at 2:35 PM Bryan Green <dbryan.green@gmail.com> wrote:
> If you agree, and I think you do, I will implement the SOCK_CLOEXEC
> abstraction and the socket_set_cloexec helper for this.  My bug fix will
> be the first user.

+1

Thanks for working on all this stuff.

> I also need to handle the handles for shared memory and pipes.  I will
> probably just write those up tonight and share for review and discussion.

Cool.