On Mon, Nov 22, 2021 at 9:24 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> Hmm. Well, if I understand how this works (and I'm not too familiar
> with this Windows code so maybe I don't), the postmaster duplicates
> the socket into the child process (see
> {write,read}_inheritable_socket()) and then closes its own handle (see
> ServerLoop()'s call to StreamClose(port->sock)). What if the
> postmaster kept the socket open, and then closed its copy after the
> child exits? Then, I guess, maybe, Winsock socket state would live on
> with a non-zero reference count and be able to perform the proper
> graceful TCP shutdown dance, at least as long as the postmaster itself
> is up. Various other ideas: don't do that, but duplicate the socket
> back into the postmaster before exit, or into some other process, or
> rewrite PostgreSQL to use threads...

Hmm, maybe it's still not enough. Now that I have coffee, I thought
about the well-known failure of idle_in_transaction_session_timeout to
report errors on Windows[1]. There'd be no RST on timeout with the
above approach, which is good. But the next time you try to send a
query, perhaps a race begins: the server's TCP stack receives the query
packet and replies with RST (the "normal" kind that is a response to
unreceivable data, not the linger=0 kind that is proactively sent).
Meanwhile the client begins to read, and *probably* reads the already
buffered idle-in-transaction-timeout error message, but with unlucky
scheduling the RST arrives first and drops the buffered data (unlike
on Unix), right?
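
To make the reference-count idea from the quoted message concrete, here
is a rough sketch in Python rather than PostgreSQL's C. It is not the
postmaster code: socket.dup() within a single process stands in for
duplicating the socket across processes, close() stands in for the
backend exiting, and the function name and message text are made up for
illustration. It only shows that the connection stays open while a
second handle exists. It does not reproduce what Windows does when a
process exits, and it does not exercise the RST race described above.

```python
import socket


def demo_duplicate_keeps_connection_open():
    """Hypothetical demo: a second handle keeps a TCP connection open."""
    # Loopback listener on an ephemeral port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    client.settimeout(0.5)
    backend_sock, _ = listener.accept()

    # Assumption: this stands in for the postmaster keeping its own handle.
    kept_copy = backend_sock.dup()

    # Stand-in for the backend: send a final error message, then go away.
    final_message = b"FATAL: terminating connection\n"
    backend_sock.sendall(final_message)
    backend_sock.close()

    # Read until the newline that ends the message.
    received = b""
    while not received.endswith(b"\n"):
        chunk = client.recv(1024)
        if not chunk:
            break
        received += chunk

    # While kept_copy is open, the client should see neither EOF nor a
    # reset: the read is expected to time out instead.
    try:
        client.recv(1024)
        open_while_copy_held = False
    except socket.timeout:
        open_while_copy_held = True

    # Closing the last handle lets the connection shut down; the client
    # then reads EOF (an empty bytes object).
    kept_copy.close()
    eof_after_last_close = client.recv(1024) == b""

    client.close()
    listener.close()
    return {
        "message": received,
        "open_while_copy_held": open_while_copy_held,
        "eof_after_last_close": eof_after_last_close,
    }


if __name__ == "__main__":
    print(demo_duplicate_keeps_connection_open())
```

The client sends nothing after the error message here, so the final
close is a plain graceful shutdown. The interesting case above, where
the client sends a query into a connection the server has already given
up on, is exactly what this sketch leaves out.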

[1] https://www.postgresql.org/message-id/CAP3o3PdzM0BLmNBELA5wV6YoN_1yYBVdoOvz9kYbOuK-YQGFAw%40mail.gmail.com