Re: libpq: Process buffered SSL read bytes to support records >8kB on async API

Поиск
Список
Период
Сортировка
От Jacob Champion
Тема Re: libpq: Process buffered SSL read bytes to support records >8kB on async API
Дата
Msg-id CAOYmi+==Y5ZZiwukN-wzGCyxdrL1dU3wOG-j0nza3MC3R2XhcA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: libpq: Process buffered SSL read bytes to support records >8kB on async API  (Jacob Champion <jacob.champion@enterprisedb.com>)
Ответы Re: libpq: Process buffered SSL read bytes to support records >8kB on async API
Список pgsql-hackers
On Wed, Jul 2, 2025 at 4:12 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> I'll work on proving that code paths other than PQconsumeInput() are
> affected. If they are, I'll start a patch for pqReadData().

Here's one way for a server implementation to hang the client during
PQconnectPoll(): accept the SSLRequest and immediately send a protocol
2.0 error message of (16k + 1) bytes, but split the message across two
TLS records of 8k and (8k + 1) bytes each, and wait for the client to
close the connection.

This case works fine with synchronous connect, but it fails with
async. libpq will fill the first 8k of its 16k input buffer with the
first record. It fills the second half with all but one byte of the
second record, and since it doesn't find a null terminator anywhere in
the buffer, it tells the client to poll on the socket again. Since the
server in this case is waiting for the client to close first, no
additional socket activity is coming, and the connection deadlocks
with one byte pending in the SSL buffer.

I use 2.0 as an example; libpq avoids getting stuck in 3.0 by
enlarging the input buffer to the size of the incoming message. But
that's supposed to be a performance optimization, not anti-deadlock
protection. Even if we were to enshrine the buffer sizes we use in the
protocol specification itself [1], someone's still likely to trip over
this hazard if they start adjusting those performance optimizations.
(Even our use of a 16k initial buffer is itself just an optimization.)

So I think pqReadData() needs to make the same guarantees for SSL/GSS
that it does for plain TCP: when it returns, either 1) there are no
bytes left pending or 2) the socket itself is marked readable.
Otherwise I think we'll continue to chase weird corner cases.

What I'm not sure about is unforeseen consequences -- could fixing
this expose other latent bugs? If we're concerned about that, we might
have to tier the backports to make sure everything remains stable.
Here are the three bugs discussed so far, with the fix for 3 possibly
subsuming 1 and 2 but also affecting more code:

1) pqSocketCheck() checks for pending SSL bytes but not GSS bytes
2) PQconsumeInput() does not drain all pending bytes
3) pqReadData() does not drain all pending bytes

Thanks,
--Jacob

[1] We should not do that.



В списке pgsql-hackers по дате отправления: