SSL renegotiation and other related woes

From	Andres Freund
Subject	SSL renegotiation and other related woes
Date	26 January 2015, 13:14:23
Msg-id	20150126101405.GA31719@awork2.anarazel.de
Replies	Re: SSL renegotiation and other related woes (Andres Freund <andres@2ndquadrant.com>) Re: SSL renegotiation and other related woes (Heikki Linnakangas <hlinnakangas@vmware.com>)
List	pgsql-hackers

Hi,

When working on getting rid of ImmediateInterruptOK I wanted to verify
that ssl still works correctly. Turned out it didn't. But neither did it
in master.

Turns out there are two major things we do wrong:

1) We ignore the rule that once SSL_read()/SSL_write() has been called
and has returned SSL_ERROR_WANT_(READ|WRITE), it has to be called again
with the same parameters. Unfortunately that rule doesn't just mean that
the same parameters have to be passed in, but also that we can't
constantly switch between SSL_read() and SSL_write(). Especially the
nonblocking backend code (i.e. walsender) and the whole frontend code
violate this rule.

2) We start renegotiations in be_tls_write() while in nonblocking mode,
but don't properly retry to handle socket readiness. There's a loop that
retries handshakes twenty times (???), but what is actually needed is to
call SSL_get_error() and ensure that there's actually data available.
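
To make 2) concrete, here is a minimal sketch of a handshake loop that waits for the socket in whichever direction the last attempt asked for, instead of retrying a fixed number of times. It is not the actual patch. The names step_result, step_fn, drive_handshake and the scripted stand-in are hypothetical; in real code the step callback would wrap SSL_do_handshake() and translate SSL_get_error() into these values.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical outcome of one handshake attempt. Real code would derive
 * it from SSL_get_error() after calling SSL_do_handshake(). */
typedef enum { STEP_DONE, STEP_WANT_READ, STEP_WANT_WRITE, STEP_FAILED } step_result;

/* Hypothetical callback performing one handshake attempt. */
typedef step_result (*step_fn)(void *ctx);

/*
 * Repeat the handshake step until it completes. After each incomplete
 * attempt, wait until the socket is ready in the requested direction.
 * Returns 0 on success, -1 on failure, timeout or poll() error.
 */
int
drive_handshake(int fd, step_fn step, void *ctx, int timeout_ms)
{
    for (;;)
    {
        step_result r = step(ctx);
        struct pollfd pfd;
        int rc;

        if (r == STEP_DONE)
            return 0;
        if (r == STEP_FAILED)
            return -1;

        pfd.fd = fd;
        pfd.events = (r == STEP_WANT_READ) ? POLLIN : POLLOUT;
        pfd.revents = 0;

        do
            rc = poll(&pfd, 1, timeout_ms);
        while (rc < 0 && errno == EINTR);

        if (rc <= 0)
            return -1;          /* timeout or poll() error */
    }
}

/* Scripted stand-in for a real handshake, only there to exercise the loop. */
typedef struct { const step_result *script; int pos; } scripted_ctx;

static step_result
scripted_step(void *ctx)
{
    scripted_ctx *s = ctx;

    return s->script[s->pos++];
}

/* Usage example on a local socket pair; returns 0 if everything behaved. */
int
demo_handshake(void)
{
    static const step_result ok[] = { STEP_WANT_WRITE, STEP_WANT_READ, STEP_DONE };
    static const step_result stall[] = { STEP_WANT_READ, STEP_DONE };
    scripted_ctx a = { ok, 0 };
    scripted_ctx b = { stall, 0 };
    int sv[2];
    int done, timed_out;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    if (write(sv[1], "x", 1) != 1)      /* makes sv[0] readable */
        return -1;

    done = drive_handshake(sv[0], scripted_step, &a, 1000);
    /* nothing was sent towards sv[1], so waiting for input times out */
    timed_out = drive_handshake(sv[1], scripted_step, &b, 50);

    close(sv[0]);
    close(sv[1]);
    return (done == 0 && a.pos == 3 && timed_out == -1 && b.pos == 1) ? 0 : -1;
}
```

The point of the sketch is only that the retry is driven by socket readiness rather than by a counter; error reporting and interrupt handling are left out.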

2) is easy enough to fix, but 1) is pretty hard. Before anybody says
that 1) isn't an important rule: It reliably causes connection aborts
within a couple renegotiations. The higher the latency the higher the
likelihood of aborts. Even locally it doesn't take very long to
abort. Errors usually are something like "SSL connection has been closed
unexpectedly" or "SSL Error: sslv3 alert unexpected message" and a host
of other similar messages. There's a couple reports of those in the
archives and I've seen many more in client logs.

As far as I can see the only realistic way to fix 1) is to change both
frontend and backend code to:
a) Always check for socket read/writeability before calling
SSL_read/write() when in nonblocking mode. That's a bit annoying because
it nearly doubles the number of syscalls we do for client communication,
but I can't really see an alternative. That allows us to avoid waiting
inside after a WANT_READ/WRITE, or having to set up a larger state
machine that keeps track of what we tried last.

b) When SSL_read/write nonetheless returns WANT_READ/WRITE, even though
we tested for read/writeability, we're very likely doing renegotiation.
In that case we'll just have to block. There's already code that busy
loops (and thus waits) in the frontend (cf. pgtls_read's WANT_WRITE
case, triggered during reneg). We can't just return immediately to the
upper layers as we'd otherwise likely violate the rule about calling ssl
with the same parameters again.

c) Add a somewhat hacky optimization where we allow breaking out of a
WANT_READ condition on a nonblocking socket when ssl->state ==
SSL_ST_OK. That's the case where it is actually, at least by my reading
of the unreadable ssl code, safe to not wait. That case is somewhat
important because we can otherwise end up waiting on both sides due to
b), even when nonblocking calls were actually made. That condition
essentially means that we'll only block if renegotiation or partial
reads are in progress. Afaics at least.

d) Remove the SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER hack - we don't actually need it anymore.
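
Steps a) and b) could be sketched roughly as follows. This is an illustration under assumptions, not the preliminary implementation mentioned below. The names io_result, io_fn, wait_ready, nonblocking_io and the mock are hypothetical; the callback stands in for a wrapper around SSL_read()/SSL_write() plus SSL_get_error(). The optimization from c) is deliberately left out.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical outcomes; real code would map SSL_get_error() onto these. */
typedef enum { IO_OK, IO_WANT_READ, IO_WANT_WRITE, IO_FAILED } io_result;

/* Hypothetical wrapper around SSL_read()/SSL_write(); sets *n on IO_OK. */
typedef io_result (*io_fn)(void *ctx, void *buf, int len, int *n);

/* Returns 1 if fd is ready, 0 on timeout, -1 on error.
 * timeout_ms = 0 only probes, timeout_ms = -1 blocks. */
int
wait_ready(int fd, short events, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = events;
    pfd.revents = 0;
    do
        rc = poll(&pfd, 1, timeout_ms);
    while (rc < 0 && errno == EINTR);
    return (rc < 0) ? -1 : (rc > 0);
}

/* Returns bytes transferred (> 0), 0 if the socket was not ready and the
 * operation was never started, or -1 on failure. */
int
nonblocking_io(int fd, short events, io_fn op, void *ctx, void *buf, int len)
{
    int ready = wait_ready(fd, events, 0);      /* a) probe before calling */
    int n = 0;

    if (ready <= 0)
        return ready;
    for (;;)
    {
        io_result r = op(ctx, buf, len, &n);    /* same buf and len each time */

        if (r == IO_OK)
            return n;
        if (r == IO_FAILED)
            return -1;
        /* b) the call was started, so block until it can be repeated */
        if (wait_ready(fd, r == IO_WANT_READ ? POLLIN : POLLOUT, -1) < 0)
            return -1;
    }
}

/* Mock operation: the first call asks for input (as a renegotiation would),
 * the second call succeeds and records whether the arguments were unchanged. */
typedef struct { int calls; void *first_buf; int first_len; int same_args; } mock_state;

static io_result
mock_op(void *ctx, void *buf, int len, int *n)
{
    mock_state *m = ctx;

    if (m->calls++ == 0)
    {
        m->first_buf = buf;
        m->first_len = len;
        return IO_WANT_READ;
    }
    m->same_args = (buf == m->first_buf && len == m->first_len);
    *n = len;
    return IO_OK;
}

/* Usage example on a local socket pair; returns 0 if everything behaved. */
int
demo_nonblocking_io(void)
{
    char msg[] = "hello";
    mock_state m = {0};
    mock_state idle = {0};
    int sv[2];
    int sent, skipped;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    if (write(sv[1], "x", 1) != 1)      /* makes sv[0] readable */
        return -1;

    sent = nonblocking_io(sv[0], POLLOUT, mock_op, &m, msg, 5);
    /* sv[1] has no input pending, so the operation is not even started */
    skipped = nonblocking_io(sv[1], POLLIN, mock_op, &idle, msg, 5);

    close(sv[0]);
    close(sv[1]);
    return (sent == 5 && m.calls == 2 && m.same_args &&
            skipped == 0 && idle.calls == 0) ? 0 : -1;
}
```

Returning early is only done before the operation has been started; once the callback has reported WANT_READ/WANT_WRITE the loop stays inside until the same call goes through, which is what keeps the "same parameters" rule intact.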

These errors are much less frequent when using a plain frontend
(e.g. psql/pgbench) because they don't use copy both stuff - the way
these clients use the FE/BE protocol there's essentially natural
synchronization points where nothing but renegotiation happens. With
walsender (or pipelined queries!) both sides can write at the same time.

My testcase for this is just to set up a server with a low
ssl_renegotiation_limit, generate lots of WAL (wal.sql attached) and
receive data via pg_receivexlog -n. Usually it'll error out quickly.

I've done a preliminary implementation of the above steps and it
survives transferring 25GB of WAL via the replication protocol with a
ssl_renegotiation_limit=100kB - previously it failed much earlier.

Does anybody have a neater way to tackle this? I'm not happy about this
solution, but I really can't think of anything better (save ditching
openssl maybe). I'm willing to clean up my hacked up fix for this, but
not if we can't find agreement on the approach.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
