Discussion: PostgreSQL 17: Bug in libpq when libpq is dlopened/closed multiple times

PostgreSQL 17: Bug in libpq when libpq is dlopened/closed multiple times

From
Daniel Schreiber
Date:
Dear PostgreSQL developers,

My colleagues and I believe we have found a bug in libpq that appears
when libpq is dlopened and closed multiple times during the lifetime of
a process. In our setup we use a PAM module that links to libpq. The
process using PAM is linked against OpenSSL, so OpenSSL stays loaded for
the entire lifetime of the process, whereas libpq is loaded only during
PAM authentication (and unloaded when PAM has finished).

We observed the bug on a Debian 13 system using the libpq package
shipped by Debian. To reproduce the bug, compile the attached C file
using the following gcc command line:

gcc libpq1-dlopen.c -Wall -Wextra -o libpq1-dlopen -ldl -lssl -lcrypto

Then run the binary with a PostgreSQL connection string as its argument.
The connection string must include 'sslmode=require'. In a loop, the
program dlopens libpq, connects to the server, closes the connection,
and unloads libpq.
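
For readers without the attachment, one such cycle might look roughly
like the sketch below. It is not the attached reproducer. The helper
names run_cycle and run_cycles are hypothetical, the library name passed
in (for example "libpq.so.5") is assumed to be resolvable by the dynamic
loader, and the enum returned by PQstatus is treated as a plain int so
that libpq-fe.h is not needed at compile time.

```c
/* Sketch of one load/connect/unload cycle; not the attached reproducer. */
#include <dlfcn.h>
#include <stdio.h>

/* Opaque handle; the real definition lives inside libpq. */
typedef struct pg_conn PGconn;

/* Shapes of the libpq entry points used here. PQstatus returns an enum
 * in libpq-fe.h; treating it as int is an assumption of this sketch. */
typedef PGconn *(*connectdb_fn)(const char *conninfo);
typedef int (*status_fn)(const PGconn *conn);
typedef void (*finish_fn)(PGconn *conn);

/* CONNECTION_OK is the first enumerator (0) of libpq's ConnStatusType. */
enum { SKETCH_CONNECTION_OK = 0 };

/*
 * Returns 0 if the connection succeeded, 1 if it failed,
 * -1 if the library could not be loaded, -2 if a symbol is missing.
 */
int run_cycle(const char *libname, const char *conninfo)
{
    void *lib = dlopen(libname, RTLD_NOW | RTLD_LOCAL);
    if (lib == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    connectdb_fn p_connectdb = (connectdb_fn) dlsym(lib, "PQconnectdb");
    status_fn p_status = (status_fn) dlsym(lib, "PQstatus");
    finish_fn p_finish = (finish_fn) dlsym(lib, "PQfinish");
    if (p_connectdb == NULL || p_status == NULL || p_finish == NULL) {
        dlclose(lib);
        return -2;
    }

    PGconn *conn = p_connectdb(conninfo);
    int ok = (conn != NULL && p_status(conn) == SKETCH_CONNECTION_OK);
    if (conn != NULL)
        p_finish(conn);

    /* If this was the last reference, libpq and its static state go away. */
    dlclose(lib);
    return ok ? 0 : 1;
}

/* Repeats the cycle; returns the 1-based cycle that first failed, or 0. */
int run_cycles(const char *libname, const char *conninfo, int cycles)
{
    for (int i = 1; i <= cycles; i++)
        if (run_cycle(libname, conninfo) != 0)
            return i;
    return 0;
}
```

Calling run_cycles("libpq.so.5", conninfo, 200) from a process that is
itself linked against OpenSSL would mirror the setup described above.
On older glibc versions the sketch needs -ldl at link time.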

According to our findings, every time a connection is established after
dlopening libpq, one of the 127 available BIO_METHOD structures in
OpenSSL is consumed:
https://github.com/postgres/postgres/blob/REL_17_9/src/interfaces/libpq/fe-secure-openssl.c#L1987

So after 127 cycles, registering the callbacks fails, and in our use case
the application is no longer able to authenticate using PAM. As a
workaround, we LD_PRELOAD libpq in the application.
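
The exhaustion can be illustrated with a small toy model. Every name in
it is hypothetical and none of it is OpenSSL or libpq code. It only
assumes what is reported above: a pool of 127 slots owned by a library
that stays loaded, and a second library that forgets which slot it took
each time it is reloaded.

```c
/* Toy model only: none of these names are OpenSSL or libpq APIs. */
#include <stdbool.h>

#define MODEL_SLOTS 127     /* pool size as reported in this thread */

/* State owned by the long-lived library (loaded for the whole process). */
static int model_slots_used = 0;

/* Irreversible allocation: returns a slot number, or -1 when exhausted. */
static int model_new_index(void)
{
    if (model_slots_used >= MODEL_SLOTS)
        return -1;
    return ++model_slots_used;
}

/* State owned by the short-lived library; lost on every reload. */
static int model_cached_index = 0;

/* Loading the short-lived library starts it with fresh static state. */
static void model_load(void)
{
    model_cached_index = 0;
}

/* The first connection after a load registers callbacks and takes a slot;
 * later connections within the same load reuse the cached slot. */
static bool model_connect(void)
{
    if (model_cached_index == 0) {
        int idx = model_new_index();
        if (idx == -1)
            return false;
        model_cached_index = idx;
    }
    return true;
}

/* Returns how many load/connect/unload cycles complete before a failure. */
int model_cycles_until_failure(int max_cycles)
{
    model_slots_used = 0;               /* start from a fresh process */
    for (int cycle = 0; cycle < max_cycles; cycle++) {
        model_load();
        if (!model_connect())
            return cycle;               /* cycles completed so far */
        /* unload: nothing hands the slot back to the pool */
    }
    return max_cycles;
}
```

Under these assumptions, model_cycles_until_failure(1000) returns 127.
Keeping the short-lived library loaded, as the LD_PRELOAD workaround
does, corresponds to calling model_load() only once.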

I am not yet subscribed to the mailing list, so please CC me.

Thank you,

Daniel
-- 
Daniel Schreiber
Facharbeitsgruppe Systemsoftware
Universitaetsrechenzentrum

Technische Universität Chemnitz
Straße der Nationen 62 (Raum B303)
09111 Chemnitz
Germany

Tel:     +49 371 531 35444


Re: PostgreSQL 17: Bug in libpq when libpq is dlopened/closed multiple times

From
Jacob Champion
Date:
On Fri, Apr 17, 2026 at 7:33 AM Daniel Schreiber
<daniel.schreiber@hrz.tu-chemnitz.de> wrote:
> my colleagues and I probably found a bug in libpq when libpq is dlopened
> and closed multiple times during the lifetime of a process. In our setup
> we use a PAM module which links to libpq. The process using PAM is
> linked against openssl, so openssl is loaded during the complete
> lifetime of the process whereas libpq is loaded only during PAM
> authentication (and unloaded when PAM has finished).
>
> [snip]
>
> According to our findings every time a connection is established after
> dlopening libpq one of the 127 available BIO_METHOD structures in
> OpenSSL is consumed:
> https://github.com/postgres/postgres/blob/REL_17_9/src/interfaces/libpq/fe-secure-openssl.c#L1987

Right. I think in this *particular* case, we should simply skip the
call to BIO_get_new_index(). We don't need it, IIUC.

But I think we may also need to set expectations on whether or not
infinite dlopen/dlclose loops are supported in general. If we ever
come across a situation in which a call to BIO_get_new_index() is
necessary, that leak just fundamentally can't be plugged. The same is
true for any third-party libraries (or their dependencies, or
theirs...) that require "one-time", irreversible calls which can't be
tracked after we're unloaded. And we can't push these concerns up to
the top level application developer, because they don't know we exist.

(I'd be surprised if this were the only such resource leak across all
supported versions and combinations of Kerberos, OpenSSL, OpenLDAP,
Curl, etc. etc. From a quick search, you're the first to report this
in the ten years since the leak was introduced, so there may be more
dragons where you're headed.)

--Jacob