New releases, and happy new year!
From | Daniele Varrazzo |
---|---|
Subject | New releases, and happy new year! |
Date | |
Msg-id | CA+mi_8bme8iBFTcoDBYHVeKXSQQSccM6MUDVT-=kMp4GLsCDqQ@mail.gmail.com |
List | psycopg |
Hello everyone,

It's been a while since I last wrote to the ML, but I wanted to take the occasion of the first psycopg release of the new year to say hi :)

I have just released psycopg 3.1.17 and psycopg-pool 3.2.1. They contain only a few bug fixes, but I wanted to release them soon to get some feedback quickly, as they address shortcomings in new developments. Changelogs with links to the issues at:

- https://www.psycopg.org/psycopg3/docs/news.html#psycopg-3-1-17
- https://www.psycopg.org/psycopg3/docs/news_pool.html#psycopg-pool-3-2-1

For a recap of the last few developments in psycopg: since the inception of psycopg 3, the connection function uses the libpq async path, both in Python sync and async code. The async libpq connection function doesn't honour the connection timeout [1]; handling the timeout is not hard but, unfortunately, not handling it in the libpq has undesirable repercussions on other libpq features:

- Multiple hosts in the connection string are not supported. The libpq will try the first but, without a timeout, it doesn't know when to give up on it and try the second.
- Connecting to multiple hosts can also be implemented at DNS level, by associating more than one IP to the same hostname.
- Where is the host specified? It can be in the connection string, or in an environment variable, but also in a service file.
- A recent libpq feature is random host selection for load balancing [2]. Needless to say, it cannot work if we cannot try more than one host.

We introduced multiple connection attempts in 3.1.13 (ticket #674) and released several regression fixes and improvements in 3.1.15 and 3.1.16 (#694, #695, #703). In the just released 3.1.17 we moved DNS resolution to Python (#699) to cover the case of multiple IPs per host too.
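To make the idea of "multiple attempts, each with its own timeout" concrete, here is a minimal sketch using only the Python standard library. It is not psycopg's actual code: the names `resolve_all` and `connect_first` are hypothetical, and it opens plain TCP sockets rather than libpq connections. It only illustrates resolving a hostname to every IP it maps to and giving up on each address after a timeout, so that the next one gets a chance.

```python
import socket


def resolve_all(host: str, port: int) -> list[tuple[str, int]]:
    """Return every distinct (ip, port) that `host` resolves to, in order."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addrs: list[tuple[str, int]] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        addr = (sockaddr[0], sockaddr[1])
        if addr not in addrs:
            addrs.append(addr)
    return addrs


def connect_first(targets: list[tuple[str, int]], timeout: float) -> socket.socket:
    """Try each (host, port) target, and each IP behind it, in turn.

    `timeout` applies to every single attempt, which is what lets the loop
    give up on an unresponsive address and move on to the next one.
    """
    errors: list[str] = []
    for host, port in targets:
        try:
            addrs = resolve_all(host, port)
        except OSError as ex:
            errors.append(f"{host}: name resolution failed: {ex}")
            continue
        for ip, ip_port in addrs:
            try:
                return socket.create_connection((ip, ip_port), timeout=timeout)
            except OSError as ex:  # includes refused connections and timeouts
                errors.append(f"{ip}:{ip_port}: {ex}")
    raise ConnectionError("all connection attempts failed: " + "; ".join(errors))
```

A call such as `connect_first([("db1.example.com", 5432), ("db2.example.com", 5432)], timeout=5.0)` (placeholder hostnames) would return a socket to the first address that accepts the connection, or raise `ConnectionError` listing every failure.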
We were already performing name resolution in Python in order to allow non-blocking connection in async code (#259, released back in Psycopg 3.1.0), so now the sync and async connection code paths have become very similar.

There is still a lingering problem though: if multiple hostnames, or a hostname with multiple IPs, are specified in a service file rather than in a connection param or an env var, we will not handle them correctly (the async connection will be blocking and limited to the first hit). Handling the service file would be a massive code duplication with the libpq, trying to replicate the same location customization, with some of the parameters that can be a guess at best (like the C macro SYSCONFDIR, whose value I see no way to discover at runtime if the pg_config binary is not on the client, and I'm not sure we want to call it).

So it's a can of worms that hasn't been opened yet, and it is making me wonder if it wouldn't be a better idea to improve the libpq instead. We could use the libpq if it provided:

- support for an attempt timeout in PQconnectPoll. Maybe this could be introduced by keeping the state of the attempts in the PGconn object;
- customization of the getaddrinfo function, in order to pass it a callback that we might override from Python.

So I think I would first check whether there is some desire to improve the libpq, before jumping into the reimplementation of the service file handling.

On the connection pool front, in psycopg-pool 3.2.0 we introduced, among other new features, a callback to check the state of a connection on getconn (a request I had resisted implementing for some time, but which seems to have become necessary because of the aggressive disconnection of idle connections that either people are imposing on themselves or that just comes attached with postgres-as-a-service providers). In 3.2.1 we have improved the retry loop around this check function to avoid accidental busy loops (using the same backoff policy used in connection attempts).
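As an illustration of how a retry loop around a check function can avoid busy-looping, here is a small sketch with exponential backoff. It is not the psycopg-pool implementation: `get_checked` and all its parameters and defaults (`initial_delay`, `backoff`, `jitter`) are assumptions made up for this example. In psycopg-pool itself the callback is passed through the pool's `check` parameter; here the pool is reduced to three plain callables.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def get_checked(
    getconn: Callable[[], T],
    check: Callable[[T], None],
    discard: Callable[[T], None],
    timeout: float = 30.0,
    initial_delay: float = 1.0,  # assumed value, for illustration only
    backoff: float = 2.0,        # assumed value, for illustration only
    jitter: float = 0.1,         # assumed value, for illustration only
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Return a connection that passes `check`, waiting between failures.

    `check` must raise an exception if the connection is not usable.
    After each failure the wait grows by `backoff`, so a pool full of
    broken connections cannot turn into a tight loop.
    """
    deadline = clock() + timeout
    delay = 0.0
    while True:
        conn = getconn()
        try:
            check(conn)
            return conn
        except Exception:
            discard(conn)  # the connection is broken: don't reuse it

        if delay == 0.0:
            # First failure: start from the initial delay, slightly randomized
            delay = initial_delay * (1.0 + random.uniform(-jitter, jitter))
        else:
            delay *= backoff

        if clock() + delay > deadline:
            raise TimeoutError("no valid connection within the timeout")
        sleep(delay)
```

The `sleep` and `clock` parameters exist only so that the loop can be exercised without real waiting; in normal use the defaults are fine.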
We hope that psycopg is serving you well. As usual, feedback is welcome and we thank you very much for your contributions (with good bug reports, features, ideas).

Happy hacking and happy new year!

-- Daniele

[1] https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PQCONNECTSTARTPARAMS
    "The connect_timeout connection parameter is ignored when using PQconnectPoll; it is the application's responsibility to decide whether an excessive amount of time has elapsed."
[2] https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNECT-LOAD-BALANCE-HOSTS