Re: Built-in connection pooling
От | Konstantin Knizhnik |
---|---|
Тема | Re: Built-in connection pooling |
Дата | |
Msg-id | a1fdf4eb-4ee0-b35e-34e9-36fadfcbceff@postgrespro.ru обсуждение исходный текст |
Ответ на | Re: Built-in connection pooling (Bruce Momjian <bruce@momjian.us>) |
Ответы |
Re: Built-in connection pooling
|
Список | pgsql-hackers |
On 28.01.2018 03:40, Bruce Momjian wrote:
On Mon, Jan 22, 2018 at 06:51:08PM +0100, Tomas Vondra wrote:Yes, external connection pooling is more flexible. It allows to perform pooling either at client side either at server side (or even combine two approaches).> Also external connection pooling for PostgreSQL is not limited by pgbouncer/pgpool.> There are many frameworks maintaining their own connection pool, for example J2EE, jboss, hibernate,...> I have a filling than about 70% of enterprise systems working with databases are written in Java and doing connection pooling in their own way.>True, but that does not really mean we don't need "our" connection pooling (built-in or not). The connection pools are usually built into the application servers, so each application server has their own independent pool. With larger deployments (a couple of application servers) that quickly causes problems with max_connections.I found this thread and the pthread thread very interesting. Konstantin, thank you for writing prototypes and giving us very useful benchmarks for ideas I thought I might never see. As much as I would like to move forward with coding, I would like to back up and understand where we need to go with these ideas. First, it looks like pthreads and a builtin pooler help mostly with 1000+ connections. It seems like you found that pthreads wasn't sufficient and the builtin pooler was better. Is that correct?
Brief answer is yes.
Pthreads allows to minimize per-connection overhead and make it possible to obtain better results for large number of connections.
But there is a principle problem: Postgres connection is "heave weight" object: each connection maintains it own private cache of catalog, relations, temporary
table pages, prepared statements,... So even through pthreads allows to minimize per-connection memory usage, it is negligible comparing with all this connection
private memory resources. It means that we still need to use connection pooling.
Pthreads provides two main advantages:
1. Simplify interaction between different workers: on need to use shared memory with it's fixed size limitation and
impossibility to use normal pointer for dynamic shared memory. Also no need to implement specialized memory allocator for shared memory.
It makes implementation of parallel query execution and built-on connection pooling much easier.
2. Optimize virtual-to-physical address translation. There is no need to maintain separate address space for each backend, so TLB (translation lookaside buffercan) becomes more efficient.
So it is not completely correct to consider session pooling as alternative to pthreads.
Ideally this two approaches should be combined.
I think that the best approach is to switch to global (shared) caches for execution plans, catalog,...Is there anything we can do differently about allowing long-idle connections to reduce their resource usage, e.g. free their caches? Remove from PGPROC? Could we do it conditionally, e.g. only sessions that don't have open transactions or cursors?
Most of the time this metadata caches are used to be identical for all clients. So it is just waste of memory and time to maintain them separately in each backend.
Certainly shared cached requires some synchronization when can be a point of contention and cause significant degrade of performance.
But taking in account that metadata is updated much rarely than data, I hope using copy-on-write and atomic operations can help to solve this problems.
And in can give a lot of different advantages. For example it will be possible to spend more time in optimizer for detecting optimal execution plan and store manually plans for
future use.
It feels like user and db mismatches are always going to cause pooling problems. Could we actually exit and restart connections that have default session state?
Well, combining multiuser access and connection pooling is really a challenged problem.
I do not know the best solution for it now. It will be much simpler to find solution with pthreads model...
Most of enterprise systems are using pgbouncer or similar connection pooler. In pgbouncer in statement/transaction pooling mode access to the database is performed under the same user. So it means that many existed statements are built in the assumption that database is accessed in this manner.
Concerning "default session state": one of the main drawbacks of pgbouncer and other external poolers is that them do not allow to use prepared statements.
And it leads to up to two times performance penalty on typical OLTP queries. One of the main ideads of built-on session pooling was to eliminate such limitation.
It will require changes in client applications, will not it? Them should be ready that connection can be dropped by server at any moment of time.Right now, if you hit max_connections, we start rejecting new connections. Would it make sense to allow an option to exit idle connections when this happens so new users can connect?
I do not know it is possible to drop idle connection and hide this fact from the client. In my implementation each session keeps minimal necessary information requires for interaction with client (session context). It includes socket, struct Port and session memory context which should be used instead of TopMemoryContext for session specific data.
I know we have relied on external connection poolers to solve all the high connection problems but it seems there might be simple things we can do to improve matters. FYI, I did write a blog entry comparing external and internal connection poolers: https://momjian.us/main/blogs/pgblog/2017.html#April_21_2017
I completely agree with your arguments in this post.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Kyotaro HORIGUCHIДата:
Сообщение: Re: [HACKERS] proposal - Default namespaces for XPath expressions(PostgreSQL 11)
Следующее
От: Fabien COELHOДата:
Сообщение: Re: PATCH: pgbench - option to build using ppoll() for largerconnection counts