Discussion: Pgbench: remove synchronous prepare
Hi, Hackers!

I was testing a connection pooler with pgbench and pgbench froze. I checked the traffic and noticed that pgbench blocks execution while it waits for the response to the prepare command.

To reproduce the problem, it is enough to run pgbouncer in session pooling mode and use more clients than the pool size. With a pool size of 20:

    pgbench -h localhost -p 6432 --client=21 --jobs=1 -S -T 1000 -P 1 postgres --protocol=prepared

Pgbench with the extended protocol flag does not have this issue because pgbench sends the whole parse/bind/execute/sync packet sequence at once and waits for the result asynchronously. I suggest implementing this behavior for the prepared protocol too.

I attached the pgbouncer configuration to reproduce the issue and the proposed fix. I prefer to add a new function to libpq-fe instead of changing the existing behavior or adding a new state to pgbench. Although it is largely duplicated code, it looks to be as non-invasive as possible. Implementation and naming need to be discussed.

Tests for pgbench passed. I made small changes to the expected output.

Regards,
Dmitrii Bondar.
Add docs, fix a function number.
On Mon, Mar 23, 2026 at 11:45 AM Dmitrii Bondar <d.bondar@postgrespro.ru> wrote:
> Rebase.

Hi Dmitrii,
I tested the latest patch with PgBouncer in session pooling mode (pool size 20, 21 clients).
Before applying the patch, pgbench got stuck under this setup and eventually hit a query_wait_timeout error.
After applying the patch, pgbench runs smoothly even when clients are queued. I can see continuous progress output and normal throughput (~60k TPS), with no errors or stalls.
The change works well in my testing.
Thanks for the patch!
Regards,
Lakshmi G
Hi!
Thank you for reviewing my patch! Should I consider your review complete and move the patch to ‘ready for committer’?
Hi Dmitrii,
Yes, my review is complete. The patch works well in my testing and resolves the blocking issue without any regressions.
You can move it to 'Ready for Committer.'
Regards,
Lakshmi G
On Mon, Mar 16, 2026 at 3:46 AM Dmitrii Bondar <d.bondar@postgrespro.ru> wrote:
> Rebase.

Hi,

I think that this patch is changing more behavior than is explained in the commit message. The existing code calls PQsendQueryPrepared, which only tries to execute an already-prepared query. The replacement code tries to prepare the query. It is not clear to me what's going on here. I would have expected that we would only ever reach that point in the code with the query already prepared; otherwise, the existing code would presumably fail. But if that is the case then how is the new code managing to do anything different than the old code?

Another way to see that the patch must be changing more behavior than advertised is the change to 001_pgbench_with_server.pl. That change comes with no comment changes and no explanation of any kind. If this patch were just about doing something asynchronously instead of synchronously, I think that would be fine, but I don't think that's all that is happening here.

The original post explains the problem behavior (pgbench freezing under certain circumstances) but I don't understand what causes that behavior. I think I would understand better if the original complaint were about something other than session pooling mode: then, I might expect that we might unexpectedly discover that our session does not have something prepared which we expected to find prepared, and maybe this revised logic in sendCommand() would somehow fix that. But in session pooling mode, shouldn't everything be the same as if connection pooling is not in use at all? What's actually different?

--
Robert Haas
EDB: http://www.enterprisedb.com
Hi,
The patch does not change the existing behavior when a query has already been prepared. Both the old and new code paths use PQsendQueryPrepared, which sends a bind-execute-sync packet sequence without waiting for a response.
The main difference appears when the query has not yet been prepared. In the old code, PQprepare is called, which sends a parse message and then waits for the result via PQexecFinish. Since PQexecFinish blocks until a response arrives, it can block the entire thread if the server has not responded yet.
I replaced the call to PQprepare with a call to the new PQsendPBES function. Like PQsendQueryPrepared, it works asynchronously, but it sends a parse-bind-execute-sync sequence instead. This change avoids thread blocking because it eliminates the need to call PQexecFinish.
I chose to send a parse-bind-execute-sync sequence to match the behavior of extended query mode, in which pgbench sends the same sequence, but with an unnamed statement.
The expected output for 001_pgbench_with_server.pl was changed for the following reason. In the old pgbench code, prepareCommand is called and receives "ERROR: syntax error". Since prepareCommand does not return a status, pgbench continues execution and then attempts to run the command with PQsendQueryPrepared. This leads to the error "prepared statement .* does not exist", which is raised in response to the bind packet.
In the new code, PQsendPBES sends a parse-bind-execute-sync packet sequence. If the parse step fails with "ERROR: syntax error", the server ignores all subsequent messages until the sync packet is processed. That is why the additional "prepared statement .* does not exist" error from the bind packet no longer appears.
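The difference in error output can be illustrated with a toy model of how a backend treats extended-protocol messages after an error. This is only a sketch, not PostgreSQL or pgbench code: the function run_backend, the tuple encoding of messages, and the statement name "P_0" are assumptions made for the illustration.

```python
def run_backend(messages):
    """Toy backend: after an error, discard everything up to the next Sync."""
    prepared = set()   # names of successfully parsed statements
    skipping = False   # True after an error, until a Sync arrives
    replies = []
    for msg in messages:
        kind = msg[0]
        if kind == "Sync":
            skipping = False
            replies.append("ReadyForQuery")
        elif skipping:
            continue   # message is silently ignored
        elif kind == "Parse":
            _, name, sql_is_valid = msg
            if sql_is_valid:
                prepared.add(name)
                replies.append("ParseComplete")
            else:
                skipping = True
                replies.append("ERROR: syntax error")
        elif kind == "Bind":
            _, name = msg
            if name in prepared:
                replies.append("BindComplete")
            else:
                skipping = True
                replies.append(f'ERROR: prepared statement "{name}" does not exist')
        elif kind == "Execute":
            replies.append("CommandComplete")
    return replies


# Old flow: prepare is its own round trip (Parse, Sync), then Bind/Execute/Sync.
old_flow = [("Parse", "P_0", False), ("Sync",),
            ("Bind", "P_0"), ("Execute",), ("Sync",)]
# New flow: one Parse/Bind/Execute/Sync sequence.
new_flow = [("Parse", "P_0", False), ("Bind", "P_0"), ("Execute",), ("Sync",)]

old_errors = [r for r in run_backend(old_flow) if r.startswith("ERROR")]
new_errors = [r for r in run_backend(new_flow) if r.startswith("ERROR")]
print(old_errors)
print(new_errors)
```

In this model the old flow reports two errors, because the Sync after the failed Parse ends the skip state before the Bind arrives, while the new flow reports only the syntax error.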
Session mode is indeed the most transparent way to use a pooler. However, pgbench can become stuck when the number of clients exceeds the pool size. If the pooler cannot reserve a backend for a client, it places the client in a waiting queue. In that case, pgbench may wait indefinitely because it is blocked in PQprepare, and the pgbench thread cannot process responses for other clients.
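The stall can be sketched with a small simulation. It is a deliberately simplified model, not pgbench or pgbouncer code: the function simulate and its parameters are hypothetical, one "tick" stands for one pass of a single pgbench thread over its clients, and pooler timeouts such as query_wait_timeout are ignored.

```python
def simulate(num_clients, pool_size, ticks, blocking_prepare):
    """Return (transactions per client, whether the thread stalled)."""
    # Session pooling: the first pool_size clients get a backend and keep it
    # for the whole run; the remaining clients stay in the pooler's queue.
    has_backend = [i < pool_size for i in range(num_clients)]
    transactions = [0] * num_clients
    for _ in range(ticks):
        for client in range(num_clients):
            if has_backend[client]:
                transactions[client] += 1
            elif blocking_prepare:
                # Synchronous prepare: the only thread waits here for a reply
                # that cannot arrive, so no other client is served again.
                return transactions, True
            # Asynchronous send: nothing to read for this client yet, move on.
    return transactions, False


blocked_tx, blocked_stalled = simulate(21, 20, ticks=5, blocking_prepare=True)
async_tx, async_stalled = simulate(21, 20, ticks=5, blocking_prepare=False)
print(sum(blocked_tx), blocked_stalled)
print(sum(async_tx), async_stalled)
```

With 21 clients and a pool of 20, the blocking variant stops after the first pass, while the asynchronous variant keeps the 20 connected clients running and leaves only the queued client idle.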
Regards,
Dmitrii Bondar
On Tue, Apr 28, 2026 at 5:29 AM Dmitrii Bondar <d.bondar@postgrespro.ru> wrote:
> Session mode is indeed the most transparent way to use a pooler. However, pgbench can become stuck when the number of clients exceeds the pool size. If the pooler cannot reserve a backend for a client, it places the client in a waiting queue. In that case, pgbench may wait indefinitely because it is blocked in PQprepare, and the pgbench thread cannot process responses for other clients.

Ah, I see! This is a key point I wasn't understanding previously.

Why isn't the solution to use the existing PQsendPrepare function instead of adding a new libpq entrypoint?

Even if we stick with the design you propose here, I don't think we can add a function with a name like PQsendPBES, and I think we need to find a way to more clearly explain what it does. It's kind of unfortunate that there's such a large gap between the names of the functions and the protocol messages that they send, but if all the other functions are named without reference to the underlying protocol messages, and this one is an exception, then it seems like it's going to be hard to understand.

--
Robert Haas
EDB: http://www.enterprisedb.com
> Why isn't the solution to use the existing PQsendPrepare function instead of adding a new libpq entrypoint?
pgbench is not designed to process a response to a Parse message alone, because of meta-commands. For example, the \gset command requires a tuple to be stored, but a response to a Parse message does not provide one. This leads to the error:

    pgbench: error: client 0 script 0 command 0 query 0: expected one row, got 0

Sending the remaining messages with PQsendQueryPrepared right after PQsendPrepare may look like the obvious solution, but it does not work. PQsendQueryStart does not allow more than one command to be sent unless pipeline mode is enabled.

This could be fixed in two ways: either by allowing libpq to send more than one command when pipeline mode is disabled, or by adding a new state to the pgbench state machine. Both options seem more invasive than the current solution. Adding a new libpq function just for pgbench (at least for now) does not seem ideal either, but it may be simpler and safer.
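The "one command at a time" restriction can be pictured with a toy model. It is not libpq code: the class ToyConnection and its send method are hypothetical and only mimic the rule that, outside pipeline mode, a second command is rejected while an earlier one is still in progress.

```python
class ToyConnection:
    """Toy model of the one-command-in-flight rule; not libpq itself."""

    def __init__(self, pipeline_mode=False):
        self.pipeline_mode = pipeline_mode
        self.in_flight = []   # commands sent but not yet fully answered

    def send(self, command):
        if self.in_flight and not self.pipeline_mode:
            return False      # rejected: another command is already in progress
        self.in_flight.append(command)
        return True


plain = ToyConnection()
plain_results = [plain.send("prepare"), plain.send("execute prepared")]

piped = ToyConnection(pipeline_mode=True)
piped_results = [piped.send("prepare"), piped.send("execute prepared")]
print(plain_results, piped_results)
```

In the model, issuing a prepare followed immediately by an execute succeeds only in pipeline mode, which is why a single call that sends the whole sequence is attractive for the non-pipeline case.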
> I don't think we can add a function with a name like PQsendPBES, and I think we need to
I have other suggestions:
"PQsendQueryPrepare", but it is too close to the existing name "PQsendQueryPrepared".
"PQsendPrepareQuery", similar to "PQsendPrepare", but it also executes the query.
"PQsendPrepareExecute", which is not especially well aligned with the existing naming scheme, but may describe the intent quite well.