Re: Refactoring postmaster's code to cleanup after child exit
From | Andres Freund |
---|---|
Subject | Re: Refactoring postmaster's code to cleanup after child exit |
Date | |
Msg-id | gdojfusnbe3ae47n6qjezclpv4462xbdc2ssoadtyklgdw5dqb@b4r5yw3fcifm |
In response to | Re: Refactoring postmaster's code to cleanup after child exit (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses | Re: Refactoring postmaster's code to cleanup after child exit |
List | pgsql-hackers |
Hi,

On 2024-12-10 12:00:12 +0200, Heikki Linnakangas wrote:
> On 09/12/2024 22:55, Heikki Linnakangas wrote:
> > Not sure how to fix this. A small sleep in the test would work, but in
> > principle there's no delay that's guaranteed to be enough. A more robust
> > solution would be to run a "select count(*) from pg_stat_activity" and
> > wait until the number of connections are what's expected. I'll try that
> > and see how complicated that gets..
>
> Checking pg_stat_activity doesn't help, because the backend doesn't register
> itself in pg_stat_activity until later. A connection that's rejected due to
> connection limits never shows up in pg_stat_activity.
>
> Some options:
>
> 0. Do nothing
>
> 1. Add a small sleep to the test
>
> 2. Move the pgstat_bestart() call earlier in the startup sequence, so that a
> backend shows up in pg_stat_activity before it acquires a PGPROC entry, and
> stays visible until after it has released its PGPROC entry. This would give
> more visibility to backends that are starting up.

We don't necessarily *have* a PGPROC entry for that backend when we run out of
connections, no?

> 3. Rearrange the FATAL error handling so that the process removes itself
> from PGPROC before sending the error to the client. That would be kind of
> nice anyway. Currently, if sending the rejection error message to the client
> blocks, you are holding up a PGPROC slot until the message is sent. The
> error message packet is short, so it's highly unlikely to block, but still.

This is definitely a problem, there was even a recent thread about it. It can
be triggered even with just an ERROR message though :(

For this test, could we perhaps rely on the log messages postmaster logs when
child processes exit?

2025-03-04 17:56:12.528 EST [3509838][not initialized][:0][[unknown]] LOG: connection received: host=[local]
2025-03-04 17:56:12.528 EST [3509838][client backend][:0][[unknown]] FATAL: sorry, too many clients already
2025-03-04 17:56:12.529 EST [3509817][postmaster][:0][] DEBUG: releasing pm child slot 2
2025-03-04 17:56:12.529 EST [3509817][postmaster][:0][] DEBUG: client backend (PID 3509838) exited with exit code 1

I.e. the test could wait for the 'client backend exited' message using
->wait_for_log()?

Greetings,

Andres Freund
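
To illustrate the wait-for-log idea, here is a minimal Python sketch of polling a server log for the postmaster's exit message. It is not the TAP framework's Perl `wait_for_log()`. The function name, its parameters, and the timeout behaviour are assumptions chosen for illustration only. The point it shows is that the test waits on an observable event (the log line) rather than on a fixed sleep, and that it remembers a byte offset so that a later wait does not match an earlier backend's exit.

```python
import re
import time


def wait_for_log(path, pattern, offset=0, timeout=5.0, interval=0.05):
    """Poll the file at `path` until `pattern` matches text written past `offset`.

    Returns the byte offset of the end of the data that was read when the
    match was found, so the caller can pass it as `offset` to the next wait.
    Raises TimeoutError if no match appears within `timeout` seconds.
    (Hypothetical helper; names and defaults are illustrative.)
    """
    regex = re.compile(pattern)
    deadline = time.monotonic() + timeout
    while True:
        # Binary mode keeps offsets in bytes, so seek() is always valid.
        with open(path, "rb") as f:
            f.seek(offset)
            data = f.read()
        if regex.search(data.decode("utf-8", errors="replace")):
            return offset + len(data)
        if time.monotonic() >= deadline:
            raise TimeoutError(f"pattern {pattern!r} not found in {path}")
        time.sleep(interval)
```

A test along these lines would record the log offset before opening the connection that is expected to be rejected, then wait for a pattern such as `client backend \(PID \d+\) exited with exit code 1` starting at that offset. Note that in the log excerpt above the exit message is logged at DEBUG level, so it would only appear if the server's log level is set low enough.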