Discussion: [DOC] Add detail regarding resource consumption wrt max_connections
The documentation for max_connections does not mention that just by having a higher value for max_connections, PostgreSQL will use more resources.
While working with different customers, I noticed that several of them set max_connections to very high numbers, even though they never expected to actually have that many connections to their PostgreSQL instance.
In one extreme case, the user set max_connections to 200000 and was befuddled that the instance was using more memory than another with the same number of connections.
This patch adds language to the documentation pointing to the fact that a higher value of max_connections leads to higher consumption of resources by Postgres, adding one paragraph to doc/src/sgml/config.sgml:
<para>
PostgreSQL sizes certain resources based directly on the value of
<varname>max_connections</varname>. Increasing its value leads to
higher allocation of those resources, including shared memory.
</para>
Sincerely,
Roberto Mello
Attachments
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: tested, passed

I think it is good to warn the user about the increased allocation of memory for certain parameters so that they do not abuse it by setting it to a huge number without knowing the consequences.

It is true that max_connections can increase the size of the proc array and other resources, which are allocated in the shared buffer, which also means less shared buffer to perform regular data operations. I am sure this is not the only parameter that affects the memory allocation. "max_prepared_xacts" can also affect the shared memory allocation too, so the same warning message applies here as well. Maybe there are other parameters with similar effects.

Instead of stating that higher max_connections results in higher allocation, it may be better to tell the user that if the value needs to be set much higher, consider increasing the "shared_buffers" setting as well.

thank you

-----------------------
Cary Huang
Highgo Software Canada
www.highgo.ca
On Fri, Jan 12, 2024 at 3:15 PM Cary Huang <cary.huang@highgo.ca> wrote:
> I think it is good to warn the user about the increased allocation of memory for certain parameters so that they do not abuse it by setting it to a huge number without knowing the consequences.
>
> It is true that max_connections can increase the size of the proc array and other resources, which are allocated in the shared buffer, which also means less shared buffer to perform regular data operations. I am sure this is not the only parameter that affects the memory allocation. "max_prepared_xacts" can also affect the shared memory allocation too, so the same warning message applies here as well. Maybe there are other parameters with similar effects.
>
> Instead of stating that higher max_connections results in higher allocation, it may be better to tell the user that if the value needs to be set much higher, consider increasing the "shared_buffers" setting as well.
Appreciate the review, Cary.
My goal was to inform the reader that there are implications to setting max_connections higher. I've personally seen a user mindlessly set this to 50k connections, unaware it would cause unintended consequences.
I can add a suggestion for the user to consider increasing shared_buffers in accordance with higher max_connections, but it would be better if there was a "rule of thumb" guideline to go along. I'm open to suggestions.
I can revise with a similar warning in max_prepared_xacts as well.
Sincerely,
Roberto
Re: [DOC] Add detail regarding resource consumption wrt max_connections
From: reid.thompson@crunchydata.com
On Sat, 2024-01-13 at 10:31 -0700, Roberto Mello wrote:
> I can add a suggestion for the user to consider increasing shared_buffers in accordance with higher max_connections, but it would be better if there was a "rule of thumb" guideline to go along. I'm open to suggestions.
>
> I can revise with a similar warning in max_prepared_xacts as well.
Can a "close enough" rule of thumb be calculated from:
postgresql.conf -> log_min_messages = debug3
start postgresql with varying max_connections to get CreateSharedMemoryAndSemaphores() sizes to generate a rough equation
postgresql-12-main.log
max_connections=100
75:2024-01-19 17:04:56.544 EST [2762535] DEBUG: invoking IpcMemoryCreate(size=149110784)
0.149110784GB
max_connections=10000
1203:2024-01-19 17:06:13.502 EST [2764895] DEBUG: invoking IpcMemoryCreate(size=644997120)
0.64499712GB
max_connections=20000
5248:2024-01-19 17:24:27.956 EST [2954550] DEBUG: invoking IpcMemoryCreate(size=1145774080)
1.14577408GB
max_connections=50000
2331:2024-01-19 17:07:27.716 EST [2767079] DEBUG: invoking IpcMemoryCreate(size=2591490048)
2.591490048GB
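Those four data points suggest a roughly linear relationship. A minimal sketch of fitting a rule of thumb to them (the sizes are the ones reported above from one PostgreSQL 12 instance; the resulting slope and intercept are specific to that version, platform, and the rest of its configuration, so treat the numbers as illustrative only):

```python
# Rough least-squares fit of shared memory size vs. max_connections,
# using only the IpcMemoryCreate sizes quoted above from this thread.
# The per-connection cost derived here is an observation from that one
# test setup, not a documented PostgreSQL constant.

# (max_connections, IpcMemoryCreate size in bytes)
samples = [
    (100, 149110784),
    (10000, 644997120),
    (20000, 1145774080),
    (50000, 2591490048),
]

n = len(samples)
sum_x = sum(x for x, _ in samples)
sum_y = sum(y for _, y in samples)
sum_xx = sum(x * x for x, _ in samples)
sum_xy = sum(x * y for x, y in samples)

# Ordinary least-squares slope (bytes per connection) and intercept (base)
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
intercept = (sum_y - slope * sum_x) / n

print(f"~{slope / 1024:.0f} KB of shared memory per connection, "
      f"on top of a ~{intercept / (1024 * 1024):.0f} MB base")
```

On these samples the fit works out to a few tens of kilobytes per connection, which is consistent with the 50000-connection instance needing roughly 2.4 GB more shared memory than the 100-connection one.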
from lines 184-186
$ rg -B28 -A35 'invoking IpcMemoryCreate'
backend/storage/ipc/ipci.c
158-/*
159- * CreateSharedMemoryAndSemaphores
160- * Creates and initializes shared memory and semaphores.
161- *
162- * This is called by the postmaster or by a standalone backend.
163- * It is also called by a backend forked from the postmaster in the
164- * EXEC_BACKEND case. In the latter case, the shared memory segment
165- * already exists and has been physically attached to, but we have to
166- * initialize pointers in local memory that reference the shared structures,
167- * because we didn't inherit the correct pointer values from the postmaster
168- * as we do in the fork() scenario. The easiest way to do that is to run
169- * through the same code as before. (Note that the called routines mostly
170- * check IsUnderPostmaster, rather than EXEC_BACKEND, to detect this case.
171- * This is a bit code-wasteful and could be cleaned up.)
172- */
173-void
174-CreateSharedMemoryAndSemaphores(void)
175-{
176- PGShmemHeader *shim = NULL;
177-
178- if (!IsUnderPostmaster)
179- {
180- PGShmemHeader *seghdr;
181- Size size;
182- int numSemas;
183-
184- /* Compute the size of the shared-memory block */
185- size = CalculateShmemSize(&numSemas);
186: elog(DEBUG3, "invoking IpcMemoryCreate(size=%zu)", size);
187-
188- /*
189- * Create the shmem segment
190- */
191- seghdr = PGSharedMemoryCreate(size, &shim);
192-
193- InitShmemAccess(seghdr);
194-
195- /*
196- * Create semaphores
197- */
198- PGReserveSemaphores(numSemas);
199-
200- /*
201- * If spinlocks are disabled, initialize emulation layer (which
202- * depends on semaphores, so the order is important here).
203- */
204-#ifndef HAVE_SPINLOCKS
205- SpinlockSemaInit();
206-#endif
207- }
208- else
209- {
210- /*
211- * We are reattaching to an existing shared memory segment. This
212- * should only be reached in the EXEC_BACKEND case.
213- */
214-#ifndef EXEC_BACKEND
215- elog(PANIC, "should be attached to shared memory already");
216-#endif
217- }
218-
219- /*
220- * Set up shared memory allocation mechanism
221- */
Re: [DOC] Add detail regarding resource consumption wrt max_connections
From: reid.thompson@crunchydata.com
On Fri, 2024-01-19 at 17:37 -0500, reid.thompson@crunchydata.com wrote:
> On Sat, 2024-01-13 at 10:31 -0700, Roberto Mello wrote:
> >
> > I can add a suggestion for the user to consider increasing
> > shared_buffers in accordance with higher max_connections, but it
> > would be better if there was a "rule of thumb" guideline to go
> > along. I'm open to suggestions.
> >
> > I can revise with a similar warning in max_prepared_xacts as well.
> >
> > Sincerely,
> >
> > Roberto
>
> Can a "close enough" rule of thumb be calculated from:
> postgresql.conf -> log_min_messages = debug3
>
> start postgresql with varying max_connections to get
> CreateSharedMemoryAndSemaphores() sizes to generate a rough equation

or maybe it would be sufficient to advise to set log_min_messages = debug3 on a test DB and start/stop it with varying values of max_connections and look at the differing values in "DEBUG: invoking IpcMemoryCreate(size=...)" log messages for themselves.
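For anyone following that suggestion, pulling the sizes out of the log is a one-regex job. A minimal sketch, assuming the DEBUG3 message format shown earlier in this thread (the sample lines below are copied from those measurements; the exact message wording could differ between versions):

```python
import re

# Extract allocation sizes from "invoking IpcMemoryCreate(size=...)"
# DEBUG lines, as emitted when log_min_messages = debug3.
PATTERN = re.compile(r"invoking IpcMemoryCreate\(size=(\d+)\)")

def shmem_sizes(log_text):
    """Return the IpcMemoryCreate sizes (in bytes) found in a log excerpt."""
    return [int(m.group(1)) for m in PATTERN.finditer(log_text)]

# Sample lines taken from the measurements quoted above
sample = """\
2024-01-19 17:04:56.544 EST [2762535] DEBUG:  invoking IpcMemoryCreate(size=149110784)
2024-01-19 17:06:13.502 EST [2764895] DEBUG:  invoking IpcMemoryCreate(size=644997120)
"""

for size in shmem_sizes(sample):
    print(f"{size} bytes (~{size / 1024**3:.2f} GB)")
```

Running this over logs from restarts with different max_connections values gives the per-setting sizes directly, without manual grepping.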
On Mon, Jan 22, 2024 at 8:58 AM <reid.thompson@crunchydata.com> wrote:
> or maybe it would be sufficient to advise to set log_min_messages =
> debug3 on a test DB and start/stop it with varying values of
> max_connections and look at the differing values in
> DEBUG: invoking IpcMemoryCreate(size=...) log messages for themselves.

I'm of the opinion that advice suggesting DBAs set things to DEBUG3 is unfriendly at best. If you really want to add more, there is an existing unfriendly section of the docs at https://www.postgresql.org/docs/devel/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT that mentions this problem, specifically:

"If PostgreSQL itself is the cause of the system running out of memory, you can avoid the problem by changing your configuration. In some cases, it may help to lower memory-related configuration parameters, particularly shared_buffers, work_mem, and hash_mem_multiplier. In other cases, the problem may be caused by allowing too many connections to the database server itself. In many cases, it may be better to reduce max_connections and instead make use of external connection-pooling software."

I couldn't really find a spot to add in your additional info, but maybe you can find a spot that fits? Or maybe a well written walk-through of this would make for a good wiki page in case people really want to dig in.

In any case, I think Roberto's original language is an improvement over what we have now, so I'd probably recommend just going with that, along with a similar note to max_prepared_xacts, and optionally a pointer to the shared mem section of the docs.

Robert Treat
https://xzilla.net
Hi,

On Fri, Jan 12, 2024 at 10:14:38PM +0000, Cary Huang wrote:
> I think it is good to warn the user about the increased allocation of
> memory for certain parameters so that they do not abuse it by setting
> it to a huge number without knowing the consequences.

Right, and I think it might be useful to log (i.e. at LOG not DEBUG3 level, with a nicer message) the amount of memory we allocate on startup, that is just one additional line per instance lifetime but might be quite useful to admins. Or maybe two lines if we log whether we could allocate it as huge pages or not as well:

|2024-03-08 16:46:13.117 CET [237899] DEBUG:  invoking IpcMemoryCreate(size=145145856)
|2024-03-08 16:46:13.117 CET [237899] DEBUG:  mmap(146800640) with MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory

> It is true that max_connections can increase the size of proc array
> and other resources, which are allocated in the shared buffer, which
> also means less shared buffer to perform regular data operations.

AFAICT, those resources are allocated on top of shared_buffers, i.e. the total allocated memory is shared_buffers + (some resources) * max_connections + (other resources) * other_factors.

> Instead of stating that higher max_connections results in higher
> allocation, It may be better to tell the user that if the value needs
> to be set much higher, consider increasing the "shared_buffers"
> setting as well.

Only if what you say above is true and I am at fault.

Michael
On Fri, Mar 8, 2024 at 10:47 AM Michael Banck <mbanck@gmx.net> wrote:
> Right, and I think it might be useful to log (i.e. at LOG not DEBUG3
> level, with a nicer message) the amount of memory we allocate on
> startup, that is just one additional line per instance lifetime but
> might be quite useful to admins. Or maybe two lines if we log whether we
> could allocate it as huge pages or not as well.

If we were going to add these details (and I very much like the idea), I would advocate that we put it somewhere more permanent than a single log entry at start-up. Given that database up-times easily run months and sometimes years, it is hard to imagine we'd always have access to the log files to figure this out on any actively running systems.

Robert Treat
https://xzilla.net
Hi,

On Sun, Mar 10, 2024 at 09:58:25AM -0400, Robert Treat wrote:
> If we were going to add these details (and I very much like the idea),
> I would advocate that we put it somewhere more permanent than a single
> log entry at start-up. Given that database up-times easily run months
> and sometimes years, it is hard to imagine we'd always have access to
> the log files to figure this out on any actively running systems.

Well actually, those two numbers are already available at runtime, via the shared_memory_size and (from 17 on) huge_pages_status GUCs. So this would be geared at admins that keep logs in long-term storage and want to know what the numbers were a while ago. Maybe it is not that interesting, but I think one or two lines at startup would not hurt.

Michael
On Fri, Mar 8, 2024 at 9:52 AM Robert Treat <rob@xzilla.net> wrote:
> In any case, I think Roberto's original language is an improvement
> over what we have now, so I'd probably recommend just going with that,
> along with a similar note to max_prepared_xacts, and optionally a
> pointer to the shared mem section of the docs.

I agree with this. I don't agree with Cary's statement that if you increase max_connections you should increase shared_buffers as well. That seems situation-dependent to me, and it's also missing Roberto's point, which is that JUST increasing max_connections without doing anything else uses more shared memory.

Similarly, I don't think we need to document a detailed testing procedure, as proposed by Reid. If users want to know exactly how many additional resources are used, they can test; either using the DEBUG3 approach, or perhaps more simply via the pg_shmem_allocations view. But I think it's overkill for us to recommend any specific testing procedure here. Rather, I think that it's entirely appropriate to do what Roberto suggested, which is to say, let users know that they're going to use some extra resources if they increase the setting, and then let them figure out what if anything they want to do about that.

--
Robert Haas
EDB: http://www.enterprisedb.com
On Fri, Mar 22, 2024 at 1:57 PM Robert Haas <robertmhaas@gmail.com> wrote:
> Rather, I think that it's entirely appropriate to do what Roberto
> suggested, which is to say, let users know that they're going to use
> some extra resources if they increase the setting, and then let them
> figure out what if anything they want to do about that.

Considering that, and the lack of further comment, I propose to commit the original patch. Objections?

--
Robert Haas
EDB: http://www.enterprisedb.com
On Wed, May 15, 2024 at 11:14 AM Robert Haas <robertmhaas@gmail.com> wrote:
> Considering that, and the lack of further comment, I propose to commit
> the original patch.
>
> Objections?

I think the only unresolved question in my mind was if we should add a similar note to the original patch to max_prepared_xacts as well; do you intend to do that?

Robert Treat
https://xzilla.net
On Wed, May 15, 2024 at 4:00 PM Robert Treat <rob@xzilla.net> wrote:
> I think the only unresolved question in my mind was if we should add a
> similar note to the original patch to max_prepared_xacts as well; do
> you intend to do that?

I didn't intend to do that. I don't think it would be incorrect to do so, but then we're kind of getting into a slippery slope of trying to label every parameter that increases shared memory usage or any other kind of resource consumption, and there are probably (pulls number out of the air) twenty of those. It seems more worthwhile to mention it for max_connections than the other (deducts one from previous random guess) nineteen because it affects a whole lot more things, like the size of the fsync queue and the size of the lock table, and also because it tends to get set to relatively large values, unlike, for example, autovacuum_max_workers. If you think we should go further than just doing max_connections, then I think we either need to (a) add a note to every single bloody parameter that affects the size of shared memory or (b) prove that the subset where we add such a note have a significantly larger impact than the others where we don't. Do you think we should get into all that?

--
Robert Haas
EDB: http://www.enterprisedb.com
On Wed, May 15, 2024 at 4:05 PM Robert Haas <robertmhaas@gmail.com> wrote:
> If you think we should go further than just doing max_connections, then
> I think we either need to (a) add a note to every single bloody parameter
> that affects the size of shared memory or (b) prove that the subset where
> we add such a note have a significantly larger impact than the others
> where we don't. Do you think we should get into all that?

Nope. Let's do the best bang for the buck improvement and we can see if we get any feedback that indicates more needs to be done.

Robert Treat
https://xzilla.net
On Wed, May 15, 2024 at 4:22 PM Robert Treat <rob@xzilla.net> wrote:
> Nope. Let's do the best bang for the buck improvement and we can see
> if we get any feedback that indicates more needs to be done.

Done.

--
Robert Haas
EDB: http://www.enterprisedb.com