Discussion: ipcs, shmmax and shmall - Shared Memory tuning
pgsql-general users -

background: rh6.1 (glibc 2.1.3), SMP p3-600, 512MB RAM, postgres 7.2.1, kernel 2.4.17

My postgres server decided to act up this morning, and I discovered this in the logs -

Apr 26 09:34:16 mito logger: ^IThe Postmaster has informed me that some other backend
Apr 26 09:34:16 mito logger: ^Idied abnormally and possibly corrupted shared memory.
Apr 26 09:34:16 mito logger: ^II have rolled back the current transaction and am
Apr 26 09:39:54 mito logger: ^Igoing to terminate your database system connection and exit.
Apr 26 09:43:16 mito logger: IpcMemoryCreate: shmget(key=5432001, size=137175040, 03600) failed: Invalid argument
Apr 26 09:43:16 mito logger:
Apr 26 09:43:16 mito logger: This error usually means that PostgreSQL's request for a shared memory
Apr 26 09:43:16 mito logger: segment exceeded your kernel's SHMMAX parameter. You can either
Apr 26 09:43:16 mito logger: reduce the request size or reconfigure the kernel with larger SHMMAX.
Apr 26 09:43:16 mito logger: To reduce the request size (currently 137175040 bytes), reduce
Apr 26 09:43:16 mito logger: PostgreSQL's shared_buffers parameter (currently 16000) and/or
Apr 26 09:43:16 mito logger: its max_connections parameter (currently 256).
Apr 26 09:43:16 mito logger:
Apr 26 09:43:16 mito logger: If the request size is already small, it's possible that it is less than
Apr 26 09:43:16 mito logger: your kernel's SHMMIN parameter, in which case raising the request size or
Apr 26 09:43:17 mito logger: reconfiguring SHMMIN is called for.
Apr 26 09:43:17 mito logger:
Apr 26 09:43:17 mito logger: The PostgreSQL Administrator's Guide contains more information about
Apr 26 09:43:17 mito logger: shared memory configuration.
The box has 512MB RAM, and so I set the following per the tuning docs (and some mailing list information) -

echo "201326592" > /proc/sys/kernel/shmmax
/sbin/sysctl -w kernel.shmmax=201326592
echo "201326592" > /proc/sys/kernel/shmall
/sbin/sysctl -w kernel.shmall=201326592

But when I run 'ipcs -l', it lists the shmall (total) as ~800MB. Is this correct, or is there something else I need to do? I've seen some mentions of using 'ipcrm' to clear out stale segments, but I've not seen any pages describing how to do this in a real-world situation.

# ipcs -l

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 196608
** max total shared memory (kbytes) = 805306368 **
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 32
semaphore max value = 32767

------ Messages: Limits --------
max queues system wide = 16
max size of message (bytes) = 8192
default max size of queue (bytes) = 16384

Here's the output from 'ipcs -m' -

------ Shared Memory Segments --------
key        shmid   owner     perms  bytes      nattch  status
0x00000000 0       nobody    600    46084      13      dest
0x00000000 32769   nobody    600    46084      6       dest
0x00000000 65538   nobody    600    46084      3       dest
0x0052e2c1 98307   postgres  600    137175040  28
0x00000000 131076  nobody    600    46084      5       dest

Is this the output I should expect, or am I missing something? I'm not entirely fluent in fine-tuning shared memory, so I certainly appreciate any help you can offer. If you have any other tuning tips for my setup, I'm always open to suggestions.

-d

__________________________________________________
Do You Yahoo!?
Yahoo! Games - play chess, backgammon, pool and more
http://games.yahoo.com/
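A note on the ~800MB (actually ~800M kbytes) figure above: on Linux, kernel.shmall is counted in pages (normally 4096 bytes), not bytes, so a value intended as a byte count gets inflated by the page size. A minimal sketch of the arithmetic, assuming a 4KB page size, which reproduces the `ipcs -l` number exactly:

```shell
# Assuming shmall is counted in pages (PAGE_SIZE = 4096 on x86),
# the value set above yields the "max total shared memory (kbytes)"
# figure that ipcs -l printed:
shmall=201326592
page_size=4096
echo $(( shmall * page_size / 1024 ))   # 805306368 -- matches ipcs -l

# To cap total shared memory at 192MB, shmall would instead be a page count:
echo $(( 201326592 / page_size ))       # 49152
```

So the setting did take effect; it just means something much larger than intended.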
dorian dorian <dorian37076@yahoo.com> writes:
> Apr 26 09:43:16 mito logger: IpcMemoryCreate:
> shmget(key=5432001, size=137175040, 03600) failed: Invalid argument

> ------ Shared Memory Segments --------
> key        shmid  owner     perms  bytes      nattch  status
> 0x0052e2c1 98307  postgres  600    137175040  28

This is very strange. The postmaster should have re-used the existing shmem segment, rather than trying to create a new one as it's evidently doing. Or, if it didn't do that, it should've tried to create a new segment with a different key, not re-use the conflicting key. There's some kind of bug here. Are you up for tracing through IpcMemoryCreate with a debugger to see what's going wrong?

If you just want to get going again, you can remove that segment with ipcrm (I think "ipcrm shm 98307" is the syntax to use on Linux) and then the postmaster should start. But it would be useful to understand the failure mode so we can fix it.

FWIW, I do not see any comparable problem on rh7.2 (2.4.7-10 kernel) --- the postmaster restarts perfectly cleanly after doing a kill -9 on one of its children.

regards, tom lane
--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> dorian dorian <dorian37076@yahoo.com> writes:
> > Apr 26 09:43:16 mito logger: IpcMemoryCreate:
> > shmget(key=5432001, size=137175040, 03600) failed: Invalid argument
>
> > ------ Shared Memory Segments --------
> > key        shmid  owner     perms  bytes      nattch  status
> > 0x0052e2c1 98307  postgres  600    137175040  28
>
> This is very strange. [...] There's some kind of bug here. Are you up
> for tracing through IpcMemoryCreate with a debugger to see what's
> going wrong?

Will this involve any kind of downtime for the server? I'm more than willing to help as long as it doesn't take the box or postgres down again while testing.

> If you just want to get going again, you can remove that segment with
> ipcrm (I think "ipcrm shm 98307" is the syntax to use on Linux) and
> then the postmaster should start.

This was also in the logs -

Apr 26 09:34:16 mito logger: DEBUG: server process (pid 21540) was terminated by signal 9
Apr 26 09:34:16 mito logger: DEBUG: terminating any other active server processes
Apr 26 09:34:16 mito logger: NOTICE: Message from PostgreSQL backend:
Apr 26 09:34:16 mito logger: ^IThe Postmaster has informed me that some other backend
Apr 26 09:34:16 mito logger: ^Idied abnormally and possibly corrupted shared memory.
Apr 26 09:34:16 mito logger: ^II have rolled back the current transaction and am
Apr 26 09:34:16 mito logger: ^Igoing to terminate your database system connection and exit.
Apr 26 09:34:16 mito logger: ^IPlease reconnect to the database system and repeat your query.
Apr 26 09:34:16 mito logger: NOTICE: Message from PostgreSQL backend:
Apr 26 09:34:16 mito logger: ^IThe Postmaster has informed me that some other backend
Apr 26 09:34:16 mito logger: ^Idied abnormally and possibly corrupted shared memory.
Apr 26 09:34:16 mito logger: ^II have rolled back the current transaction and am
Apr 26 09:34:17 mito kernel: Out of Memory: Killed process 21540 (postmaster).
The machine just stopped responding at 9:34 and had to be rebooted. Is there any way to prevent this from happening, via a configuration option in postgres? Thanks very much for all your help!

-d
dorian dorian <dorian37076@yahoo.com> writes: > This was also in the logs - > Apr 26 09:34:17 mito kernel: Out of Memory: Killed > process 21540 (postmaster). Ugh. There's not a lot we can do about the kernel deciding to kill us. > The machine just stopped responding at 9:34 and had to > be rebooted. Is there any way to prevent this from > happening, via a configuration option in postgres? Perhaps you should talk to the kernel developers about why they can't find more graceful ways of dealing with out-of-memory situations :-( I am not sure exactly what Linux considers an out-of-memory situation. If it's dependent on available swap space, then configuring more swap would probably prevent this scenario. If only physical RAM counts, you might need to buy more RAM, or configure Postgres with a smaller shared_buffers value. regards, tom lane
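Regarding the "configure more swap" suggestion, a quick way to see what is actually configured (a sketch; `/proc/meminfo` has this layout on 2.4 and later kernels, and `free`/`swapon` are the usual frontends if installed):

```shell
# Swap totals straight from the kernel:
grep -i '^Swap' /proc/meminfo    # SwapTotal / SwapFree
# Alternatives, if the tools are installed:
# free                           # summary including swap usage
# swapon -s                      # per-device breakdown
```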
On Sun, Apr 28, 2002 at 04:24:19PM -0400, Tom Lane wrote:
> dorian dorian <dorian37076@yahoo.com> writes:
> > This was also in the logs -
> >
> > Apr 26 09:34:17 mito kernel: Out of Memory: Killed
> > process 21540 (postmaster).
>
> Ugh. There's not a lot we can do about the kernel deciding to kill us.

Not good.

> > The machine just stopped responding at 9:34 and had to
> > be rebooted. Is there any way to prevent this from
> > happening, via a configuration option in postgres?
>
> Perhaps you should talk to the kernel developers about why they can't
> find more graceful ways of dealing with out-of-memory situations :-(

It's a bit hard to be more graceful when you have no physical memory and no swap available. Something has to give. And if one process happens to be chewing >90% of memory, the kernel decides it should be the target.

> I am not sure exactly what Linux considers an out-of-memory situation.
> If it's dependent on available swap space, then configuring more swap
> would probably prevent this scenario. If only physical RAM counts,
> you might need to buy more RAM, or configure Postgres with a smaller
> shared_buffers value.

Adding more swap space definitely helps, but if you have a query that just eats a lot of memory, it's better to fix the query...

-- 
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Canada, Mexico, and Australia form the Axis of Nations That
> Are Actually Quite Nice But Secretly Have Nasty Thoughts About America
Martijn van Oosterhout <kleptog@svana.org> writes: > Adding more swap space definitly helps, but if you have a query that just > eats a lot of memory, it's better to fix the query... The problem here is that the *postmaster* is getting killed. It's not the one consuming excess memory (assuming that the underlying problem is a runaway query, which seems plausible). In any case, why is "kill -9 some process" an appropriate behavior? Sane kernels return an error on sbrk(2) if they don't have any more memory to give out... I suppose people who see this happen a lot might consider launching the postmaster as an inittab entry --- if init sees the postmaster die, it should restart it. Although if old backends are still running, this isn't necessarily going to fix anything. (And it seems to me I have heard that the Linux kernel is willing to gun down init too, so relying on init to survive a memory crunch may be wishful thinking.) regards, tom lane
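The inittab approach might look like the fragment below (the `pg` id, runlevels, and paths are hypothetical; adjust for your install, and note that respawn only works if the postmaster runs in the foreground, which it does when launched directly rather than via pg_ctl):

```
# Hypothetical /etc/inittab entry: init restarts the postmaster if it dies.
pg:2345:respawn:/bin/su - postgres -c "/usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/data"
```

After editing /etc/inittab, `telinit q` tells init to reread it.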
On Sun, Apr 28, 2002 at 08:12:56PM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > Adding more swap space definitely helps, but if you have a query that just
> > eats a lot of memory, it's better to fix the query...
>
> The problem here is that the *postmaster* is getting killed. It's not
> the one consuming excess memory (assuming that the underlying problem
> is a runaway query, which seems plausible).

It depends on what version you're running. It used to be that the kernel simply killed whatever process asked for the memory when it ran out. As you point out, occasionally that was init. In the case of the postmaster, it's probably one of accept(), connect() or select() that's running out of memory.

> In any case, why is "kill -9 some process" an appropriate behavior?
> Sane kernels return an error on sbrk(2) if they don't have any more
> memory to give out...

The problem is that sbrk merely extends your memory map; the memory is not actually allocated until it is used, i.e. the kernel is overcommitting memory. The actual running out of memory will occur in a page fault rather than sbrk() failing. This overcommitting is somewhat optional, depending on your OS. As noted above, other system calls also allocate memory, notably select() and poll(), though read() and write() do too.

> I suppose people who see this happen a lot might consider launching the
> postmaster as an inittab entry --- if init sees the postmaster die, it
> should restart it. Although if old backends are still running, this
> isn't necessarily going to fix anything. (And it seems to me I have
> heard that the Linux kernel is willing to gun down init too, so relying
> on init to survive a memory crunch may be wishful thinking.)

The "kill large processes" behaviour is recent; it appeared after people started complaining about init being killed. They should have just told those people "get more swap/buy more memory/fix your program" rather than spend ages debating which process is the right one to kill...
-- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Canada, Mexico, and Australia form the Axis of Nations That > Are Actually Quite Nice But Secretly Have Nasty Thoughts About America
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Sun, Apr 28, 2002 at 08:12:56PM -0400, Tom Lane wrote:
>> Sane kernels return an error on sbrk(2) if they don't have any more
>> memory to give out...

> The problem is that sbrk merely extends your memory map, the memory is not
> actually allocated until it is used, i.e. it's overcommitting memory.

And this is the application's fault?

If Linux overcommits memory, then Linux is broken. Do not bother to argue the point. I shall recommend other Unixen to anyone who wants to run reliable applications. (HPUX, for example, which has plenty of faults but at least keeps track of how much space it can promise.)

regards, tom lane
On Sun, Apr 28, 2002 at 11:47:19PM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > The problem is that sbrk merely extends your memory map, the memory is not
> > actually allocated until it is used, i.e. it's overcommitting memory.
>
> And this is the application's fault?
>
> If Linux overcommits memory, then Linux is broken. Do not bother to
> argue the point. I shall recommend other Unixen to anyone who wants
> to run reliable applications. (HPUX for example; which has plenty of
> faults, but at least it keeps track of how much space it can promise.)

I'm not saying it's a good idea. Indeed, people say all the time that it's bad. But it is the default. If people don't like it, they should turn the overcommit sysctl off (I forget the exact name).

-- 
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Canada, Mexico, and Australia form the Axis of Nations That
> Are Actually Quite Nice But Secretly Have Nasty Thoughts About America
At 02:52 PM 4/29/02 +1000, Martijn van Oosterhout wrote:
>I'm not saying it's a good idea. Indeed, people say all the time that it's
>bad. But it is the default. If people don't like it, they should turn the
>overcommit sysctl off (I forget the exact name).

Can't find anything on it on the Linux doc site (tldp). Any links?

Is it better to turn off overcommit and buy more RAM? That way postgresql or some other process doesn't get killed?

All along I had the impression that Linux had big problems with its VM (since 2.x) with OOM situations (and not so good impressions of the people in charge of it), and so it's mainly because of overcommit? Ack.

Thanks, Link.
----- Original Message -----
From: Tom Lane <tgl@sss.pgh.pa.us>
To: Martijn van Oosterhout <kleptog@svana.org>
Cc: dorian dorian <dorian37076@yahoo.com>; <pgsql-general@postgresql.org>
Sent: Monday, April 29, 2002 9:17 AM
Subject: Re: [GENERAL] ipcs, shmmax and shmall - Shared Memory tuning

> Martijn van Oosterhout <kleptog@svana.org> writes:
> > On Sun, Apr 28, 2002 at 08:12:56PM -0400, Tom Lane wrote:
> >> Sane kernels return an error on sbrk(2) if they don't have any more
> >> memory to give out...
>
> > The problem is that sbrk merely extends your memory map, the memory is not
> > actually allocated until it is used, i.e. it's overcommitting memory.
>
> And this is the application's fault?
>
> If Linux overcommits memory, then Linux is broken. Do not bother to
> argue the point.

IANAKH, but I believe Linux chooses to over-commit because many programs ask for far more RAM than they ever use. It's a bit of a gamble for the kernel to allow memory overcommit, but it pays off most of the time.

I think you can turn it off by placing the line

vm.overcommit_memory = 0

in /etc/sysctl.conf

--Arsalan.
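For the record, the knob is also visible under /proc. A hedged sketch of inspecting it (the accepted values and their exact meaning vary across kernel versions, so treat the comments below as assumptions for the 2.4 era):

```shell
# Current overcommit policy (0 = heuristic overcommit, the usual default):
cat /proc/sys/vm/overcommit_memory

# To change it at runtime (needs root), something like:
# sysctl -w vm.overcommit_memory=1
# and to persist it across reboots, an /etc/sysctl.conf line as above.
```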
On Mon, Apr 29, 2002 at 04:38:17PM +0800, Lincoln Yeoh wrote:
> At 02:52 PM 4/29/02 +1000, Martijn van Oosterhout wrote:
> >I'm not saying it's a good idea. Indeed, people say all the time that it's
> >bad. But it is the default. If people don't like it, they should turn the
> >overcommit sysctl off (I forget the exact name).
>
> Can't find anything on it on the Linux doc site (tldp). Any links?

Somewhere in /proc. It's documented in the kernel source.

> Is it better to turn off overcommit and buy more RAM? That way postgresql
> or some other process doesn't get killed?

All turning off overcommit will do is cause your postgres to die with out-of-memory errors earlier instead. Buying more memory is certainly an option. But it would be better to work out what is actually using the memory. Hypothetically, if one of your users is DoSing your system, buying more memory won't help.

> All along I had the impression that Linux had big problems with its VM (since
> 2.x) with OOM situations (and not so good impressions of the people in
> charge of it), and so it's mainly because of overcommit? Ack.

No, it's mainly because people disagree about what the system should do about it. Overcommit means that people are allowed to allocate stuff without using it, and that's OK. For example, look at the following program:

    while (1)
        malloc(1024000);

With overcommit on, this program is harmless: it will fill up its entire address space and stop. With overcommit off, every other program will start getting out-of-memory errors. What is the solution here? Newer versions of Linux will kill a process like this; I don't know what other OSes do.

Linux's VM problems are unrelated to this. They have to do with system behaviour as sizeof(Working Set) approaches sizeof(Physical Memory). Bad is when the machine swaps continuously; good is when the machine stays reasonably responsive. If you are experiencing this situation often, more memory is necessary.
Hope this clears up any confusion, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Canada, Mexico, and Australia form the Axis of Nations That > Are Actually Quite Nice But Secretly Have Nasty Thoughts About America
So to return to my initial question... Here's a summary of the error message and my settings after the crash -

May 1 13:24:42 mito logger: IpcMemoryCreate: shmget(key=5432001, size=140361728, 03600) failed: Invalid argument
May 1 13:24:42 mito logger: This error usually means that PostgreSQL's request for a shared memory
May 1 13:24:42 mito logger: segment exceeded your kernel's SHMMAX parameter. You can either
May 1 13:24:42 mito logger: reduce the request size or reconfigure the kernel with larger SHMMAX.
May 1 13:24:42 mito logger: To reduce the request size (currently 140361728 bytes), reduce
May 1 13:24:42 mito logger: PostgreSQL's shared_buffers parameter (currently 16384) and/or
May 1 13:24:42 mito logger: its max_connections parameter (currently 256).

# sysctl kernel.shmmax
kernel.shmmax = 201326592
# sysctl kernel.shmall
kernel.shmall = 201326592

I've got 512MB RAM - what am I doing wrong? Postgres consistently crashes (and subsequently takes the machine down) on any heavy abuse. Any other settings I should be looking at?
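One observation on the numbers in that last log: the 140361728-byte request is below the configured shmmax of 201326592, so SHMMAX itself cannot be what shmget is rejecting. On Linux, shmget also returns EINVAL when a segment with the requested key already exists but is smaller than the new request - and here the request grew (shared_buffers went from 16000 to 16384) while the old 137175040-byte segment was presumably still attached. A quick sanity check of the arithmetic, assuming the request is dominated by shared_buffers times the 8KB block size:

```shell
# Sizing sketch: shared_buffers * 8KB accounts for most of the request;
# the remainder is per-connection and bookkeeping overhead.
shared_buffers=16384
echo $(( shared_buffers * 8192 ))   # 134217728 of the 140361728 requested

# The request fits under shmmax, so SHMMAX is not the limit being hit:
request=140361728
shmmax=201326592
[ $request -le $shmmax ] && echo "request fits under shmmax"
```

That points back at the stale-segment problem Tom described earlier (removable with ipcrm) rather than at the sysctl settings.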