Discussion: icps, shmmax and shmall - Shared Memory tuning

icps, shmmax and shmall - Shared Memory tuning

From
dorian dorian
Date:
pgsql-general users -

background: rh6.1 (glibc 2.1.3), SMP p3-600, 512MB
RAM, postgres 7.2.1, kernel 2.4.17

My postgres server decided to act up this morning, and
I discovered this in the logs -

Apr 26 09:34:16 mito logger: ^IThe Postmaster has
informed me that some other backend
Apr 26 09:34:16 mito logger: ^Idied abnormally and
possibly corrupted shared memory.
Apr 26 09:34:16 mito logger: ^II have rolled back the
current transaction and am
Apr 26 09:39:54 mito logger: ^Igoing to terminate your
database system connection and exit.
Apr 26 09:43:16 mito logger: IpcMemoryCreate:
shmget(key=5432001, size=137175040, 03600) failed:
Invalid
argument
Apr 26 09:43:16 mito logger:
Apr 26 09:43:16 mito logger: This error usually means
that PostgreSQL's request for a shared memory
Apr 26 09:43:16 mito logger: segment exceeded your
kernel's SHMMAX parameter.  You can either
Apr 26 09:43:16 mito logger: reduce the request size
or reconfigure the kernel with larger SHMMAX.
Apr 26 09:43:16 mito logger: To reduce the request
size (currently 137175040 bytes), reduce
Apr 26 09:43:16 mito logger: PostgreSQL's
shared_buffers parameter (currently 16000) and/or
Apr 26 09:43:16 mito logger: its max_connections
parameter (currently 256).
Apr 26 09:43:16 mito logger:
Apr 26 09:43:16 mito logger: If the request size is
already small, it's possible that it is less than
Apr 26 09:43:16 mito logger: your kernel's SHMMIN
parameter, in which case raising the request size or
Apr 26 09:43:17 mito logger: reconfiguring SHMMIN is
called for.
Apr 26 09:43:17 mito logger:
Apr 26 09:43:17 mito logger: The PostgreSQL
Administrator's Guide contains more information about
Apr 26 09:43:17 mito logger: shared memory
configuration.

The box has 512MB RAM, and so I set the following per
the tuning docs (and some mailing list information ) -

echo "201326592" > /proc/sys/kernel/shmmax
/sbin/sysctl -w kernel.shmmax=201326592
echo "201326592" > /proc/sys/kernel/shmall
/sbin/sysctl -w kernel.shmall=201326592

But when I run 'ipcs -l', it lists the shmall (total)
as ~800MB. Is this correct, or is there something else
I need to do? I've seen some mentions of using 'ipcrm'
to clear out stale segments, but I've not seen any
pages describing how to do this in a real-world
situation.

# ipcs -l

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 196608
** max total shared memory (kbytes) = 805306368 **
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 32
semaphore max value = 32767

------ Messages: Limits --------
max queues system wide = 16
max size of message (bytes) = 8192
default max size of queue (bytes) = 16384

Here's the output from 'ipcs -m' -

------ Shared Memory Segments --------
key       shmid     owner     perms     bytes
nattch    status
0x00000000 0         nobody    600       46084     13
      dest
0x00000000 32769     nobody    600       46084     6
      dest
0x00000000 65538     nobody    600       46084     3
      dest
0x0052e2c1 98307     postgres  600       137175040 28

0x00000000 131076    nobody    600       46084     5
      dest

Is this the output I should expect, or am I missing
something? I'm not entirely fluent in fine-tuning
shared memory, so I certainly appreciate any help you
can offer.

If you have any other tuning tips for my setup, I'm
always open to suggestions.

-d
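
A possible reading of the ~800MB figure in the 'ipcs -l' output above: on Linux, kernel.shmall is counted in pages rather than bytes, while kernel.shmmax is in bytes. Assuming a 4096-byte page (typical for x86, but an assumption here), 201326592 pages comes to exactly the 805306368 kbytes that ipcs reports. A minimal Python sketch of the arithmetic follows; the function names are made up for illustration.

```python
PAGE_SIZE = 4096  # assumption: page size in bytes (typical for x86)


def shmall_pages_to_kbytes(pages, page_size=PAGE_SIZE):
    """Convert a shmall value (in pages) to kilobytes, as 'ipcs -l' shows it."""
    return pages * page_size // 1024


def bytes_to_shmall_pages(nbytes, page_size=PAGE_SIZE):
    """Number of whole pages needed to cover nbytes (rounded up)."""
    return -(-nbytes // page_size)


# Value written to kernel.shmall in the message above.
print(shmall_pages_to_kbytes(201326592))

# Pages needed if the intended total was 201326592 bytes (192MB).
print(bytes_to_shmall_pages(201326592))
```

Under the same page-size assumption, a 192MB total would correspond to a shmall of 49152 pages rather than 201326592.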


Re: icps, shmmax and shmall - Shared Memory tuning

From
Tom Lane
Date:
dorian dorian <dorian37076@yahoo.com> writes:
> Apr 26 09:43:16 mito logger: IpcMemoryCreate:
> shmget(key=5432001, size=137175040, 03600) failed: Invalid argument

> ------ Shared Memory Segments --------
> key       shmid     owner     perms     bytes
> nattch    status
> 0x0052e2c1 98307     postgres  600       137175040 28

This is very strange.  The postmaster should have re-used the existing
shmem segment, rather than trying to create a new one as it's evidently
doing.  Or, if it didn't do that, it should've tried to create a new
segment with a different key, not re-use the conflicting key.  There's
some kind of bug here.  Are you up for tracing through IpcMemoryCreate
with a debugger to see what's going wrong?

If you just want to get going again, you can remove that segment with
ipcrm (I think "ipcrm shm 98307" is the syntax to use on Linux) and
then the postmaster should start.  But it would be useful to understand
the failure mode so we can fix it.

FWIW, I do not see any comparable problem on rh7.2 (2.4.7-10 kernel) ---
the postmaster restarts perfectly cleanly after doing a kill -9 on one
of its children.

            regards, tom lane
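
The shmid that Tom passes to ipcrm comes from the 'ipcs -m' listing, and the key column there is the same number as the key in the shmget() message, just printed in hex. A small Python sketch of picking that out follows. It assumes the unwrapped column layout shown in the sample string (modelled on the output quoted in this thread); the function name and the sample are illustrative only, and the exact ipcrm syntax may differ between systems, as Tom notes.

```python
# Sample modelled on the 'ipcs -m' output quoted in the thread, unwrapped.
SAMPLE = """\
key        shmid      owner      perms      bytes      nattch     status
0x00000000 0          nobody     600        46084      13         dest
0x0052e2c1 98307      postgres   600        137175040  28
0x00000000 131076     nobody     600        46084      5          dest
"""


def segments_owned_by(ipcs_output, owner):
    """Return the shared memory segments in ipcs_output that belong to owner."""
    found = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows start with a hex key; the header and blank lines do not.
        if len(fields) >= 6 and fields[0].startswith("0x") and fields[2] == owner:
            found.append({
                "key": int(fields[0], 16),
                "shmid": int(fields[1]),
                "bytes": int(fields[4]),
                "nattch": int(fields[5]),
            })
    return found


for seg in segments_owned_by(SAMPLE, "postgres"):
    # The decimal key matches the key= value in the shmget() error message.
    print(seg["key"], seg["shmid"], seg["nattch"])
```

The nattch column is the number of processes still attached to the segment, which is worth a look before removing anything.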

Re: icps, shmmax and shmall - Shared Memory tuning

From
dorian dorian
Date:
--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> dorian dorian <dorian37076@yahoo.com> writes:
> > Apr 26 09:43:16 mito logger: IpcMemoryCreate:
> > shmget(key=5432001, size=137175040, 03600) failed: Invalid argument
>
> > ------ Shared Memory Segments --------
> > key       shmid     owner     perms     bytes     nattch    status
> > 0x0052e2c1 98307     postgres  600       137175040 28
>
> This is very strange.  [...]  There's
> some kind of bug here.  Are you up for tracing
> through IpcMemoryCreate
> with a debugger to see what's going wrong?

Will this involve any kind of downtime for the server?
I'm more than willing to help as long as it doesn't
take the box or postgres down again while testing.

> If you just want to get going again, you can remove
> that segment with
> ipcrm (I think "ipcrm shm 98307" is the syntax to
> use on Linux) and
> then the postmaster should start.

This was also in the logs -

Apr 26 09:34:16 mito logger: DEBUG:  server process
(pid 21540) was terminated by signal 9
Apr 26 09:34:16 mito logger: DEBUG:  terminating any
other active server processes
Apr 26 09:34:16 mito logger: NOTICE:  Message from
PostgreSQL backend:
Apr 26 09:34:16 mito logger: ^IThe Postmaster has
informed me that some other backend
Apr 26 09:34:16 mito logger: ^Idied abnormally and
possibly corrupted shared memory.
Apr 26 09:34:16 mito logger: ^II have rolled back the
current transaction and am
Apr 26 09:34:16 mito logger: ^Igoing to terminate your
database system connection and exit.
Apr 26 09:34:16 mito logger: ^IPlease reconnect to the
database system and repeat your query.
Apr 26 09:34:16 mito logger: NOTICE:  Message from
PostgreSQL backend:
Apr 26 09:34:16 mito logger: ^IThe Postmaster has
informed me that some other backend
Apr 26 09:34:16 mito logger: ^Idied abnormally and
possibly corrupted shared memory.
Apr 26 09:34:16 mito logger: ^II have rolled back the
current transaction and am
Apr 26 09:34:17 mito kernel: Out of Memory: Killed
process 21540 (postmaster).

The machine just stopped responding at 9:34 and had to
be rebooted. Is there any way to prevent this from
happening, via a configuration option in postgres?

Thanks very much for all your help!

-d



Re: icps, shmmax and shmall - Shared Memory tuning

From
Tom Lane
Date:
dorian dorian <dorian37076@yahoo.com> writes:
> This was also in the logs -

> Apr 26 09:34:17 mito kernel: Out of Memory: Killed
> process 21540 (postmaster).

Ugh.  There's not a lot we can do about the kernel deciding to kill us.

> The machine just stopped responding at 9:34 and had to
> be rebooted. Is there any way to prevent this from
> happening, via a configuration option in postgres?

Perhaps you should talk to the kernel developers about why they can't
find more graceful ways of dealing with out-of-memory situations :-(

I am not sure exactly what Linux considers an out-of-memory situation.
If it's dependent on available swap space, then configuring more swap
would probably prevent this scenario.  If only physical RAM counts,
you might need to buy more RAM, or configure Postgres with a smaller
shared_buffers value.

            regards, tom lane

Re: icps, shmmax and shmall - Shared Memory tuning

From
Martijn van Oosterhout
Date:
On Sun, Apr 28, 2002 at 04:24:19PM -0400, Tom Lane wrote:
> dorian dorian <dorian37076@yahoo.com> writes:
> > This was also in the logs -
>
> > Apr 26 09:34:17 mito kernel: Out of Memory: Killed
> > process 21540 (postmaster).
>
> Ugh.  There's not a lot we can do about the kernel deciding to kill us.

Not good.

> > The machine just stopped responding at 9:34 and had to
> > be rebooted. Is there any way to prevent this from
> > happening, via a configuration option in postgres?
>
> Perhaps you should talk to the kernel developers about why they can't
> find more graceful ways of dealing with out-of-memory situations :-(

It's a bit hard to be more graceful when you have no physical memory and no
swap available. Something has to give. And if one process happens to be
chewing >90% of memory, the kernel decides it should be the target.

> I am not sure exactly what Linux considers an out-of-memory situation.
> If it's dependent on available swap space, then configuring more swap
> would probably prevent this scenario.  If only physical RAM counts,
> you might need to buy more RAM, or configure Postgres with a smaller
> shared_buffers value.

Adding more swap space definitely helps, but if you have a query that just
eats a lot of memory, it's better to fix the query...
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Canada, Mexico, and Australia form the Axis of Nations That
> Are Actually Quite Nice But Secretly Have Nasty Thoughts About America

Re: icps, shmmax and shmall - Shared Memory tuning

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> Adding more swap space definitely helps, but if you have a query that just
> eats a lot of memory, it's better to fix the query...

The problem here is that the *postmaster* is getting killed.  It's not
the one consuming excess memory (assuming that the underlying problem
is a runaway query, which seems plausible).

In any case, why is "kill -9 some process" an appropriate behavior?
Sane kernels return an error on sbrk(2) if they don't have any more
memory to give out...

I suppose people who see this happen a lot might consider launching the
postmaster as an inittab entry --- if init sees the postmaster die, it
should restart it.  Although if old backends are still running, this
isn't necessarily going to fix anything.  (And it seems to me I have
heard that the Linux kernel is willing to gun down init too, so relying
on init to survive a memory crunch may be wishful thinking.)

            regards, tom lane

Re: icps, shmmax and shmall - Shared Memory tuning

From
Martijn van Oosterhout
Date:
On Sun, Apr 28, 2002 at 08:12:56PM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > Adding more swap space definitely helps, but if you have a query that just
> > eats a lot of memory, it's better to fix the query...
>
> The problem here is that the *postmaster* is getting killed.  It's not
> the one consuming excess memory (assuming that the underlying problem
> is a runaway query, which seems plausible).

It depends on what version you're running. It used to be that it simply
killed whatever process asked for the memory when it ran out. As you point
out, occasionally that was init. In the case of the postmaster, it's
probably one of accept(), connect() or select() that's running out of
memory.

> In any case, why is "kill -9 some process" an appropriate behavior?
> Sane kernels return an error on sbrk(2) if they don't have any more
> memory to give out...

The problem is that sbrk merely extends your memory map, the memory is not
actually allocated until it is used, i.e. it's overcommitting memory. The
actual running out of memory will occur in a page fault rather than sbrk()
failing. This overcommitting is somewhat optional, depending on your OS. As
noted above, other system calls also allocate memory, notably select() and
poll(), though read() and write() do too.

> I suppose people who see this happen a lot might consider launching the
> postmaster as an inittab entry --- if init sees the postmaster die, it
> should restart it.  Although if old backends are still running, this
> isn't necessarily going to fix anything.  (And it seems to me I have
> heard that the Linux kernel is willing to gun down init too, so relying
> on init to survive a memory crunch may be wishful thinking.)

The "kill large processes" behaviour is recent; it came in when people started
complaining about init being killed. They should have just told these people
"get more swap/buy more memory/fix your program" rather than spend ages
debating which process is the right one to kill...
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Canada, Mexico, and Australia form the Axis of Nations That
> Are Actually Quite Nice But Secretly Have Nasty Thoughts About America

Re: icps, shmmax and shmall - Shared Memory tuning

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Sun, Apr 28, 2002 at 08:12:56PM -0400, Tom Lane wrote:
>> Sane kernels return an error on sbrk(2) if they don't have any more
>> memory to give out...

> The problem is that sbrk merely extends your memory map, the memory is not
> actually allocated until it is used, i.e. it's overcommitting memory.

And this is the application's fault?

If Linux overcommits memory, then Linux is broken.  Do not bother to
argue the point.  I shall recommend other Unixen to anyone who wants
to run reliable applications.  (HPUX for example; which has plenty of
faults, but at least it keeps track of how much space it can promise.)

            regards, tom lane

Re: icps, shmmax and shmall - Shared Memory tuning

From
Martijn van Oosterhout
Date:
On Sun, Apr 28, 2002 at 11:47:19PM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > The problem is that sbrk merely extends your memory map, the memory is not
> > actually allocated until it is used, i.e. it's overcommitting memory.
>
> And this is the application's fault?
>
> If Linux overcommits memory, then Linux is broken.  Do not bother to
> argue the point.  I shall recommend other Unixen to anyone who wants
> to run reliable applications.  (HPUX for example; which has plenty of
> faults, but at least it keeps track of how much space it can promise.)

I'm not saying it's a good idea. Indeed, people say all the time that it's
bad. But it is the default. If people don't like it, they should set the
over_commit sysctl off (I forget the exact name).
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Canada, Mexico, and Australia form the Axis of Nations That
> Are Actually Quite Nice But Secretly Have Nasty Thoughts About America

Re: icps, shmmax and shmall - Shared Memory tuning

From
Lincoln Yeoh
Date:
At 02:52 PM 4/29/02 +1000, Martijn van Oosterhout wrote:
>I'm not saying it's a good idea. Indeed, people say all the time that it's
>bad. But it is the default. If people don't like it, they should set the
>over_commit sysctl off (I forget the exact name).

Can't find anything on it on the Linux doc site (tldp). Any links?

Is it better to turn off overcommit and buy more RAM? That way postgresql
or some other process doesn't get killed?

All along I had the impression that Linux had big probs with its VM (since
2.x) with OOM situations (and not so good impressions of the people in
charge of it), and so it's mainly because of overcommit? Ack.

Thanks,
Link.


Re: icps, shmmax and shmall - Shared Memory tuning

From
"Arsalan Zaidi"
Date:
----- Original Message -----
From: Tom Lane <tgl@sss.pgh.pa.us>
To: Martijn van Oosterhout <kleptog@svana.org>
Cc: dorian dorian <dorian37076@yahoo.com>; <pgsql-general@postgresql.org>
Sent: Monday, April 29, 2002 9:17 AM
Subject: Re: [GENERAL] icps, shmmax and shmall - Shared Memory tuning


> Martijn van Oosterhout <kleptog@svana.org> writes:
> > On Sun, Apr 28, 2002 at 08:12:56PM -0400, Tom Lane wrote:
> >> Sane kernels return an error on sbrk(2) if they don't have any more
> >> memory to give out...
>
> > The problem is that sbrk merely extends your memory map, the memory is not
> > actually allocated until it is used, i.e. it's overcommitting memory.
>
> And this is the application's fault?
>
> If Linux overcommits memory, then Linux is broken.  Do not bother to
> argue the point.

IANAKH, but I believe Linux chooses to over-commit because many programs ask
for far more RAM than they ever use. It's a bit of a gamble for the kernel
to allow memory overcommit, but it pays off most of the time.

I think you can turn it off by placing the line

vm.overcommit_memory = 0

in /etc/sysctl.conf

--Arsalan.
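
One caveat on the value suggested above, offered tentatively: in the kernel documentation for later Linux versions, vm.overcommit_memory = 0 selects the default heuristic mode rather than switching overcommit off, 1 means always overcommit, and 2 is the strict no-overcommit mode. Mode 2 may not exist on the 2.4 kernel discussed in this thread. The Python sketch below just reads and labels the current setting; the function names and the wording of the labels are illustrative.

```python
# Mode meanings as described in later kernel documentation (an assumption
# for the 2.4.17 kernel in this thread, where mode 2 may be absent).
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def describe_overcommit(raw):
    """Turn the raw contents of the sysctl file into a readable label."""
    mode = int(raw.strip())
    return OVERCOMMIT_MODES.get(mode, "unknown mode %d" % mode)


def read_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Return a label for the running kernel's setting, or None if unreadable."""
    try:
        with open(path) as f:
            return describe_overcommit(f.read())
    except OSError:
        return None


print(read_overcommit())
```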



Re: icps, shmmax and shmall - Shared Memory tuning

From
Martijn van Oosterhout
Date:
On Mon, Apr 29, 2002 at 04:38:17PM +0800, Lincoln Yeoh wrote:
> At 02:52 PM 4/29/02 +1000, Martijn van Oosterhout wrote:
> >I'm not saying it's a good idea. Indeed, people say all the time that it's
> >bad. But it is the default. If people don't like it, they should set the
> >over_commit sysctl off (I forget the exact name).
>
> Can't find anything on it on the Linux doc site (tldp). Any links?

Somewhere in /proc. It's documented in the kernel source.

> Is it better to turn off overcommit and buy more RAM? That way postgresql
> or some other process doesn't get killed?

All turning off overcommit will do is cause your postgres to die with out of
memory earlier instead. Buying more memory is certainly an option. But it
would be better to work out what is actually using the memory. Hypothetically,
if one of your users is DoSing your system, buying more memory won't help.

> All along I had the impression that Linux had big probs with its VM (since
> 2.x) with OOM situations (and not so good impressions of the people in
> charge of it), and so it's mainly because of overcommit? Ack.

No, it's mainly because people disagree about what the system should do about
it. Overcommit means that people are allowed to allocate stuff without using
it and that's ok. For example, look at the following program:

while(1)
  malloc(1024000);

With overcommit on, this program is harmless. The program will fill up its
entire address space and stop. With overcommit off, every other program will
start getting out of memory errors. What is the solution here? Newer
versions of Linux will kill a process like this, I don't know what other
OSes do.

Linux's VM problems are unrelated to this. They have to do with system
behaviour as sizeof(Working Set) approaches sizeof(Physical Memory). It is bad
if the machine swaps continuously, good if it stays reasonably responsive.
If you are experiencing this situation often, more memory is necessary.

Hope this clears up any confusion,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Canada, Mexico, and Australia form the Axis of Nations That
> Are Actually Quite Nice But Secretly Have Nasty Thoughts About America

Re: icps, shmmax and shmall - Shared Memory tuning

From
dorian dorian
Date:
So to return to my initial question.. Here's a summary
of the error message and my settings after the crash -

May  1 13:24:42 mito logger: IpcMemoryCreate:
shmget(key=5432001, size=140361728, 03600) failed:
Invalid argument
May  1 13:24:42 mito logger: This error usually means
that PostgreSQL's request for a shared memory
May  1 13:24:42 mito logger: segment exceeded your
kernel's SHMMAX parameter.  You can either
May  1 13:24:42 mito logger: reduce the request size
or reconfigure the kernel with larger SHMMAX.
May  1 13:24:42 mito logger: To reduce the request
size (currently 140361728 bytes), reduce
May  1 13:24:42 mito logger: PostgreSQL's
shared_buffers parameter (currently 16384) and/or
May  1 13:24:42 mito logger: its max_connections
parameter (currently 256).

# sysctl kernel.shmmax
kernel.shmmax = 201326592

# sysctl kernel.shmall
kernel.shmall = 201326592

I've got 512MB RAM - what am I doing wrong? Postgres
consistently crashes (and subsequently takes the
machine down) on any heavy abuse.

Any other settings I should be looking at?
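
As a rough cross-check of the numbers in the message above, the request size can be compared with the reported limit. The Python sketch below assumes the default 8192-byte PostgreSQL block size, which is not stated in the thread; the function names are illustrative. On those assumptions the buffer pool accounts for 134217728 bytes of the request, the remaining 6144000 bytes being other shared structures, and the request is below the kernel.shmmax value that sysctl reports.

```python
BLOCK_SIZE = 8192  # assumption: default PostgreSQL block size (BLCKSZ), in bytes


def buffer_pool_bytes(shared_buffers, block_size=BLOCK_SIZE):
    """Bytes taken by the buffer pool alone, ignoring other shared structures."""
    return shared_buffers * block_size


def fits_under_shmmax(request_bytes, shmmax_bytes):
    """True if a single shmget() request is within the per-segment limit."""
    return request_bytes <= shmmax_bytes


request = 140361728  # size= from the May 1 shmget() message
shmmax = 201326592   # kernel.shmmax as reported by sysctl
pool = buffer_pool_bytes(16384)  # shared_buffers from the same message

print(pool)                                # buffer pool share of the request
print(request - pool)                      # remainder of the request
print(fits_under_shmmax(request, shmmax))  # is the request within the limit?
```

If that holds, the SHMMAX value as reported would not by itself explain the failure. Two things that might be worth checking, both only guesses from the information given: whether the limit was already in effect when the postmaster started (values written under /proc/sys do not survive a reboot unless they are also set at boot), and whether an older segment with the same key was still present, as discussed earlier in the thread.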

