Re: Debugging shared memory issues on CentOS

From Tom Lane
Subject Re: Debugging shared memory issues on CentOS
Msg-id 3716.1386737676@sss.pgh.pa.us
In response to Debugging shared memory issues on CentOS  (Mack Talcott <mack.talcott@gmail.com>)
Responses Re: Debugging shared memory issues on CentOS  (Mack Talcott <mack.talcott@gmail.com>)
List pgsql-performance
Mack Talcott <mack.talcott@gmail.com> writes:
> I am trying to debug some shared memory issues with Postgres 9.3.1 and
> CentOS release 6.3 (Final).  I have a database machine that probably has
> some misconfigured shared memory settings.  It's getting into 2+ GB of
> swap.  Restarting postgres frees all of the memory, but after a few hours
> of normal usage it will go back into swap.

Are you sure the kernel isn't just swapping out some idle processes
because it feels like it?  These numbers don't exactly look like a
machine under stress:

> top - 09:38:16 up 1 day, 21:21,  3 users,  load average: 0.40, 0.54, 0.45
> Tasks: 253 total,   2 running, 251 sleeping,   0 stopped,   0 zombie
> Cpu(s):  0.7%us,  0.2%sy,  0.0%ni, 97.8%id,  1.2%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:   6998260k total,  6849048k used,   149212k free,      248k buffers
> Swap: 440478516k total,  1981912k used, 438496604k free,  1541356k cached

In particular, you've got 1.5 gig of filesystem cache, so you're hardly
out of memory.  I don't know where the other 5.5 gig of RAM went, but
it doesn't look like postgres is eating it; what else is running on
this box?

These lines look absolutely normal, assuming that you've configured
shared_buffers somewhere in the neighborhood of 1GB:

>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  3534 postgres  20   0 2330m 1.4g 1.1g S  0.0 20.4   1:06.99 postgres: deploy mtalcott 10.222.154.172(53495) idle
>  9143 postgres  20   0 2221m 1.1g 983m S  0.0 16.9   0:14.75 postgres: deploy mtalcott 10.222.154.167(35811) idle
>  6026 postgres  20   0 2341m 1.1g 864m S  0.0 16.4   0:46.56 postgres: deploy mtalcott 10.222.154.167(37110) idle
> 18538 postgres  20   0 2327m 1.1g 865m S  0.0 16.1   2:06.59 postgres: deploy mtalcott 10.222.154.172(47796) idle
>  1575 postgres  20   0 2358m 1.1g 858m S  0.0 15.9   1:41.76 postgres: deploy mtalcott 10.222.154.172(52560) idle

The key thing to realize about that is that the SHR column is *shared*
memory, i.e., all these processes are referencing the same chunk of about 1GB
worth of memory.  The process-specific memory is RES minus SHR, and none
of those processes seem tremendously out of line on that measure.  (Note:
the fact that the SHR values aren't all exactly the same is because top
doesn't count a shared page until the process has physically touched that
page.  Even the guy with 1.1g of SHR might not have touched all of the
shared storage yet.)
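
As a rough illustration of the RES-minus-SHR arithmetic, here is a minimal
Python sketch that applies it to the figures quoted above.  The helper names
(to_mib, private_mib) are made up for this example, and because top rounds
its output (1.1g could be anything from about 1.05g to 1.15g), the results
are only good to within several tens of MiB.

```python
def to_mib(size: str) -> float:
    """Convert a top-style size such as '983m' or '1.1g' to MiB."""
    units = {"k": 1 / 1024, "m": 1.0, "g": 1024.0, "t": 1024.0 * 1024}
    suffix = size[-1].lower()
    if suffix in units:
        return float(size[:-1]) * units[suffix]
    # No suffix: treat the value as plain KiB.
    return float(size) / 1024


def private_mib(res: str, shr: str) -> float:
    """Approximate process-specific memory: RES minus SHR, in MiB."""
    return to_mib(res) - to_mib(shr)


# (PID, RES, SHR) copied from the quoted top output.
BACKENDS = [
    (3534, "1.4g", "1.1g"),
    (9143, "1.1g", "983m"),
    (6026, "1.1g", "864m"),
    (18538, "1.1g", "865m"),
    (1575, "1.1g", "858m"),
]

if __name__ == "__main__":
    for pid, res, shr in BACKENDS:
        print(f"{pid:>6}  private ~ {private_mib(res, shr):4.0f} MiB")
```

On these numbers each backend comes out at roughly 150 to 300 MiB of
process-specific memory, not the 1GB-plus that the RES column alone suggests.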

I'm not sure you have a problem here.  If you do, these figures aren't
showing it.  Having some stuff shoved out to swap is not a problem unless
you have a problem with the swap I/O rate.  You might try watching "vmstat
1" for a while to see if the si/so columns show significant activity.
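
If eyeballing the output gets tedious, the si/so check can be sketched in
Python along these lines.  This assumes the usual Linux vmstat layout, where
a header line names the columns; the function name swap_activity and the
SAMPLE text are invented for the example, not captured from the machine in
question.

```python
def swap_activity(vmstat_text: str) -> list[tuple[int, int]]:
    """Return one (si, so) pair per sample line of vmstat output."""
    si_idx = so_idx = None
    samples = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        # The column-name header tells us where si and so live.
        # vmstat repeats it periodically, so re-read it each time.
        if "si" in fields and "so" in fields:
            si_idx, so_idx = fields.index("si"), fields.index("so")
            continue
        # Skip blank lines, the dashed banner, and anything before the header.
        if si_idx is None or not fields or not fields[0].isdigit():
            continue
        samples.append((int(fields[si_idx]), int(fields[so_idx])))
    return samples


# Made-up sample in the shape of "vmstat 1" output (illustrative values only).
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0 1981912 149212    248 1541356    0    0     5    12  110  220  1  0 98  1  0
 0  0 1981912 148900    248 1541400    0    0     0    40  130  250  1  0 99  0  0
 1  0 1981912 148000    248 1541500   12    0    48     0  150  300  2  1 96  1  0
"""

if __name__ == "__main__":
    pairs = swap_activity(SAMPLE)
    busy = [p for p in pairs if p[0] or p[1]]
    print(f"{len(busy)} of {len(pairs)} samples show swap-in/swap-out activity")
```

Keep in mind that the first sample line vmstat prints is an average since
boot, so it is the later lines that show what is happening right now.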

            regards, tom lane

