Re: One source of constant annoyance identified

From Markus Wollny
Subject Re: One source of constant annoyance identified
Date
Msg-id 2266D0630E43BB4290742247C8910575014CE2C8@dozer.computec.de
In response to One source of constant annoyance identified  ("Markus Wollny" <Markus.Wollny@computec.de>)
Responses Re: One source of constant annoyance identified  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-general
Okay, now it's even more annoying...

We just upgraded the server from 1GB to 2GB of RAM because it kept
swapping out about 300MB.

I updated postgresql.conf accordingly, setting

max_connections = 190 (this is the minimum we need to satisfy the webservers and backend jobs)
shared_buffers = 60000 (that's 468.75 MB; I chose this value assuming the recommended 25% of RAM)
sort_mem = 80000 (that's 78.125 MB; the recommended value is 4% of RAM, which comes to about 82 MB)

Now the top output, sorted by memory usage, looks like this:

  2:43pm  up  1:34,  3 users,  load average: 3.29, 2.25, 2.08
123 processes: 118 sleeping, 5 running, 0 zombie, 0 stopped
CPU0 states: 60.5% user,  3.1% system,  0.0% nice, 35.4% idle
CPU1 states: 47.3% user, 15.3% system,  0.0% nice, 36.4% idle
CPU2 states: 51.0% user, 17.1% system,  0.0% nice, 31.3% idle
CPU3 states: 75.4% user,  4.0% system,  0.0% nice, 20.0% idle
Mem:  2061560K av, 2054268K used,    7292K free,       0K shrd,   13924K buff
Swap: 2097136K av,    7788K used, 2089348K free                 1825104K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
 1652 postgres   9   0  469M 469M  467M S     0.0 23.3   2:21 postmaster
 2041 postgres   9   0  431M 431M  429M S     1.2 21.4   1:32 postmaster
 1588 postgres   9   0  411M 411M  410M S    15.6 20.4   1:25 postmaster
 1597 postgres   9   0  289M 289M  287M S     0.1 14.3   0:31 postmaster
 1849 postgres   9   0  259M 259M  258M S     0.0 12.9   0:14 postmaster
 2046 postgres   9   0  239M 239M  238M S     0.0 11.9   0:25 postmaster
 1973 postgres   9   0  172M 172M  171M S     0.0  8.5   0:21 postmaster
 2142 postgres   9   0  128M 128M  127M S     0.0  6.3   0:04 postmaster
 2156 postgres   9   0  116M 116M  114M S     0.0  5.7   0:02 postmaster
 1598 postgres   9   0 86548  84M 84752 S     1.7  4.1   2:36 postmaster
 1608 postgres   9   0 60932  59M 59356 S     0.0  2.9   2:09 postmaster
 1582 postgres   9   0 57624  56M 55444 S     0.0  2.7   1:58 postmaster
 1609 postgres   9   0 56408  55M 55164 S     8.3  2.7   2:05 postmaster
 1766 postgres   9   0 45248  44M 43824 S     0.0  2.1   1:09 postmaster
 2139 postgres   9   0 45276  44M 43892 S     0.0  2.1   0:01 postmaster
 2045 postgres   9   0 41500  40M 39820 S     1.7  2.0   2:43 postmaster
 1610 postgres   9   0 41336  40M 40172 S     0.0  2.0   0:53 postmaster
 2044 postgres  14   0 38328  37M 37056 R    25.0  1.8   0:33 postmaster
 1881 postgres   9   0 34936  34M 33676 S    17.2  1.6   0:29 postmaster
 2042 postgres   9   0 33144  32M 31920 S     3.7  1.6   1:13 postmaster
 1679 postgres   9   0 32516  31M 31288 S     0.0  1.5   0:05 postmaster
 1678 postgres   9   0 31996  31M 30812 S     0.0  1.5   0:19 postmaster
 1653 postgres   9   0 29424  28M 28180 S     0.0  1.4   0:05 postmaster
 2048 postgres   9   0 27772  27M 26556 S     0.0  1.3   0:09 postmaster
 1802 postgres   9   0 26676  26M 25504 S     0.0  1.2   0:06 postmaster
 2211 postgres   9   0 25940  25M 24592 S     0.0  1.2   0:02 postmaster
 2047 postgres  14   0 25588  24M 23532 R     8.0  1.2   0:55 postmaster
 2065 postgres  11   0 25584  24M 24404 S     3.9  1.2   0:04 postmaster
 1980 postgres  11   0 24584  24M 22864 S     3.5  1.1   0:07 postmaster
 1872 postgres   9   0 23908  23M 22800 S     0.0  1.1   0:04 postmaster
 2068 postgres   9   0 21352  20M 20188 S     0.0  1.0   0:01 postmaster
 2138 postgres   9   0 20928  20M 19644 S     0.0  1.0   0:06 postmaster
 1983 postgres   9   0 20544  20M 19344 S     0.0  0.9   0:04 postmaster
 2342 postgres   9   0 20352  19M 13772 S     0.0  0.9   0:23 postmaster
 2357 postgres  15   0 20260  19M 18532 R    93.1  0.9   1:34 postmaster
 2204 postgres   9   0 19816  19M 18244 S     0.0  0.9   0:01 postmaster
 2199 postgres   9   0 16840  16M 15452 S     0.0  0.8   0:00 postmaster
 2207 postgres   9   0 16784  16M 15512 S     0.0  0.8   0:00 postmaster
 2050 postgres   9   0 15880  15M 14136 S     0.0  0.7   0:06 postmaster
 2200 postgres   9   0 15568  15M 14080 S     0.0  0.7   0:00 postmaster
 2301 postgres   9   0 15076  14M 13940 S     0.0  0.7   0:00 postmaster
 2236 postgres   9   0 14132  13M 12824 S     0.0  0.6   0:00 postmaster
 2346 postgres   9   0 14080  13M 12972 S     0.0  0.6   0:09 postmaster
 2347 postgres   9   0 14064  13M 12960 S     0.0  0.6   0:09 postmaster
 2205 postgres   9   0 13904  13M 12412 S     0.0  0.6   0:00 postmaster
 2339 postgres  11   0 12660  12M 11448 S    36.1  0.6   0:27 postmaster

which is not at all good...
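
I'm not sure how to read the SHARE column here - if it means what I
think it means, a large part of those 400MB+ figures could just be the
shared buffer segment being counted into every backend's RSS rather
than private memory. Assuming /proc/<pid>/statm reports resident and
shared sizes in 4K pages, something like this should show the private
portion per backend:

for pid in $(pidof postmaster); do
    # /proc/<pid>/statm: size resident shared text lib data dirty (in 4K pages)
    set -- $(cat /proc/$pid/statm)
    echo "pid $pid: resident $(($2 * 4))K shared $(($3 * 4))K private $((($2 - $3) * 4))K"
done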

And the major problem is still to identify the queries that cause these
huge backends. At the moment I suspect that there are several queries
which need a fair bit of RAM - though nowhere near this much - and that
the backends fail to free the memory once those queries have finished.
What puzzles me is that all these 400MB+ backend processes are marked as
idle (in the ps output) or sleeping (in top), so they aren't actually
doing much. But how does a sleeping process end up holding 20% of 2GB of
RAM and 15% of the processing time?
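
The only way I have come up with so far to match a fat backend PID to
whatever it is running is the command-string statistics - if I've read
the docs right, this needs stats_command_string = true in
postgresql.conf (the stats collector itself is already running here),
and then:

psql -U postgres template1 -c \
  "SELECT datname, procpid, usename, current_query FROM pg_stat_activity;"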

Another thing that suggests the backends are simply holding on to memory
for no reason: when I stop the database, the backend processes keep
lingering - a plain kill doesn't help, I have to kill -9 the last of the
lot to get rid of them before restarting the database:

dunkles:/var/lib/pgsql/data/base # /etc/init.d/postgresql stop
Shutting down PostgreSQL
done
dunkles:/var/lib/pgsql/data/base # ps ax|grep post
  863 pts/1    S      0:00 login -- postgres
  885 pts/2    S      0:00 login -- postgres
 1552 pts/0    S      0:01 /opt/pgsql/bin/postmaster -i -D/var/lib/pgsql/data/base
 1553 pts/0    S      0:00 postgres: stats buffer process
 1554 pts/0    S      0:01 postgres: stats collector process
 1650 pts/0    S      0:00 postgres: postgres template1 212.123.109.25 idle
 1681 pts/0    S      0:00 postgres: postgres abo 212.123.109.25 idle
 1682 pts/0    S      0:00 postgres: postgres bluebox 212.123.109.25 idle
 1683 pts/0    S      0:00 postgres: postgres kidszone 212.123.109.25 idle
 1684 pts/0    S      0:00 postgres: postgres mcv 212.123.109.25 idle
 1685 pts/0    S      0:00 postgres: postgres mpo 212.123.109.25 idle
 1686 pts/0    S      0:00 postgres: postgres nzone 212.123.109.25 idle
 1687 pts/0    S      0:00 postgres: postgres pcaction 212.123.109.25 idle
 1688 pts/0    S      0:00 postgres: postgres pcgames 212.123.109.25 idle
 1689 pts/0    S      0:00 postgres: postgres phppgadmin 212.123.109.25 idle
 1690 pts/0    S      0:00 postgres: postgres pszone 212.123.109.25 idle
 1691 pts/0    S      0:00 postgres: postgres saturn 212.123.109.25 idle
 1693 pts/0    S      0:00 postgres: postgres template1 212.123.109.25 idle
 1780 pts/0    S      0:00 postgres: postgres template1 212.123.108.149 idle
 1781 pts/0    S      0:00 postgres: postgres template1 212.123.108.149 idle
 1784 pts/0    S      0:00 postgres: postgres pcgames 212.123.108.149 idle

This has a bit of a Windows look & feel to it - processes that simply
stop responding.
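
If I understand the shutdown modes correctly, the init script only does
a "smart" shutdown (SIGTERM), which waits until every client has
disconnected - and our webservers keep their connections open, so it
waits forever. What I am going to try instead of kill -9 is a fast
shutdown, assuming I've read the pg_ctl docs right:

# fast shutdown: roll back open transactions and exit cleanly,
# instead of waiting for all clients to disconnect
su - postgres -c "/opt/pgsql/bin/pg_ctl stop -D /var/lib/pgsql/data/base -m fast"

# or the equivalent signal straight at the postmaster
# (SIGTERM = smart, SIGINT = fast, SIGQUIT = immediate)
kill -INT $(head -1 /var/lib/pgsql/data/base/postmaster.pid)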

Now I have set back the values from before the RAM upgrade:
shared_buffers = 32768 (256MB) and sort_mem = 51200 (50MB). The limit
for backend size now seems to be somewhere around 265MB (about four to
six of those around at any given time, as always), and top reports:

  3:11pm  up  2:03,  3 users,  load average: 3.24, 1.65, 1.81
163 processes: 156 sleeping, 7 running, 0 zombie, 0 stopped
CPU0 states: 81.3% user, 15.5% system,  0.0% nice,  2.2% idle
CPU1 states: 67.1% user, 26.0% system,  0.0% nice,  6.3% idle
CPU2 states: 69.2% user, 21.0% system,  0.0% nice,  9.3% idle
CPU3 states: 65.4% user, 20.2% system,  0.0% nice, 13.3% idle
Mem:  2061560K av, 2025392K used,   36168K free,       0K shrd,   13108K buff
Swap: 2097136K av,    7732K used, 2089404K free                 1750556K cached

Which isn't quite as bad anymore - although I can't imagine that this is
the optimum performance for this machine.
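
The ~265MB ceiling is suspiciously close to the size of the shared
memory segment with shared_buffers = 32768, which would again point at
top simply counting the shared segment into each backend. I still have
to confirm that with ipcs:

ipcs -m                # the big segment owned by postgres should be the buffer pool
expr 32768 \* 8192     # = 268435456 bytes (~256 MB), plus some overhead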

Now what on earth can I do to get this DB running - and when I say
"running" I don't mean "sort of crawling uphill"? Are there any OS
patches or environment settings I need to consider? Are there known
memory leaks? It just doesn't seem to matter which settings I put into
postgresql.conf; the database eats up all available memory in any case.
I cannot use it for production purposes this way. I am quite desperate
right now - and I am definitely running out of time. Which bits of Linux
(glibc, kernel, whatnot) does PostgreSQL 7.2.1 need, and in which
versions, to run smoothly? As I mentioned before, we don't use any
"fancy" features of the database like foreign keys or triggers (except
one); it's just the basic functionality that seems to be falling down on
us...
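
One thing I am no longer even sure is a real problem: the "used" figure
in top apparently includes the kernel's page cache (the 1.8GB "cached"
at the end of the Swap line), which should be reclaimable. Plugging the
numbers from the first top snapshot into the way free reports them:

free -k
#              total       used       free     shared    buffers     cached
# Mem:       2061560    2054268       7292          0      13924    1825104
# -/+ buffers/cache:     215240    1846320
# Swap:      2097136       7788    2089348
#
# the "-/+ buffers/cache" line is what applications actually use; the
# 1.8GB of "cached" is page cache that the kernel reclaims on demand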

I am willing to try almost anything - but I need to squeeze more
performance out of this thing, and I need some hints on which tools to
use to identify the problem. And unfortunately I need all of this as
soon as possible... Hints and help are very, very much appreciated.
Thank you!

Regards,

Markus



