Re: Problems with PG 9.3

From: Dhruv Shukla
Subject: Re: Problems with PG 9.3
Msg-id CAFiWeJCUK987NmLHyw42O5kh0B+fSkgvxMNA4TpwaBy_TqEtHA@mail.gmail.com
In reply to: Re: Problems with PG 9.3  (Scott Marlowe <scott.marlowe@gmail.com>)
Responses: Re: Problems with PG 9.3
List: pgsql-admin
Scott,
Thanks for such valuable information. I will look into it and get it debugged.

But now things have really started slowing down. Some programs have started to throw errors like:

ERROR: DB COPY failed: DBD::Pg::db pg_endcopy
failed: ERROR:  out of shared memory
HINT:  You might need to increase max_locks_per_transaction.
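
For context on the HINT above: per the PostgreSQL documentation, the shared lock table is sized for roughly max_locks_per_transaction * (max_connections + max_prepared_transactions) objects, so lowering max_connections also shrinks the lock table unless max_locks_per_transaction is raised. A rough sketch of that arithmetic follows. The value 64 is the documented default for max_locks_per_transaction; the "before" connection count of 1000 is purely a hypothetical figure for illustration, not a setting taken from this thread.

```python
def lock_table_slots(max_locks_per_transaction, max_connections,
                     max_prepared_transactions=0):
    """Approximate number of objects the shared lock table can track."""
    return max_locks_per_transaction * (max_connections + max_prepared_transactions)


# Hypothetical "before" value (1000) versus the 200 connections mentioned below.
slots_before = lock_table_slots(64, 1000)
slots_after = lock_table_slots(64, 200)

print("lock slots with 1000 connections:", slots_before)
print("lock slots with 200 connections: ", slots_after)
```

If a COPY-heavy job touches many tables or partitions in one transaction, the smaller table may be what it is now running into; that is an assumption to verify, not a diagnosis.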


These programs were working before with the higher memory settings. Others are just running slowly, and I mean really slowly, even though netstat shows they are still connected.

Thanks,
Dhruv



On Tue, Aug 26, 2014 at 2:01 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
On Tue, Aug 26, 2014 at 12:22 PM, Dhruv Shukla <dhruvshukla82@gmail.com> wrote:
> It's been 15 hours now since the DB was restarted, and things have started to
> get stuck. Apparently they are taking too long to finish with these settings. Any
> further suggestions?

Troubleshoot it while it's stuck. If your app isn't stopping or
erroring out when it loses its connection, then it's broken and someone
needs to code real error handling into it (or you're using a language
that's fundamentally broken in terms of handling network errors). Especially
because with a lower TCP keepalive the app should be told that the
connection died in < 10 minutes.
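
To make the keepalive point concrete, here is a minimal Python sketch, assuming a Linux client and a plain TCP socket. The helper names and the 60/10/6 values are illustrative assumptions, not settings from this thread. On Linux, an idle connection to a dead peer is dropped after roughly idle + interval * count seconds.

```python
import socket


def keepalive_detection_seconds(idle, interval, count):
    """Approximate time for the kernel to declare an idle, dead peer gone."""
    return idle + interval * count


def enable_keepalive(sock, idle=60, interval=10, count=6):
    """Turn on TCP keepalive; the tuning options are set only where available."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE / TCP_KEEPINTVL / TCP_KEEPCNT are Linux option names;
    # skip any that this platform's socket module does not expose.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
print("dead peer detected after about",
      keepalive_detection_seconds(60, 10, 6), "seconds")
sock.close()
```

For PostgreSQL clients, libpq exposes the same knobs as the connection parameters keepalives_idle, keepalives_interval and keepalives_count, and the server has tcp_keepalives_idle, tcp_keepalives_interval and tcp_keepalives_count. Keepalive only detects a dead peer; the application still has to handle the resulting error.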

So I'm going on the assumption that you're losing the connection. YOU need
to figure out why. Tools like netstat and strace are useful here.
If a backend is crashing, there'll be an entry in the PG logs; if
networking is killing it, then maybe a firewall will have logs; if the
OOM killer is killing it, then the Linux logs on the DB server will say so. Use
tools like sar (part of sysstat), Zabbix, and other monitoring packages to
see if you're running out of RAM and the OOM killer is killing processes.

I assume you've lowered your work_mem etc. to something more
reasonable, like 16MB, and that you restarted the server after
dropping max_connections down to 200. Note that 200 is still far too many,
and you need to look into a pooler to reduce that number to < 2 x
CPU cores. Anything over that is counterproductive and likely to
cause performance issues.
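
The sizing advice above can be sketched as simple arithmetic. This is only an illustration: the function names and the 8-core figure are assumptions, and the memory estimate is a rough upper-bound style calculation. work_mem applies per sort or hash operation, so one query can use several multiples of it.

```python
def max_pool_size(cpu_cores):
    """Largest connection count that stays strictly below 2 x CPU cores."""
    return max(1, 2 * cpu_cores - 1)


def worst_case_work_mem_mb(work_mem_mb, connections, operations_per_query=1):
    """Rough memory use if every connection runs sorts/hashes at full work_mem."""
    return work_mem_mb * connections * operations_per_query


# Hypothetical 8-core database server; 16 MB and 200 come from the advice above.
print("pool size for 8 cores:", max_pool_size(8))
print("work_mem exposure at 200 connections (MB):", worst_case_work_mem_mb(16, 200))
print("work_mem exposure at pooled size (MB):",
      worst_case_work_mem_mb(16, max_pool_size(8)))
```

The point of the comparison is that a pooler caps both CPU contention and the total memory that work_mem can claim at once.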

Using netstat -an, can you find matching connections from the stalled
machine to the DB server? If not, you've lost the network connection. If
there's no obvious cause in the PG or system logs on the DB server, then it's
networking.
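
The netstat check can be scripted. The sketch below assumes the Linux `netstat -an` column layout (proto, recv-q, send-q, local address, foreign address, state) and filters captured output for connections to the database server. The function name, IP addresses and sample rows are hypothetical; 5432 is PostgreSQL's default port.

```python
def connections_to(netstat_output, db_host, db_port=5432):
    """Return (local_address, state) for TCP rows whose foreign address is the DB."""
    target = f"{db_host}:{db_port}"
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows have at least six columns; the header and other rows are skipped.
        if len(fields) >= 6 and fields[0].startswith("tcp") and fields[4] == target:
            matches.append((fields[3], fields[5]))
    return matches


# Hypothetical output captured on the stalled client machine.
sample = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.5:51234          10.0.0.9:5432           ESTABLISHED
tcp        0      0 10.0.0.5:22             10.0.0.7:40022          ESTABLISHED
"""

for local, state in connections_to(sample, "10.0.0.9"):
    print(local, state)
```

Running the same filter on the DB server side (matching on the local address instead) shows whether both ends still agree that the connection exists; an empty result on either side points at a lost connection.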



--
Regards
Dhruv
404-551-2578
