Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers

From: Kevin Grittner
Subject: Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
Msg-id: 1379106295.20999.YahooMailNeo@web162901.mail.bf1.yahoo.com
In reply to: Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers  (Andres Freund <andres@2ndquadrant.com>)
Responses: Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers  (Merlin Moncure <mmoncure@gmail.com>)
           Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers  (Andres Freund <andres@2ndquadrant.com>)
List: pgsql-hackers

Andres Freund <andres@2ndquadrant.com> wrote:

> Absolutely not claiming the contrary. I think it sucks that we
> couldn't fully figure out what's happening in detail. I'd love to
> get my hand on a setup where it can be reliably reproduced.

I have seen two completely different causes for symptoms like this,
and I suspect that these aren't the only two.

(1)  The dirty page avalanche: PostgreSQL hangs on to a large
number of dirty buffers and then dumps a lot of them at once.  The
OS does the same.  When PostgreSQL dumps its buffers to the OS it
pushes the OS over a "tipping point" where it is writing dirty
buffers too fast for the controller's BBU cache to absorb them.
Everything freezes until the controller has written enough data to
disk to accept further OS writes.  This can take several minutes,
during which time the database seems "frozen".  The cure is some
combination of these: reduce shared_buffers, make the background
writer more aggressive, checkpoint more often, make the OS dirty
page writing more aggressive, add more BBU RAM to the controller.
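
The tipping point in (1) can be sized with rough arithmetic.  The
sketch below is not from the original message.  It assumes Linux,
where vm.dirty_ratio caps dirty page cache as a percentage of memory,
and it approximates "memory" by total RAM (the kernel actually uses
available memory, so this overestimates).  The 64 GiB RAM and 1 GiB
controller cache figures are hypothetical, and the fallback of 20 for
dirty_ratio is an assumed default.  The helper names are made up for
illustration.

```python
from pathlib import Path


def dirty_limit_bytes(mem_bytes: int, ratio_percent: int) -> int:
    """Approximate dirty page cache allowed at a given vm.dirty_*ratio."""
    return mem_bytes * ratio_percent // 100


def overflow_factor(mem_bytes: int, ratio_percent: int,
                    controller_cache_bytes: int) -> float:
    """How many times larger the OS dirty limit is than the BBU cache."""
    return dirty_limit_bytes(mem_bytes, ratio_percent) / controller_cache_bytes


def read_sysctl_int(name: str, default: int) -> int:
    """Read /proc/sys/vm/<name>, falling back to `default` if unavailable."""
    try:
        return int(Path("/proc/sys/vm", name).read_text().strip())
    except (OSError, ValueError):
        return default


if __name__ == "__main__":
    GIB = 2**30
    ram = 64 * GIB            # hypothetical server RAM
    bbu_cache = 1 * GIB       # hypothetical controller cache
    # Reads 0 on systems that set vm.dirty_bytes instead of the ratio.
    ratio = read_sysctl_int("dirty_ratio", 20)
    factor = overflow_factor(ram, ratio, bbu_cache)
    print(f"dirty_ratio={ratio}%: OS may hold about {factor:.1f}x "
          f"the controller cache in dirty pages")
```

A factor well above 1 suggests that a burst of OS writeback could
outrun the controller cache.  Lowering vm.dirty_background_ratio (or
setting vm.dirty_background_bytes) is one way to make the OS start
writing earlier; on the PostgreSQL side, bgwriter_delay,
bgwriter_lru_maxpages and checkpoint_timeout are likely the knobs
behind "more aggressive" and "more often", though the message does
not name them.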

(2)  Transparent huge page support goes haywire on its defrag work.
Clues on this include very high "system" CPU time during an
episode, and `perf top` shows more time in kernel spinlock
functions than anywhere else.  The database doesn't completely lock
up like with the dirty page avalanche, but it is slow enough that
users often describe it that way.  So far I have only seen this
cured by disabling THP support (in spite of some people urging that
just the defrag be disabled).  It does make me wonder whether there
is something we could do in PostgreSQL to interact better with
THPs.
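
To check whether THP and its defrag are active on a given host, the
current settings can be read from sysfs.  This is a minimal sketch,
not part of the original message.  It assumes a mainline Linux kernel
exposing /sys/kernel/mm/transparent_hugepage/enabled and .../defrag,
each containing the available choices with the active one in square
brackets (for example "[always] madvise never").  Some distribution
kernels use a different directory name, so the base path is a
parameter.  The function names are made up for illustration.

```python
import re
from pathlib import Path
from typing import Dict, Optional

# Assumed mainline location; vendor kernels may differ.
THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")


def active_choice(text: str) -> Optional[str]:
    """Return the bracketed (active) value from a sysfs choice line."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def thp_status(base: Path = THP_DIR) -> Dict[str, Optional[str]]:
    """Report the active 'enabled' and 'defrag' settings, or None if unreadable."""
    status: Dict[str, Optional[str]] = {}
    for name in ("enabled", "defrag"):
        try:
            status[name] = active_choice((base / name).read_text())
        except OSError:
            status[name] = None
    return status


if __name__ == "__main__":
    for name, value in thp_status().items():
        print(f"transparent_hugepage/{name}: {value}")
```

If "enabled" reports anything other than "never" during an episode
with high system CPU time, THP is at least a candidate cause.
Disabling it is done by writing "never" to the same file as root,
which this sketch deliberately does not do.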

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


