Thread: in defensive of zone_reclaim_mode on linux

in defensive of zone_reclaim_mode on linux

From
Ben Chobot
Date:
Over the last several months, I've seen a lot of grumbling about how zone_reclaim_mode eats babies, kicks puppies, and
basically how you should just turn it off and live happily ever after. I thought I should add a counterexample, because
that advice has not proven very good for us.

Some facts about us:
- postgres 9.3.9
- ubuntu trusty kernels (3.13.0-29-generic #53~precise1-Ubuntu)
- everything in AWS, on 32-core HVM instances with 60GB of ram
- 6GB shared buffers
- mostly simple queries

Generally, this has worked out pretty well for us. However, we've recently added a bunch more load, which, because
we're sharded and each shard has its own user, means we've added more concurrently active users. ("A bunch" = ~300.) We
are big pgBouncer users, but because we also use transaction pooling, pgBouncer can only do so much to reuse existing
connections. (SET ROLE isn't an option.)

The end result is that recently, we've been running a dumb number of backends (between 600 and 1k) - which are
*usually* mostly idle, but there are frequent spikes of activity when dozens of them wake up at once. Even worse, those
spikes tend to also come with connection churn, as pgBouncer tears down existing idle connections to build up new
backends for different users.

So our load would hover under 10 most of the time, then spike to over 100 for a minute or two. Connections would get
refused, the system would freeze up... and then everything would go back to normal. The solution? Turning on
zone_reclaim_mode.
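
For readers who want to see what a given zone_reclaim_mode value means, here is a minimal sketch. It assumes the bit meanings given in the kernel's vm sysctl documentation (1 = zone reclaim on, 2 = reclaim may write out dirty pages, 4 = reclaim may swap pages). The function name decode_zone_reclaim_mode is made up for illustration, and the post does not say which value was actually used.

```shell
#!/bin/sh
# Hypothetical helper: decode a vm.zone_reclaim_mode value into its bits.
# Assumed bit meanings (kernel vm sysctl docs): 1 = reclaim, 2 = write dirty pages, 4 = swap.
decode_zone_reclaim_mode() {
    mode=$1
    if [ "$mode" -eq 0 ]; then
        echo "off"
        return 0
    fi
    out=""
    if [ $((mode & 1)) -ne 0 ]; then out="$out reclaim"; fi
    if [ $((mode & 2)) -ne 0 ]; then out="$out write-dirty"; fi
    if [ $((mode & 4)) -ne 0 ]; then out="$out swap"; fi
    # Strip the leading space left by the first append.
    echo "${out# }"
}

# Show the current setting when the file is readable (Linux only).
if [ -r /proc/sys/vm/zone_reclaim_mode ]; then
    decode_zone_reclaim_mode "$(cat /proc/sys/vm/zone_reclaim_mode)"
fi
```

The setting can be changed at runtime as root with `sysctl -w vm.zone_reclaim_mode=1`. That change does not survive a reboot unless it is also added to /etc/sysctl.conf.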

It appears that connection churn is far more manageable to Linux with zone_reclaim_mode enabled. I suspect that our
dearth of large, complex queries helps us out as well. Regardless, our systems no longer desperately seek free memory
when many idle backends wake up while others are getting torn down and replaced. Babies and puppies rejoice.

Our situation might not apply to you. But if it does, give zone_reclaim_mode a chance. It's not (always) as bad as
others have made it out to be.

Re: in defensive of zone_reclaim_mode on linux

From
Andres Freund
Date:
Hi,

On 2015-09-04 15:37:47 -0700, Ben Chobot wrote:
> So our load would hover under 10 most of the time, then spike to over 100 for a minute or two. Connections would get
> refused, the system would freeze up... and then everything would go back to normal. The solution? Turning on
> zone_reclaim_mode.
>
> It appears that connection churn is far more manageable to Linux with zone_reclaim_mode enabled. I suspect that our
> dearth of large, complex queries helps us out as well. Regardless, our systems no longer desperately seek free memory
> when many idle backends wake up while others are getting torn down and replaced. Babies and puppies rejoice.
>
> Our situation might not apply to you. But if it does, give zone_reclaim_mode a chance. It's not (always) as bad as
> others have made it out to be.

To me that sounds like the negative impact of transparent hugepages
being mitigated to some degree by zone reclaim mode (which'll avoid some
cross-node transfers).

Greetings,

Andres Freund


Re: in defensive of zone_reclaim_mode on linux

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2015-09-04 15:37:47 -0700, Ben Chobot wrote:
>> Our situation might not apply to you. But if it does, give zone_reclaim_mode a chance. It's not (always) as bad as
>> others have made it out to be.

> To me that sounds like the negative impact of transparent hugepages
> being mitigated to some degree by zone reclaim mode (which'll avoid some
> cross-node transfers).

Worth noting here is that Ben is running a 3.13 Linux kernel --- I think
most of the bad rap that zone_reclaim_mode has accumulated comes from
experience with significantly older kernels.  (Which is not to say that
I have heard that the kernel crowd has fixed it.  But maybe they did.)

            regards, tom lane


Re: in defensive of zone_reclaim_mode on linux

From
Ben Chobot
Date:

On Sep 6, 2015, at 4:07 AM, Andres Freund <andres@anarazel.de> wrote:

> To me that sounds like the negative impact of transparent hugepages
> being mitigated to some degree by zone reclaim mode (which'll avoid some
> cross-node transfers).

FWIW:

$ cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
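
In that output the bracketed word is the active setting, so transparent hugepages are disabled ("never") on this machine. As a minimal sketch, the active value can be pulled out of such a line like this. The function name thp_active is hypothetical, and the sample string is just the output shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the bracketed (active) choice from a THP setting line.
thp_active() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Sample input copied from the output above; on a live Linux system one could pass
# "$(cat /sys/kernel/mm/transparent_hugepage/enabled)" instead.
thp_active "always madvise [never]"
```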