Re: Getting out ahead of OOM
From | Joe Conway |
---|---|
Subject | Re: Getting out ahead of OOM |
Date | |
Msg-id | e52f00fc-ffe3-4745-8082-b492f44c6693@joeconway.com |
In response to | Re: Getting out ahead of OOM (Joseph Hammerman <joe.hammerman@datadoghq.com>) |
List | pgsql-admin |
On 3/12/25 18:21, Joseph Hammerman wrote:
> Joe, can you expand on your recommendation to use cgroup-v2? We're
> trying to collect our complete rationale for our request to our internal
> team that is tasked with rolling out this configuration change.

cgroup-v2 has a much better measure of memory pressure (see PSI[1]), better ability to reclaim memory pages[4], safer delegation, and other advantages. When I last looked the kube support for it was still brand new, but it appears to be well supported now [2][3].

In particular, this statement from [3] is important:

    Memory QoS uses memory.high to throttle workload approaching its
    memory limit, ensuring that the system is not overwhelmed by
    instantaneous memory allocation.

With cgroup-v1 a kube memory limit would set memory.limit_in_bytes, and usage of the pod (sum across all processes in the pod cgroup) was tracked with memory.usage_in_bytes. Whenever the latter exceeds the former, the OOM killer will whack the process in the pod cgroup with the highest oom_score, irrespective of how much free memory may be available at the host level.

With cgroup-v2 it appears that kube uses memory.high[5], which is more of a throttle/soft limit. In cgroup-v2 there is also a new memory.max[6], which is essentially the same as what memory.limit_in_bytes was in v1. Exceeding memory.max would invoke the OOM killer, but since kubernetes limits the pod memory with memory.high, the OOM killer should be avoided.

Note that I cannot claim a bunch of hands-on experience with this (cgroup-v2 with kubernetes), so please do your own testing and YMMV, etc.
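For anyone who wants to watch these knobs while testing, here is a minimal Python sketch that reads the cgroup-v2 files mentioned above (memory.current, memory.high, memory.max, and the PSI file memory.pressure). It rests on several assumptions: that the pod's own cgroup is visible at /sys/fs/cgroup from inside the container, that the kernel has PSI enabled, and that the function names (`parse_limit`, `parse_pressure`, `headroom`, `read_cgroup`) are purely illustrative and not part of any library.

```python
from pathlib import Path


def parse_limit(text):
    """Parse memory.high or memory.max: a byte count, or the literal 'max' (no limit)."""
    text = text.strip()
    return None if text == "max" else int(text)


def parse_pressure(text):
    """Parse a PSI file such as memory.pressure.

    Each line looks like:
        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    Returns e.g. {"some": {"avg10": 0.0, ...}, "full": {...}}.
    """
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        kind, *fields = line.split()
        result[kind] = {
            key: float(value)
            for key, value in (field.split("=", 1) for field in fields)
        }
    return result


def headroom(current, limit):
    """Bytes left before a limit is reached, or None if the limit is unset."""
    return None if limit is None else limit - current


def read_cgroup(cgroup_dir="/sys/fs/cgroup"):
    """Collect memory usage, limits, and pressure for one cgroup-v2 directory.

    Assumption: cgroup_dir is a non-root cgroup-v2 directory, since the root
    cgroup does not expose memory.current, memory.high, or memory.max.
    """
    base = Path(cgroup_dir)
    current = int((base / "memory.current").read_text())
    high = parse_limit((base / "memory.high").read_text())
    hard = parse_limit((base / "memory.max").read_text())
    return {
        "current": current,
        "high": high,
        "max": hard,
        "headroom_to_high": headroom(current, high),
        "headroom_to_max": headroom(current, hard),
        "pressure": parse_pressure((base / "memory.pressure").read_text()),
    }
```

Calling `read_cgroup()` from inside a pod during a load test should show whether memory.high is actually set to something other than "max" in your cluster, and whether the "some" and "full" pressure averages climb before usage gets near memory.max.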
[1] https://docs.kernel.org/accounting/psi.html#psi
[2] https://kubernetes.io/docs/concepts/architecture/cgroups/
[3] https://kubernetes.io/docs/concepts/workloads/pods/pod-qos/#memory-qos-with-cgroup-v2
[4] https://docs.kernel.org/admin-guide/cgroup-v2.html#:~:text=memory.reclaim
[5] https://docs.kernel.org/admin-guide/cgroup-v2.html#:~:text=memory.high
[6] https://docs.kernel.org/admin-guide/cgroup-v2.html#:~:text=memory.max

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com