Discussion: failed NUMA pages inquiry status: Operation not permitted
> src/test/regress/expected/numa.out   | 13 +++
> src/test/regress/expected/numa_1.out |  5 +

numa_1.out is catching this error:

ERROR: libnuma initialization failed or NUMA is not supported on this platform

This is what I'm getting when running PG18 in docker on Debian trixie
(libnuma 2.0.19).

However, on older distributions, the error is different:

postgres=# select * from pg_shmem_allocations_numa;
ERROR: XX000: failed NUMA pages inquiry status: Operation not permitted
LOCATION: pg_get_shmem_allocations_numa, shmem.c:691

This makes the numa regression tests fail in Docker on Debian bookworm
(libnuma 2.0.16) and older, and on all of the Ubuntu LTS releases.

The attached patch makes it accept these errors, but perhaps it would
be better to detect it in pg_numa_available().

Christoph
Attachments
On 10/16/25 13:38, Christoph Berg wrote:
>> src/test/regress/expected/numa.out   | 13 +++
>> src/test/regress/expected/numa_1.out |  5 +
>
> numa_1.out is catching this error:
>
> ERROR: libnuma initialization failed or NUMA is not supported on this platform
>
> This is what I'm getting when running PG18 in docker on Debian trixie
> (libnuma 2.0.19).
>
> However, on older distributions, the error is different:
>
> postgres=# select * from pg_shmem_allocations_numa;
> ERROR: XX000: failed NUMA pages inquiry status: Operation not permitted
> LOCATION: pg_get_shmem_allocations_numa, shmem.c:691
>
> This makes the numa regression tests fail in Docker on Debian bookworm
> (libnuma 2.0.16) and older, and on all of the Ubuntu LTS releases.

It's probably more about the kernel version. What kernels are used by
these systems?

> The attached patch makes it accept these errors, but perhaps it would
> be better to detect it in pg_numa_available().

Not sure how that would work. It seems this is some sort of permission
check in numa_move_pages, which is not what pg_numa_available does. Also,
it may depend on the page queried (e.g. whether it's exclusive or shared
by multiple processes).

thanks

--
Tomas Vondra
Re: Tomas Vondra
> It's probably more about the kernel version. What kernels are used by
> these systems?

It's the very same kernel, just different docker containers on the
same system. I did not investigate yet where the problem is coming
from, different libnuma versions seemed like the best bet.

Same (differing) results on both these systems:

Linux turing 6.16.7+deb14-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.16.7-1 (2025-09-11) x86_64 GNU/Linux
Linux jenkins 6.1.0-39-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.148-1 (2025-08-26) x86_64 GNU/Linux

> Not sure how that would work. It seems this is some sort of permission
> check in numa_move_pages, which is not what pg_numa_available does. Also,
> it may depend on the page queried (e.g. whether it's exclusive or shared
> by multiple processes).

It's probably the lack of some process capability in that environment.
Maybe there is a way to query that, but I don't know much about that
yet.

Christoph
Re: To Tomas Vondra
> It's the very same kernel, just different docker containers on the
> same system. I did not investigate yet where the problem is coming
> from, different libnuma versions seemed like the best bet.

numactl shows the problem already:

Host system:

$ numactl --show
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
cpubind: 0
nodebind: 0
membind: 0
preferred:

debian:trixie-slim container:

$ numactl --show
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
No NUMA support available on this system.

debian:bookworm-slim container:

$ numactl --show
get_mempolicy: Operation not permitted
get_mempolicy: Operation not permitted
get_mempolicy: Operation not permitted
get_mempolicy: Operation not permitted
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
cpubind: 0
nodebind: 0
membind: 0
preferred:

Running with sudo does not change the result.

So maybe all that's needed is a get_mempolicy() call in
pg_numa_available()?

Christoph
Re: To Tomas Vondra
> So maybe all that's needed is a get_mempolicy() call in
> pg_numa_available()?

Or perhaps give up on pg_numa_available, and just have two _1.out and
_2.out that just contain the two different error messages, without
trying to catch the problem.

Christoph
On 10/16/25 16:54, Christoph Berg wrote:
> Re: Tomas Vondra
>> It's probably more about the kernel version. What kernels are used by
>> these systems?
>
> It's the very same kernel, just different docker containers on the
> same system. I did not investigate yet where the problem is coming
> from, different libnuma versions seemed like the best bet.
>
> Same (differing) results on both these systems:
> Linux turing 6.16.7+deb14-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.16.7-1 (2025-09-11) x86_64 GNU/Linux
> Linux jenkins 6.1.0-39-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.148-1 (2025-08-26) x86_64 GNU/Linux

Hmmm. Those seem like relatively recent kernels.

>> Not sure how that would work. It seems this is some sort of permission
>> check in numa_move_pages, which is not what pg_numa_available does. Also,
>> it may depend on the page queried (e.g. whether it's exclusive or shared
>> by multiple processes).
>
> It's probably the lack of some process capability in that environment.
> Maybe there is a way to query that, but I don't know much about that
> yet.

The move_pages() manpage mentions PTRACE_MODE_READ_REALCREDS (man ptrace),
so maybe that's it.

--
Tomas Vondra
> So maybe all that's needed is a get_mempolicy() call in
> pg_numa_available() ?
numactl 2.0.19 --show does this:
	if (numa_available() < 0) {
		show_physcpubind();
		printf("No NUMA support available on this system.\n");
		exit(1);
	}
int numa_available(void)
{
	if (get_mempolicy(NULL, NULL, 0, 0, 0) < 0 && (errno == ENOSYS || errno == EPERM))
		return -1;
	return 0;
}
pg_numa_available is already calling numa_available.
But numactl 2.0.16 has this:
int numa_available(void)
{
	if (get_mempolicy(NULL, NULL, 0, 0, 0) < 0 && errno == ENOSYS)
		return -1;
	return 0;
}
... which is not catching the "permission denied" error I am seeing.
So maybe PG should implement numa_available itself like that. (Or
accept the output difference so the regression tests are passing.)
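(For illustration, an EPERM-aware check along those lines, mirroring the
numactl 2.0.19 code quoted above, could look roughly like this. It is only
a sketch, not the actual pg_numa.c code, and the function name is made up.)

#include <errno.h>
#include <numaif.h>		/* get_mempolicy() */
#include <stdbool.h>

/*
 * Sketch: treat both "syscall not implemented" (ENOSYS) and "operation
 * not permitted" (EPERM, e.g. in seccomp-restricted containers) as
 * "NUMA not usable".
 */
static bool
pg_numa_usable(void)
{
	if (get_mempolicy(NULL, NULL, 0, 0, 0) < 0 &&
		(errno == ENOSYS || errno == EPERM))
		return false;
	return true;
}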
Christoph
On 10/16/25 17:19, Christoph Berg wrote:
>> So maybe all that's needed is a get_mempolicy() call in
>> pg_numa_available()?
>
> ...
>
> So maybe PG should implement numa_available itself like that. (Or
> accept the output difference so the regression tests are passing.)

I'm not sure which of those options is better. I'm a bit worried just
accepting the alternative output would hide some failures in the future
(although it's a low risk).

So I'm leaning to adjust pg_numa_init() to also check EPERM, per the
attached patch. It still calls numa_available(), so that we don't
silently miss future libnuma changes.

Can you check this makes it work inside the docker container?

regards

--
Tomas Vondra
Attachments
Re: To Tomas Vondra
> So maybe PG should implement numa_available itself like that.

Following our discussion at pgconf.eu last week, I just implemented that.
The numa and pg_buffercache tests pass in Docker on Debian bookworm now.

Christoph
Attachments
Re: Tomas Vondra
> So I'm leaning to adjust pg_numa_init() to also check EPERM, per the
> attached patch. It still calls numa_available(), so that we don't
> silently miss future libnuma changes.
>
> Can you check this makes it work inside the docker container?

Yes, your patch works. (Sorry I meant to test earlier, but RL...)

Christoph
On 11/14/25 13:52, Christoph Berg wrote:
> Re: Tomas Vondra
>> So I'm leaning to adjust pg_numa_init() to also check EPERM, per the
>> attached patch. It still calls numa_available(), so that we don't
>> silently miss future libnuma changes.
>>
>> Can you check this makes it work inside the docker container?
>
> Yes, your patch works. (Sorry I meant to test earlier, but RL...)

Thanks. I've pushed the fix (and backpatched to 18).

regards

--
Tomas Vondra
Re: Tomas Vondra
>>> So I'm leaning to adjust pg_numa_init() to also check EPERM, per the
>>> attached patch. It still calls numa_available(), so that we don't
>>> silently miss future libnuma changes.
>>>
>>> Can you check this makes it work inside the docker container?
>>
>> Yes, your patch works. (Sorry I meant to test earlier, but RL...)
>
> Thanks. I've pushed the fix (and backpatched to 18).

It looks like we are not done here yet :(

postgresql-18 is failing here intermittently with this diff:

12:20:24 --- /build/reproducible-path/postgresql-18-18.1/src/test/regress/expected/numa.out	2025-11-10 21:52:06.000000000+0000
12:20:24 +++ /build/reproducible-path/postgresql-18-18.1/build/src/test/regress/results/numa.out	2025-12-11 11:20:22.618989603+0000
12:20:24 @@ -6,8 +6,4 @@
12:20:24  -- switch to superuser
12:20:24  \c -
12:20:24  SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa;
12:20:24 - ok
12:20:24 -----
12:20:24 - t
12:20:24 -(1 row)
12:20:24 -
12:20:24 +ERROR: invalid NUMA node id outside of allowed range [0, 0]: -2

That's REL_18_STABLE @ 580b5c, with the Debian packaging on top.

I've seen it on unstable/amd64, unstable/arm64, and Ubuntu
questing/amd64, where libnuma should take care of this itself, without
the extra patch in PG. There was another case on bullseye/amd64 which
has the old libnuma.

It's been frequent enough so it killed 4 out of the 10 builds
currently visible on
https://jengus.postgresql.org/job/postgresql-18-binaries-snapshot/.
(Though to be fair, only one distribution/arch combination was failing
for each of them.)

There is also one instance of it in
https://jengus.postgresql.org/job/postgresql-19-binaries-snapshot/

I currently have no idea what's happening.

Christoph
On 12/11/25 13:29, Christoph Berg wrote:
> Re: Tomas Vondra
>>>> So I'm leaning to adjust pg_numa_init() to also check EPERM, per the
>>>> attached patch. It still calls numa_available(), so that we don't
>>>> silently miss future libnuma changes.
>>>>
>>>> Can you check this makes it work inside the docker container?
>>>
>>> Yes, your patch works. (Sorry I meant to test earlier, but RL...)
>>
>> Thanks. I've pushed the fix (and backpatched to 18).
>
> It looks like we are not done here yet :(
>
> postgresql-18 is failing here intermittently with this diff:
>
> 12:20:24 --- /build/reproducible-path/postgresql-18-18.1/src/test/regress/expected/numa.out	2025-11-10 21:52:06.000000000+0000
> 12:20:24 +++ /build/reproducible-path/postgresql-18-18.1/build/src/test/regress/results/numa.out	2025-12-11 11:20:22.618989603+0000
> 12:20:24 @@ -6,8 +6,4 @@
> 12:20:24  -- switch to superuser
> 12:20:24  \c -
> 12:20:24  SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa;
> 12:20:24 - ok
> 12:20:24 -----
> 12:20:24 - t
> 12:20:24 -(1 row)
> 12:20:24 -
> 12:20:24 +ERROR: invalid NUMA node id outside of allowed range [0, 0]: -2
>
> That's REL_18_STABLE @ 580b5c, with the Debian packaging on top.
>
> I've seen it on unstable/amd64, unstable/arm64, and Ubuntu
> questing/amd64, where libnuma should take care of this itself, without
> the extra patch in PG. There was another case on bullseye/amd64 which
> has the old libnuma.
>
> It's been frequent enough so it killed 4 out of the 10 builds
> currently visible on
> https://jengus.postgresql.org/job/postgresql-18-binaries-snapshot/.
> (Though to be fair, only one distribution/arch combination was failing
> for each of them.)
>
> There is also one instance of it in
> https://jengus.postgresql.org/job/postgresql-19-binaries-snapshot/
>
> I currently have no idea what's happening.

Hmmm, strange. -2 is ENOENT, which should mean this:

-ENOENT
       The page is not present.

But what does "not present" mean in this context? And why would that be
only intermittent? Presumably this is still running in Docker, so maybe
it's another weird consequence of that?

regards

--
Tomas Vondra
Re: Tomas Vondra
> Hmmm, strange. -2 is ENOENT, which should mean this:
>
> -ENOENT
>        The page is not present.
>
> But what does "not present" mean in this context? And why would that be
> only intermittent? Presumably this is still running in Docker, so maybe
> it's another weird consequence of that?

Sorry, I forgot to mention that this is now in the normal apt.pg.o build
environment (chroots without any funky permission restrictions). I have
not tried Docker yet.

I think it was not happening before the backport of the Docker fix. But
I have no idea why this should have broken anything, and why it would
only happen like 3% of the time.

Christoph
Re: Tomas Vondra
> Hmmm, strange. -2 is ENOENT, which should mean this:
>
> -ENOENT
> The page is not present.
>
> But what does "not present" mean in this context? And why would that be
> only intermittent? Presumably this is still running in Docker, so maybe
> it's another weird consequence of that?
I've managed to reproduce it once, running this loop on
18-as-of-today. It errored out after a few 100 iterations:
while psql -c 'SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa'; do :; done
2025-12-16 11:49:35.982 UTC [621807] myon@postgres ERROR: invalid NUMA node id outside of allowed range [0, 0]: -2
2025-12-16 11:49:35.982 UTC [621807] myon@postgres STATEMENT: SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa
That was on the apt.pg.o amd64 build machine while a few things were
just building. Maybe ENOENT "The page is not present" means something
was just swapped out because the machine was under heavy load.
I tried reading the kernel source and it sounds related:
* If the source virtual memory range has any unmapped holes, or if
* the destination virtual memory range is not a whole unmapped hole,
* move_pages() will fail respectively with -ENOENT or -EEXIST. This
* provides a very strict behavior to avoid any chance of memory
* corruption going unnoticed if there are userland race conditions.
* Only one thread should resolve the userland page fault at any given
* time for any given faulting address. This means that if two threads
* try to both call move_pages() on the same destination address at the
* same time, the second thread will get an explicit error from this
* command.
...
* The UFFDIO_MOVE_MODE_ALLOW_SRC_HOLES flag can be specified to
* prevent -ENOENT errors to materialize if there are holes in the
* source virtual range that is being remapped. The holes will be
* accounted as successfully remapped in the retval of the
* command. This is mostly useful to remap hugepage naturally aligned
* virtual regions without knowing if there are transparent hugepage
* in the regions or not, but preventing the risk of having to split
* the hugepmd during the remap.
...
ssize_t move_pages(struct userfaultfd_ctx *ctx, unsigned long dst_start,
		   unsigned long src_start, unsigned long len, __u64 mode)
...
		if (!(mode & UFFDIO_MOVE_MODE_ALLOW_SRC_HOLES)) {
			err = -ENOENT;
			break;
What I don't understand yet is why this move_pages() signature does
not match the one from libnuma and move_pages(2) (note "mode" vs "flags"):
int numa_move_pages(int pid, unsigned long count,
		    void **pages, const int *nodes, int *status, int flags)
{
	return move_pages(pid, count, pages, nodes, status, flags);
}
I guess the answer is somewhere in that gap.
> ERROR: invalid NUMA node id outside of allowed range [0, 0]: -2
Maybe instead of putting sanity checks on what the kernel is
returning, we should just pass that through to the user? (Or perhaps
transform negative numbers to NULL?)
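(As an illustration of the "negative numbers to NULL" idea, the per-page
handling could look something like the sketch below; the helper and
variable names are hypothetical, not the actual shmem.c code.)

#include "postgres.h"

/*
 * Hypothetical helper: turn one move_pages() status value into the
 * numa_node column.  Negative values are errno codes (e.g. -ENOENT when
 * the page is not resident), which would become NULL instead of an error.
 */
static void
numa_node_from_status(int status, Datum *value, bool *isnull)
{
	if (status < 0)
	{
		*isnull = true;			/* swapped out / not mapped */
		*value = (Datum) 0;
	}
	else
	{
		*isnull = false;
		*value = Int32GetDatum(status);
	}
}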
Christoph
Re: To Tomas Vondra
> I've managed to reproduce it once, running this loop on
> 18-as-of-today. It errored out after a few 100 iterations:
>
> while psql -c 'SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa'; do :; done
>
> 2025-12-16 11:49:35.982 UTC [621807] myon@postgres ERROR: invalid NUMA node id outside of allowed range [0, 0]: -2
> 2025-12-16 11:49:35.982 UTC [621807] myon@postgres STATEMENT: SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa
>
> That was on the apt.pg.o amd64 build machine while a few things were
> just building. Maybe ENOENT "The page is not present" means something
> was just swapped out because the machine was under heavy load.

I played a bit more with it.

* It seems to trigger only once for a running cluster. The next one
  needs a restart
* If it doesn't trigger within the first 30s, it probably never will
* It seems easier to trigger on a system that is under load (I started
  a few pgmodeler compile runs in parallel (C++))

But none of that answers the "why".

Christoph
On 12/16/25 15:48, Christoph Berg wrote:
> Re: To Tomas Vondra
>> I've managed to reproduce it once, running this loop on
>> 18-as-of-today. It errored out after a few 100 iterations:
>>
>> while psql -c 'SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa'; do :; done
>>
>> 2025-12-16 11:49:35.982 UTC [621807] myon@postgres ERROR: invalid NUMA node id outside of allowed range [0, 0]: -2
>> 2025-12-16 11:49:35.982 UTC [621807] myon@postgres STATEMENT: SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa
>>
>> That was on the apt.pg.o amd64 build machine while a few things were
>> just building. Maybe ENOENT "The page is not present" means something
>> was just swapped out because the machine was under heavy load.
>
> I played a bit more with it.
>
> * It seems to trigger only once for a running cluster. The next one
> needs a restart
> * If it doesn't trigger within the first 30s, it probably never will
> * It seems easier to trigger on a system that is under load (I started
> a few pgmodeler compile runs in parallel (C++))
>
> But none of that answers the "why".
>
Hmmm, so this is interesting. I tried this on my workstation (with a
single NUMA node), and I see this:
1) right after opening a connection, I get this
test=# select numa_node, count(*) from pg_buffercache_numa group by 1;
 numa_node | count
-----------+-------
         0 |   290
        -2 | 32478
(2 rows)
2) but a select from pg_shmem_allocations_numa works fine
test=# select numa_node, count(*) from pg_shmem_allocations_numa group by 1;
 numa_node | count
-----------+-------
         0 |    72
(1 row)
3) and if I repeat the pg_buffercache_numa query, it now works
test=# select numa_node, count(*) from pg_buffercache_numa group by 1;
 numa_node | count
-----------+-------
         0 | 32768
(1 row)
That's a bit strange. I have no idea why this is happening. If I
reconnect, I start getting the failures again.
regards
--
Tomas Vondra
Re: Tomas Vondra
> 1) right after opening a connection, I get this
>
> test=# select numa_node, count(*) from pg_buffercache_numa group by 1;
>  numa_node | count
> -----------+-------
>          0 |   290
>         -2 | 32478

Does that mean that the "touch all pages" logic is missing in some
code paths?

But even with that, it seems to be able to degenerate again and
accepting -2 in the regression tests would be required to make it
stable.

Christoph
On 12/16/25 18:54, Christoph Berg wrote:
> Re: Tomas Vondra
>> 1) right after opening a connection, I get this
>>
>> test=# select numa_node, count(*) from pg_buffercache_numa group by 1;
>>  numa_node | count
>> -----------+-------
>>          0 |   290
>>         -2 | 32478
>
> Does that mean that the "touch all pages" logic is missing in some
> code paths?

I did check and AFAICS we are touching the pages in pg_buffercache_numa.

To make it even more confusing, I can no longer reproduce the behavior I
reported yesterday. It just consistently reports "0" and I have no idea
why it changed :-( I did restart since yesterday, so maybe that changed
something.

> But even with that, it seems to be able to degenerate again and
> accepting -2 in the regression tests would be required to make it
> stable.

No opinion yet. Either the -2 can happen occasionally, and then we'd
need to adjust the regression tests. Or maybe it's some thinko, and then
it'd be good to figure out why it's happening.

I find it interesting it does not seem to fail on the buildfarm. Or at
least I'm not aware of such failures. Even a rare failure should show
itself on the buildfarm a couple times, so how come it didn't?

regards

--
Tomas Vondra
On 12/17/25 12:07, Tomas Vondra wrote:
>
>
> On 12/16/25 18:54, Christoph Berg wrote:
>> Re: Tomas Vondra
>>> 1) right after opening a connection, I get this
>>>
>>> test=# select numa_node, count(*) from pg_buffercache_numa group by 1;
>>> numa_node | count
>>> -----------+-------
>>> 0 | 290
>>> -2 | 32478
>>
>> Does that mean that the "touch all pages" logic is missing in some
>> code paths?
>>
>
> I did check and AFAICS we are touching the pages in pg_buffercache_numa.
>
> To make it even more confusing, I can no longer reproduce the behavior I
> reported yesterday. It just consistently reports "0" and I have no idea
> why it changed :-( I did restart since yesterday, so maybe that changed
> something.
>
I kept poking at this, and I managed to reproduce it again. The key
seems to be that the system needs to be under pressure, and then it's
reliably reproducible (at least for me).
What I did is I created two instances - one to keep the system busy, one
for experimentation. The "busy" one is set to use shared_buffers=16GB,
and then running read-only pgbench.
pgbench -i -s 4500 test
pgbench -S -j 16 -c 64 -T 600 -P 1 test
The system has 64GB of RAM and 12 cores, so this is a lot of load.
Then, the other instance is set to use shared_buffers=4GB, is started
and immediately queried for NUMA info for buffers (in a loop):
pg_ctl -D data -l pg.log start;
for r in $(seq 1 10); do
    psql -p 5001 test -c 'select numa_node, count(*) from pg_buffercache_numa group by 1';
done;
pg_ctl -D data -l pg.log stop;
And this often fails like this:
----------------------------------------------------------------------
waiting for server to start.... done
server started
 numa_node |  count
-----------+---------
         0 | 1045302
        -2 |    3274
(2 rows)

 numa_node |  count
-----------+---------
         0 | 1048576
(1 row)

 numa_node |  count
-----------+---------
         0 | 1048576
(1 row)

 numa_node |  count
-----------+---------
         0 | 1048576
(1 row)

 numa_node |  count
-----------+---------
         0 | 1048576
(1 row)

 numa_node |  count
-----------+---------
         0 | 1048576
(1 row)

 numa_node |  count
-----------+---------
         0 | 1025321
        -2 |   23255
(2 rows)

 numa_node |  count
-----------+---------
         0 | 1038596
        -2 |    9980
(2 rows)

 numa_node |  count
-----------+---------
         0 | 1048518
        -2 |      58
(2 rows)

 numa_node |  count
-----------+---------
         0 | 1048525
        -2 |      51
(2 rows)
waiting for server to shut down.... done
server stopped
----------------------------------------------------------------------
So, it clearly fails quite often. And it can fail even later, after a
run that returned no "-2" buffers.
Clearly, something behaves differently than we thought. I've only seen
this happen on a system with swap - once I removed it, this behavior
disappeared too. So it seems a page can be moved to swap, in which case
we get -2 for a status.
In hindsight, that's not all that surprising. It's interesting it can
happen even with the "touching", but I guess there's a race condition
and the memory can get paged out before we inspect the status. We're
querying batches of pages, which probably makes the window larger.
FWIW I now realized I don't even need two instances. If I try this on
the "busy" instance, I get the -2 values too. Which I find a bit weird.
Because why should those be paged out?
The question is what to do about this. I don't think we can prevent the
-2 values, and error-ing out does not seem great either (most systems
have swap, so -2 may not be all that rare).
In fact, pg_shmem_allocations_numa probably should not error-out either,
because it's now reliably failing (on the busy instance).
I guess the only solution is to accept -2 as a possible value (unknown
node). But that makes regression testing harder, because it means the
output could change a lot ...
regards
--
Tomas Vondra
Re: Tomas Vondra
> I guess the only solution is to accept -2 as a possible value (unknown
> node). But that makes regression testing harder, because it means the
> output could change a lot ...

Or just not test that, or do something like

select numa_node = -2 or numa_node between 0 and 1000 from pg_shmem_allocations_numa;

Christoph
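(A hedged variant of that idea, collapsing the check into a single
deterministic row so the expected output stays stable; the upper bound
1024 is an arbitrary sanity limit, not a PostgreSQL constant:)

-- every entry either has a plausible node id or the "not resident" marker
SELECT bool_and(numa_node = -2 OR numa_node BETWEEN 0 AND 1024) AS ok
FROM pg_shmem_allocations_numa;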
On Mon, Jan 5, 2026 at 11:30 PM Christoph Berg <myon@debian.org> wrote:
>
> Re: Tomas Vondra
> > I guess the only solution is to accept -2 as a possible value (unknown
> > node). But that makes regression testing harder, because it means the
> > output could change a lot ...
Hi Tomas! That's pretty wild, nice find about that swapping s_b thing!
So just to confirm, that was reproduced outside containers/docker,
right?
> Or just not test that, or do something like
>
> select numa_node = -2 or numa_node between 0 and 1000 from pg_shmem_allocations_numa;
Well, with huge pages it should not be swappable, so another idea
would be to simply alter the first line of src/test/regress/sql/numa.sql
and sql/pg_buffercache_numa.sql like below:
- SELECT NOT(pg_numa_available()) AS skip_test \gset
+ SELECT (pg_numa_available() is false OR
          current_setting('huge_pages_status')::bool is false) as skip_test \gset
(I'm making the assumption that there are buildfarm animals with
huge_pages enabled; no idea how to check that)
-J.
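(For reference, huge_pages_status reports 'on', 'off' or 'unknown', so a
sketch of that skip check that compares the text value instead of casting
to bool might be:)

-- skip unless NUMA is usable and huge pages are actually in use
SELECT (NOT pg_numa_available()
        OR current_setting('huge_pages_status') <> 'on') AS skip_test \gset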
On 1/6/26 14:23, Jakub Wartak wrote:
> On Mon, Jan 5, 2026 at 11:30 PM Christoph Berg <myon@debian.org> wrote:
>>
>> Re: Tomas Vondra
>>> I guess the only solution is to accept -2 as a possible value (unknown
>>> node). But that makes regression testing harder, because it means the
>>> output could change a lot ...
>
> Hi Tomas! That's pretty wild, nice find about that swapping s_b thing!
> So just to confirm, that was reproduced outside containers/docker,
> right?
>
Yes, this is a regular bare-metal Debian system.
>> Or just not test that, or do something like
>>
>> select numa_node = -2 or numa_node between 0 and 1000 from pg_shmem_allocations_numa;
>
> Well, with the huge-pages it should be not swappable, so another idea
> would be simply alter first line of src/test/regress/sql/numa.sql and
> sql/pg_buffercache_numa.sql just like below:
> - SELECT NOT(pg_numa_available()) AS skip_test \gset
> + SELECT (pg_numa_available() is false OR
> current_setting('huge_pages_status')::bool is false) as skip_test
> \gset
>
> (I'm making assumption that there are buildfarm animals that
> huge_pages enabled, no idea how to check that)
>
Yes, using huge pages makes this go away.
I'm also even more sure it's about swap, because /proc/PID/smaps for
postmaster tracks how much of the mapping is in swap, and with regular
memory pages I get values like this for the main shmem segment:
Swap: 90508 kB
Swap: 275272 kB
Swap: 135020 kB
Swap: 116460 kB
Swap: 102388 kB
Swap: 93832 kB
Swap: 155616 kB
Swap: 165692 kB
These are just values from "grep" while the pgbench is running. The
instance has 16GB shared buffers, so 200MB is close to 1%. Not a huge
part, but still ...
I've always "known" shared buffers could be swapped out, but I've never
realized it would affect cases like this one.
I'm not a huge fan of fixing just the tests. Sure, the tests will pass,
but what's the point of that if you then can't run this on production
because it also fails (I mean, the pg_shmem_allocations_numa will fail)?
I think it's clear we need to tweak this to handle -2 status. And then
also adjust tests to accept non-deterministic results.
regards
--
Tomas Vondra
Hi Tomas,

On Tue, Jan 6, 2026 at 4:36 PM Tomas Vondra <tomas@vondra.me> wrote:
[..]
> I've always "known" shared buffers could be swapped out, but I've never
> realized it would affect cases like this one.

Same, I'm a little surprised by it, but it makes sense. In my old and
more recent tests I've always reasoned the following way: NUMA (2+
sockets) --> probably a big production system --> huge_pages literally
always enabled to avoid a variety of surprises (locks the region).

Also this kind of reminds me of our previous discussion about dividing
shm allocations into smaller requests (potentially 4kB shm regions that
are not huge_pages, so in theory swappable) [1].

> I'm not a huge fan of fixing just the tests. Sure, the tests will pass,
> but what's the point of that if you then can't run this on production
> because it also fails (I mean, the pg_shmem_allocations_numa will fail)?

Well, you are probably right.

> I think it's clear we need to tweak this to handle -2 status. And then
> also adjust tests to accept non-deterministic results.

The only question that remains is whether we want to expose it to the
user or not. We could:

a) silently ignore ENOENT in the back branches so that "size" won't
contain it (well, just change pg_get_shmem_allocations_numa()). It is
not part of any NUMA node anyway. Maybe we could emit a DEBUG1 message
or add a source code comment noting that we think such pages may be
swapped out.

b) not sure if it is a good idea, but in master we could expose it as a
new column "swapped_out_size" (or change the current datatype of the
"numa" column from ::integer to something like ::text, to allow
outputting numa_node as an integer but also putting node="swapped-out"
with the proper size). Sounds like a new minor feature that would be
able to tell the user that he has swapped out shm and needs to really
enable huge pages (?)

-J.

[1] - https://www.postgresql.org/message-id/jqg6jd32sw4s6gjkezauer372xrww7xnupvrcsqkegh2uhv6vg%40ppiwoigzz6v4
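(As a rough illustration of option (a), with hypothetical names rather
than the actual pg_get_shmem_allocations_numa() code, the per-page
accounting loop could simply skip non-resident pages:)

/*
 * Sketch: pages whose move_pages() status is -ENOENT (not resident,
 * e.g. swapped out) are skipped instead of raising an error, so they
 * are not attributed to any NUMA node.  All identifiers here are
 * illustrative.
 */
for (uint64 i = 0; i < page_count; i++)
{
	int		node = pages_status[i];

	if (node == -ENOENT)
		continue;				/* not resident: not on any node */

	if (node < 0 || node > max_node_id)
		elog(ERROR, "invalid NUMA node id outside of allowed range [0, %d]: %d",
			 max_node_id, node);

	node_sizes[node] += os_page_size;
}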