Thread: failed NUMA pages inquiry status: Operation not permitted

failed NUMA pages inquiry status: Operation not permitted

From: Christoph Berg
Date:
> src/test/regress/expected/numa.out       |  13 +++
> src/test/regress/expected/numa_1.out     |   5 +

numa_1.out is catching this error:

ERROR:  libnuma initialization failed or NUMA is not supported on this platform

This is what I'm getting when running PG18 in docker on Debian trixie
(libnuma 2.0.19).

However, on older distributions, the error is different:

postgres =# select * from pg_shmem_allocations_numa;
ERROR:  XX000: failed NUMA pages inquiry status: Operation not permitted
LOCATION:  pg_get_shmem_allocations_numa, shmem.c:691

This makes the numa regression tests fail in Docker on Debian bookworm
(libnuma 2.0.16) and older and all of the Ubuntu LTS releases.

The attached patch makes it accept these errors, but perhaps it would
be better to detect it in pg_numa_available().

Christoph

Attachments

Re: failed NUMA pages inquiry status: Operation not permitted

From: Tomas Vondra
Date:

On 10/16/25 13:38, Christoph Berg wrote:
>> src/test/regress/expected/numa.out       |  13 +++
>> src/test/regress/expected/numa_1.out     |   5 +
> 
> numa_1.out is catching this error:
> 
> ERROR:  libnuma initialization failed or NUMA is not supported on this platform
> 
> This is what I'm getting when running PG18 in docker on Debian trixie
> (libnuma 2.0.19).
> 
> However, on older distributions, the error is different:
> 
> postgres =# select * from pg_shmem_allocations_numa;
> ERROR:  XX000: failed NUMA pages inquiry status: Operation not permitted
> LOCATION:  pg_get_shmem_allocations_numa, shmem.c:691
> 
> This makes the numa regression tests fail in Docker on Debian bookworm
> (libnuma 2.0.16) and older and all of the Ubuntu LTS releases.
> 

It's probably more about the kernel version. What kernels are used by
these systems?

> The attached patch makes it accept these errors, but perhaps it would
> be better to detect it in pg_numa_available().
> 

Not sure how that would work. It seems this is some sort of permission
check in numa_move_pages, which is not what pg_numa_available does. Also,
it may depend on the page queried (e.g. whether it's exclusive or
shared by multiple processes).

thanks

-- 
Tomas Vondra




Re: failed NUMA pages inquiry status: Operation not permitted

From: Christoph Berg
Date:
Re: Tomas Vondra
> It's probably more about the kernel version. What kernels are used by
> these systems?

It's the very same kernel, just different docker containers on the
same system. I did not investigate yet where the problem is coming
from, different libnuma versions seemed like the best bet.

Same (differing) results on both these systems:
Linux turing 6.16.7+deb14-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.16.7-1 (2025-09-11) x86_64 GNU/Linux
Linux jenkins 6.1.0-39-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.148-1 (2025-08-26) x86_64 GNU/Linux

> Not sure how that would work. It seems this is some sort of permission
> check in numa_move_pages, which is not what pg_numa_available does. Also,
> it may depend on the page queried (e.g. whether it's exclusive or
> shared by multiple processes).

It's probably the lack of some process capability in that environment.
Maybe there is a way to query that, but I don't know much about that
yet.

Christoph



Re: failed NUMA pages inquiry status: Operation not permitted

From: Christoph Berg
Date:
Re: To Tomas Vondra
> It's the very same kernel, just different docker containers on the
> same system. I did not investigate yet where the problem is coming
> from, different libnuma versions seemed like the best bet.

numactl shows the problem already:

Host system:

$ numactl --show
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
cpubind: 0
nodebind: 0
membind: 0
preferred:

debian:trixie-slim container:

$ numactl --show
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
No NUMA support available on this system.

debian:bookworm-slim container:

$ numactl --show
get_mempolicy: Operation not permitted
get_mempolicy: Operation not permitted
get_mempolicy: Operation not permitted
get_mempolicy: Operation not permitted
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
cpubind: 0
nodebind: 0
membind: 0
preferred:

Running with sudo does not change the result.

So maybe all that's needed is a get_mempolicy() call in
pg_numa_available() ?

Christoph



Re: failed NUMA pages inquiry status: Operation not permitted

From: Christoph Berg
Date:
Re: To Tomas Vondra
> So maybe all that's needed is a get_mempolicy() call in
> pg_numa_available() ?

Or perhaps give up on pg_numa_available, and just have two _1.out and
_2.out that just contain the two different error messages, without
trying to catch the problem.

Christoph



Re: failed NUMA pages inquiry status: Operation not permitted

From: Tomas Vondra
Date:

On 10/16/25 16:54, Christoph Berg wrote:
> Re: Tomas Vondra
>> It's probably more about the kernel version. What kernels are used by
>> these systems?
> 
> It's the very same kernel, just different docker containers on the
> same system. I did not investigate yet where the problem is coming
> from, different libnuma versions seemed like the best bet.
> 
> Same (differing) results on both these systems:
> Linux turing 6.16.7+deb14-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.16.7-1 (2025-09-11) x86_64 GNU/Linux
> Linux jenkins 6.1.0-39-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.148-1 (2025-08-26) x86_64 GNU/Linux
> 

Hmmm. Those seem like relatively recent kernels.

>> Not sure how that would work. It seems this is some sort of permission
>> check in numa_move_pages, which is not what pg_numa_available does. Also,
>> it may depend on the page queried (e.g. whether it's exclusive or
>> shared by multiple processes).
> 
> It's probably the lack of some process capability in that environment.
> Maybe there is a way to query that, but I don't know much about that
> yet.
> 

The move_pages() manpage mentions PTRACE_MODE_READ_REALCREDS (see man
ptrace), so maybe that's it.

-- 
Tomas Vondra
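
For illustration, the inquiry that fails in this thread can be reproduced
outside PostgreSQL. The following is a minimal sketch, not the code in
shmem.c: probe_page_node is a hypothetical helper name, and the raw
syscall is used only so the example does not need to link against
libnuma. It assumes a Linux system where SYS_move_pages is defined.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): ask the kernel which NUMA
 * node backs the page containing "addr" in the calling process.
 *
 * Passing nodes == NULL to move_pages() turns the call into a pure status
 * inquiry; no page is migrated.
 *
 * Returns 0 if the syscall succeeded; *status then holds the node number,
 * or a negative errno value describing that single page. Returns -1 with
 * errno set (for example EPERM or ENOSYS) if the syscall itself failed.
 */
static int
probe_page_node(void *addr, int *status)
{
    void *pages[1];

    assert(status != NULL);
    pages[0] = addr;

    /* pid 0 = calling process, count = 1, nodes = NULL, flags = 0 */
    if (syscall(SYS_move_pages, (long) 0, 1UL, pages,
                (const int *) NULL, status, (long) 0) < 0)
        return -1;
    return 0;
}
```

In an environment like the bookworm container described above, one would
expect the -1 branch with errno set to EPERM; on the host, the call should
succeed and report a node number for a page that has been touched.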




Re: failed NUMA pages inquiry status: Operation not permitted

From: Christoph Berg
Date:
> So maybe all that's needed is a get_mempolicy() call in
> pg_numa_available() ?

numactl 2.0.19 --show does this:

        if (numa_available() < 0) {
                show_physcpubind();
                printf("No NUMA support available on this system.\n");
                exit(1);
        }

int numa_available(void)
{
        if (get_mempolicy(NULL, NULL, 0, 0, 0) < 0 && (errno == ENOSYS || errno == EPERM))
                return -1;
        return 0;
}

pg_numa_available is already calling numa_available.

But numactl 2.0.16 has this:

int numa_available(void)
{
    if (get_mempolicy(NULL, NULL, 0, 0, 0) < 0 && errno == ENOSYS)
        return -1;
    return 0;
}

... which does not catch the "Operation not permitted" (EPERM) error I am seeing.

So maybe PG should implement numa_available itself like that. (Or
accept the output difference so the regression tests are passing.)

Christoph
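
The suggestion above could look roughly like the sketch below. It mirrors
the numactl 2.0.19 check quoted in the message, but the function names
(errno_means_numa_unavailable, numa_available_probe) are hypothetical and
this is not a proposed PostgreSQL patch. The raw syscall is an assumption
made to keep the example independent of the installed libnuma version.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: does this errno from get_mempolicy() mean that NUMA
 * cannot be used here? ENOSYS is the only case libnuma 2.0.16 checks;
 * EPERM is the additional case handled by libnuma 2.0.19.
 */
static bool
errno_means_numa_unavailable(int err)
{
    assert(err > 0);            /* errno values are positive */
    return err == ENOSYS || err == EPERM;
}

/*
 * Hypothetical stand-in for numa_available(): returns -1 if the probe
 * fails with ENOSYS or EPERM, and 0 otherwise.
 */
static int
numa_available_probe(void)
{
    /* Same probe as libnuma: get_mempolicy(NULL, NULL, 0, 0, 0) */
    if (syscall(SYS_get_mempolicy, (int *) NULL, (unsigned long *) NULL,
                0UL, (void *) NULL, 0UL) < 0 &&
        errno_means_numa_unavailable(errno))
        return -1;
    return 0;
}
```

With a check of this shape, both containers described in the thread would
presumably report "not available" regardless of which libnuma they ship,
so the existing numa_1.out expected output could cover both.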