Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs
Дата
Msg-id 341186.1711037256@sss.pgh.pa.us
обсуждение исходный текст
Ответы Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs
Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs
Список pgsql-hackers
[ redirecting to -hackers ]

alex work <alexwork033@gmail.com> writes:
> We encounter slow `GRANT ROLES` only on PostgreSQL 16 instances up to 42 seconds
> in production, the client process at PostgresSQL would use 100% of the CPU.
> Which is a surprise compared to other instances running older PostgreSQL
> releases. On production we have a *LOT* of ROLEs, which unfortunately a case
> that we did not test before switching the new servers into production mode.

I poked into this a bit.  It seems the problem is that as of v16, we
try to search for the "best" role membership path from the current
user to the target role, and that's done in a very brute-force way,
as a side effect of computing the set of *all* role memberships the
current role has.  In the given case, we could have skipped all that
if we simply tested whether the current role is directly a member
of the target: it is, so there can't be any shorter path.  But in
any case roles_is_member_of has horrid performance when the current
role is a member of a lot of roles.

It looks like part of the blame might be ascribable to catcache.c,
as if you look at the problem microscopically you find that
roles_is_member_of is causing catcache to make a ton of AUTHMEMMEMROLE
catcache lists, and SearchSysCacheList is just iterating linearly
through the cache's list-of-lists, so that search is where the O(N^2)
time is actually getting taken.  Up to now that code has assumed that
any one catcache would not have very many catcache lists.  Maybe it's
time to make that smarter; but since we've gotten away with this
implementation for decades, I can't help feeling that the real issue
is with roles_is_member_of's usage pattern.

For self-containedness, attached is a directly usable shell script
to reproduce the problem.  The complaint is that the last GRANT
takes multiple seconds (about 5s on my machine), rather than
milliseconds.

            regards, tom lane

#!/bin/bash

echo "CREATE ROLE acc WITH LOGIN NOSUPERUSER INHERIT CREATEDB CREATEROLE NOREPLICATION;" > create-roles.sql

#create a lot of `a_` roles and make sure `acc` is member of each one of them:
for idx1 in $(seq -w 1 100); do for idx2 in $(seq -w 1 12); do for idx3 in $(seq -w 1 10); do
 echo "CREATE ROLE a_${idx1}${idx2}${idx3} WITH NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN;"
 echo "GRANT a_${idx1}${idx2}${idx3} TO acc WITH ADMIN OPTION;"
done; done; done >>create-roles.sql

#create a lot of `d_` roles and make sure `acc` is member of each one of them:
for idx1 in $(seq -w 1 100); do for idx2 in $(seq -w 1 12); do for idx3 in $(seq -w 1 10); do
 echo "CREATE ROLE d_${idx1}${idx2}${idx3} WITH NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN;"
 echo "GRANT d_${idx1}${idx2}${idx3} TO acc WITH ADMIN OPTION;"
done; done; done >>create-roles.sql

#create a lot of `s_` roles:
for idx1 in $(seq -w 1 100); do for idx2 in $(seq -w 1 12); do for idx3 in $(seq -w 1 10); do
 echo "CREATE ROLE s_${idx1}${idx2}${idx3} WITH NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN;"
done; done; done >>create-roles.sql

time psql -f create-roles.sql -q -d postgres

time psql -U acc postgres -c 'GRANT d_0010109 TO s_0010109;'

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Treat
Дата:
Сообщение: Re: Possibility to disable `ALTER SYSTEM`
Следующее
От: Amit Langote
Дата:
Сообщение: Re: remaining sql/json patches