Re: Allow non-superuser to cancel superuser tasks.

Поиск
Список
Период
Сортировка
От Kirill Reshke
Тема Re: Allow non-superuser to cancel superuser tasks.
Дата
Msg-id CALdSSPiOLADNe1ZS-1dnQXWVXgGAC6=3TBKABu9C3_97YGOoMg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Allow non-superuser to cancel superuser tasks.  (Michael Paquier <michael@paquier.xyz>)
Ответы Re: Allow non-superuser to cancel superuser tasks.  (Nathan Bossart <nathandbossart@gmail.com>)
Re: Allow non-superuser to cancel superuser tasks.  (Nathan Bossart <nathandbossart@gmail.com>)
Re: Allow non-superuser to cancel superuser tasks.  (Michael Paquier <michael@paquier.xyz>)
Re: Allow non-superuser to cancel superuser tasks.  (Kirill Reshke <reshkekirill@gmail.com>)
Список pgsql-hackers
Posting updated version of this patch with comments above addressed.

1) pg_signal_autovacuum -> pg_signal_autovacuum_worker, as there seems
to be no objections to that.

2)
There are comments on how to write if statement:

> In core, do_autovacuum() is only called in a process without
> a role specified

> It's feeling more natural here to check that we have a superuser-owned
> backend first, and then do a lookup of the process type.

> I figured since there's no reason to rely on that behavior, we might as
> well do a bit of future-proofing in case autovacuum workers are ever not
> run as InvalidOid.

I have combined them into this:

if ((!OidIsValid(proc->roleId) || superuser_arg(proc->roleId))
&& pgstat_get_backend_type(GetNumberFromPGProc(proc)) == B_AUTOVAC_WORKER)

This is both future-proofing and natural, I suppose. Downside of this
is double checking condition (!OidIsValid(proc->roleId) ||
superuser_arg(proc->roleId)), but i think that is ok for the sake of
simplicity.

3) pg_signal_autovacuum_worker Oid changed to random one: 8916

4)

> An invalid BackendType is not false, but B_INVALID.
fixed, thanks

5)

>>>> There is pg_read_all_stats as well, so I don't see a big issue in
>>>> requiring to be a member of this role as well for the sake of what's
>>>> proposing here.
>>>
>>> Well, that tells you quite a bit more than just which PIDs correspond to
>>> autovacuum workers, but maybe that's good enough for now.
>>
>> That may be a good initial compromise, for now.

>Sounds good to me. I will update the documentation.

@Anthony if you feel that documentation update adds much value here,
please do. Given that we know autovacuum worker PIDs from
pg_stat_progress_vacuum, I don't know how to reflect something about
pg_stat_autovac_worker in doc, and if it is worth it.

6)
> + INJECTION_POINT("autovacuum-start");
> Perhaps autovacuum-worker-start is more suited here

fixed, thanks

7)

> +# Copyright (c) 2022-2024, PostgreSQL Global Development Group
> [...]
> +# Copyright (c) 2024-2024, PostgreSQL Global Development Group

> These need to be cleaned up.

> +# Makefile for src/test/recovery
> +#
> +# src/test/recovery/Makefile

> This is incorrect, twice.

Cleaned up, thanks!

8)

> Not sure that there is a huge point in checking after a role that
> holds pg_signal_backend.
Ok. Removed.

Then:

> +like($psql_err, qr/ERROR: permission denied to terminate ...
> Checking only the ERRROR, and not the DETAIL should be sufficient
> here.


After removing the pg_signal_backend test case we have only one place
where errors check is done. So, I think we should keep DETAIL here to
ensure detail is correct (it differs from regular backend case).

9)
> +# Test signaling for pg_signal_autovacuum role.
> This needs a better documentation:

Updated. Hope now the test documentation helps to understand it.

10)

> +ok($terminate_with_pg_signal_av == 0, "Terminating autovacuum worker should succeed with pg_signal_autovacuum
role");
> Is that enough for the validation?

Added:
ok($node->log_contains(qr/FATAL: terminating autovacuum process due to
administrator command/, $offset),
"Autovacuum terminates when role is granted with pg_signal_autovacuum_worker");

11) references to `passcheck` extension removed. errors messages rephrased.

12) injection_point_detach added.

13)
> + INSERT INTO tab_int SELECT * FROM generate_series(1, 1000000);
> A good chunk of the test would be spent on that, but you don't need
> that many tuples to trigger an autovacuum worker as the short naptime
> is able to do it. I would recommend to reduce that to a minimum.

+1
Single tuple works.

14)

v3 suffers from segfault:
2024-04-11 11:28:31.116 UTC [147437] 001_signal_autovacuum.pl LOG:
statement: SELECT pg_terminate_backend(147427);
2024-04-11 11:28:31.116 UTC [147427] FATAL:  terminating autovacuum
process due to administrator command
2024-04-11 11:28:31.116 UTC [147410] LOG:  server process (PID 147427)
was terminated by signal 11: Segmentation fault
2024-04-11 11:28:31.116 UTC [147410] LOG:  terminating any other
active server processes
2024-04-11 11:28:31.117 UTC [147410] LOG:  shutting down because
restart_after_crash is off
2024-04-11 11:28:31.121 UTC [147410] LOG:  database system is shut down

The test doesn't fail because pg_terminate_backend actually meets his
point: autovac is killed. But while dying, autovac also receives
segfault. Thats because of injections points:


(gdb) bt
#0  0x000056361c3379ea in tas (lock=0x7fbcb9632224 <error: Cannot
access memory at address 0x7fbcb9632224>) at
../../../../src/include/storage/s_lock.h:228
#1  ConditionVariableCancelSleep () at condition_variable.c:238
#2  0x000056361c337e4b in ConditionVariableBroadcast
(cv=0x7fbcb66f498c) at condition_variable.c:310
#3  0x000056361c330a40 in CleanupProcSignalState (status=<optimized
out>, arg=<optimized out>) at procsignal.c:240
#4  0x000056361c328801 in shmem_exit (code=code@entry=1) at ipc.c:276
#5  0x000056361c3288fc in proc_exit_prepare (code=code@entry=1) at ipc.c:198
#6  0x000056361c3289bf in proc_exit (code=code@entry=1) at ipc.c:111
#7  0x000056361c49ffa8 in errfinish (filename=<optimized out>,
lineno=<optimized out>, funcname=0x56361c654370 <__func__.16>
"ProcessInterrupts") at elog.c:592
#8  0x000056361bf7191e in ProcessInterrupts () at postgres.c:3264
#9  0x000056361c3378d7 in ConditionVariableTimedSleep
(cv=0x7fbcb9632224, timeout=timeout@entry=-1,
wait_event_info=117440513) at condition_variable.c:196
#10 0x000056361c337d0b in ConditionVariableTimedSleep
(wait_event_info=<optimized out>, timeout=-1, cv=<optimized out>) at
condition_variable.c:135
#11 ConditionVariableSleep (cv=<optimized out>,
wait_event_info=<optimized out>) at condition_variable.c:98
#12 0x00000000b96347d0 in ?? ()
#13 0x3a3f1d9baa4f5500 in ?? ()
#14 0x000056361cc6cbd0 in ?? ()
#15 0x000056361ccac300 in ?? ()
#16 0x000056361c62be63 in ?? ()
#17 0x00007fbcb96347d0 in ?? () at injection_points.c:201 from
/home/reshke/postgres/tmp_install/home/reshke/postgres/pgbin/lib/injection_points.so
#18 0x00007fffe4122b10 in ?? ()
#19 0x00007fffe4122b70 in ?? ()
#20 0x0000000000000000 in ?? ()

discovered because of
# Release injection point.
$node->safe_psql('postgres',
"SELECT injection_point_detach('autovacuum-worker-start');");
added

v4 also suffers from that. i will try to fix that.

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Zhijie Hou (Fujitsu)"
Дата:
Сообщение: RE: Synchronizing slots from primary to standby
Следующее
От: Alexander Lakhin
Дата:
Сообщение: Re: Add SPLIT PARTITION/MERGE PARTITIONS commands