Обсуждение: Allow non-superuser to cancel superuser tasks.
Hi hackers!
In our Cloud we have a patch, which allows non-superuser role ('mdb_admin') to do some superuser things.  
In particular, we have a patch that allows mdb admin to cancel the autovacuum process and some other processes (processes with application_name = 'MDB'), see the attachment.
This is needed to allow non-superuser roles to run pg_repack and to cancel pg_repack. 
We need to cancel running autovac to run pg_repack (because of locks), and we need to cancel pg_repack sometimes also.
I want to reduce our internal patch size and transfer this logic to extension or to core. 
I have found similar threads [1] and [2], but, as far as I understand, they do not solve this particular case. 
I see 2 possible ways to implement this. The first one is to have hool in pg_signal_backend, and define a hook in extension which can do the thing. 
The second one is to have a predefined role. Something like a `pg_signal_autovacuum` role which can signal running autovac to cancel. But I don't see how we can handle specific `application_name` with this solution.  
Вложения
On Mon, Feb 26, 2024 at 12:38:40PM +0500, Kirill Reshke wrote: > I see 2 possible ways to implement this. The first one is to have hool in > pg_signal_backend, and define a hook in extension which can do the thing. > The second one is to have a predefined role. Something like a > `pg_signal_autovacuum` role which can signal running autovac to cancel. But > I don't see how we can handle specific `application_name` with this > solution. pg_signal_autovacuum seems useful given commit 3a9b18b. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Mon, 26 Feb 2024 at 20:10, Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Feb 26, 2024 at 12:38:40PM +0500, Kirill Reshke wrote:
> I see 2 possible ways to implement this. The first one is to have hool in
> pg_signal_backend, and define a hook in extension which can do the thing.
> The second one is to have a predefined role. Something like a
> `pg_signal_autovacuum` role which can signal running autovac to cancel. But
> I don't see how we can handle specific `application_name` with this
> solution.
pg_signal_autovacuum seems useful given commit 3a9b18b.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Thank you for your response.
Please find a patch attached. 
In patch, pg_signal_autovacuum role with oid 6312 added. I grabbed oid from unused_oids script output. 
Also, tap tests for functionality added. I'm not sure where to place them, so I placed them in a separate directory in `src/test/`
Seems that regression tests for this feature are not possible, am i right?
Also, I was thinking of pg_signal_autovacuum vs pg_signal_backend. 
Should pg_signal_autovacuum have power of pg_signal_backend (implicity)? Or should this role have such little scope...
Вложения
On Tue, 27 Feb 2024 at 01:22, Kirill Reshke <reshkekirill@gmail.com> wrote:
On Mon, 26 Feb 2024 at 20:10, Nathan Bossart <nathandbossart@gmail.com> wrote:On Mon, Feb 26, 2024 at 12:38:40PM +0500, Kirill Reshke wrote:
> I see 2 possible ways to implement this. The first one is to have hool in
> pg_signal_backend, and define a hook in extension which can do the thing.
> The second one is to have a predefined role. Something like a
> `pg_signal_autovacuum` role which can signal running autovac to cancel. But
> I don't see how we can handle specific `application_name` with this
> solution.
pg_signal_autovacuum seems useful given commit 3a9b18b.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.comThank you for your response.Please find a patch attached.In patch, pg_signal_autovacuum role with oid 6312 added. I grabbed oid from unused_oids script output.Also, tap tests for functionality added. I'm not sure where to place them, so I placed them in a separate directory in `src/test/`Seems that regression tests for this feature are not possible, am i right?Also, I was thinking of pg_signal_autovacuum vs pg_signal_backend.Should pg_signal_autovacuum have power of pg_signal_backend (implicity)? Or should this role have such little scope...
Have a little thought on this, will share.
Do we need to test the pg_cancel_backend vs autovacuum case at all?
I think we do. Would it be better to split work into 2 patches: first one with tests against current logic, and second
one with some changes/enhancements which allows to cancel running autovac to non-superuser (via `pg_signal_autovacuum` role or some other mechanism)?
On Tue, Feb 27, 2024 at 01:22:31AM +0500, Kirill Reshke wrote: > Also, tap tests for functionality added. I'm not sure where to place them, > so I placed them in a separate directory in `src/test/` > Seems that regression tests for this feature are not possible, am i right? It might be difficult to create reliable tests for pg_signal_autovacuum. If we can, it would probably be easiest to do with a TAP test. > Also, I was thinking of pg_signal_autovacuum vs pg_signal_backend. > Should pg_signal_autovacuum have power of pg_signal_backend (implicity)? Or > should this role have such little scope... -1. I don't see why giving a role privileges of pg_signal_autovacuum should also give them the ability to signal all other non-superuser backends. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Tue, Feb 27, 2024 at 11:59:00PM +0500, Kirill Reshke wrote: > Do we need to test the pg_cancel_backend vs autovacuum case at all? > I think we do. Would it be better to split work into 2 patches: first one > with tests against current logic, and second > one with some changes/enhancements which allows to cancel running autovac > to non-superuser (via `pg_signal_autovacuum` role or some other mechanism)? If we need to add tests for pg_signal_backend, I think it's reasonable to keep those in a separate patch from pg_signal_autovacuum. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Hi, 
I'm new to reviewing postgres patches, but I took an interest in reviewing this patch as recommended by Nathan.
I have the following comments:
>     if (!superuser()) {
>        if (!OidIsValid(proc->roleId)) {
>            LocalPgBackendStatus *local_beentry;
>            local_beentry = pgstat_get_local_beentry_by_backend_id(proc->backendId);
>
>            if (!(local_beentry && local_beentry->backendStatus.st_backendType == B_AUTOVAC_WORKER && 
>                has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM)))
>                    return SIGNAL_BACKEND_NOSUPERUSER;
>        } else {
>            if (superuser_arg(proc->roleId))
>                return SIGNAL_BACKEND_NOSUPERUSER;
>
>            /* Users can signal backends they have role membership in. */
>            if (!has_privs_of_role(GetUserId(), proc->roleId) &&
>                !has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_BACKEND))
>                return SIGNAL_BACKEND_NOPERMISSION;
>        }
>    }
>
1. I would suggest not to do nested if blocks since it's becoming harder to read. Also, does it make sense to have a
utilitiesfunction in backend_status.c to check if a backend of a given backend id is of a certain backend_type. And
shouldwe check if proc->backendId is invalid?
 
>    ALTER SYSTEM SET autovacuum_vacuum_cost_limit TO 1;
>    ALTER SYSTEM SET autovacuum_vacuum_cost_delay TO 100;
>    ALTER SYSTEM SET autovacuum_naptime TO 1;    
2. Could we set these parameters at the beginning of the test before $node->start with $node->append_conf ? That way we
canavoid restarting the node and doing the sleep later on.
 
> my $res_pid = $node_primary->safe_psql(
>.           'regress',
>    "SELECT pid FROM pg_stat_activity WHERE backend_type = 'autovacuum worker' and datname = 'regress';"
> );
>
> my ($res_reg_psa_1, $stdout_reg_psa_1, $stderr_reg_psa_1)  = $node_primary->psql('regress', qq[
     SET ROLE psa_reg_role_1;
>   SELECT pg_terminate_backend($res_pid);
>  ]);
>
> ok($res_reg_psa_1 != 0, "should fail for non pg_signal_autovacuum");
> like($stderr_reg_psa_1, qr/Only roles with the SUPERUSER attribute may terminate processes of roles with the
SUPERUSERattribute./, "matches");
 
>
> my ($res_reg_psa_2, $stdout_reg_psa_2, $stderr_reg_psa_2) = $node_primary->psql('regress', qq[
>    SET ROLE psa_reg_role_2;
>    SELECT pg_terminate_backend($res_pid);
> ]");
3. Some nits on styling 
4. According to Postgres styles, I believe open brackets should be in a new line 
			
		Another comment that I forgot to mention is that we should also make the documentation change in doc/src/sgml/user-manag.sgmlfor this new predefined role Thanks. -- Anthony Leung Amazon Web Services: https://aws.amazon.com
I took the liberty of continuing to work on this after chatting with Nathan. I've attached the updated patch with some improvements. Thanks. -- Anthony Leung Amazon Web Services: https://aws.amazon.com
Вложения
On Mon, Apr 01, 2024 at 02:29:29PM +0000, Leung, Anthony wrote:
> I've attached the updated patch with some improvements.
Thanks!
+      <row>
+       <entry>pg_signal_autovacuum</entry>
+       <entry>Allow terminating backend running autovacuum</entry>
+      </row>
I think we should be more precise here by calling out the exact types of
workers:
    "Allow signaling autovacuum worker processes to..."
-    if ((!OidIsValid(proc->roleId) || superuser_arg(proc->roleId)) &&
-        !superuser())
-        return SIGNAL_BACKEND_NOSUPERUSER;
-
-    /* Users can signal backends they have role membership in. */
-    if (!has_privs_of_role(GetUserId(), proc->roleId) &&
-        !has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_BACKEND))
-        return SIGNAL_BACKEND_NOPERMISSION;
+    if (!superuser())
+    {
+        if (!OidIsValid(proc->roleId))
+        {
+            /*
+             * We only allow user with pg_signal_autovacuum role to terminate
+             * autovacuum worker as an exception. 
+             */
+            if (!(pg_stat_is_backend_autovac_worker(proc->backendId) &&
+                has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM)))
+                return SIGNAL_BACKEND_NOSUPERUSER;
+        }
+        else
+        {
+            if (superuser_arg(proc->roleId))
+                return SIGNAL_BACKEND_NOSUPERUSER;
+
+            /* Users can signal backends they have role membership in. */
+            if (!has_privs_of_role(GetUserId(), proc->roleId) &&
+                !has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_BACKEND))
+                return SIGNAL_BACKEND_NOPERMISSION;
+        }
+    }
I don't think we should rely on !OidIsValid(proc->roleId) for signaling
autovacuum workers.  That might not always be true, and I don't see any
need to rely on that otherwise.  IMHO we should just add a few lines before
the existing code, which doesn't need to be changed at all:
    if (pg_stat_is_backend_autovac_worker(proc->backendId) &&
        !has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM))
        return SIGNAL_BACKEND_NOAUTOVACUUM;
I also think we need to return something besides SIGNAL_BACKEND_NOSUPERUSER
in this case.  Specifically, we probably need to introduce a new value and
provide the relevant error messages in pg_cancel_backend() and
pg_terminate_backend().
+/* ----------
+ * pg_stat_is_backend_autovac_worker() -
+ *
+ * Return whether the backend of the given backend id is of type autovacuum worker.
+ */
+bool
+pg_stat_is_backend_autovac_worker(BackendId beid)
+{
+    PgBackendStatus *ret;
+
+    Assert(beid != InvalidBackendId);
+
+    ret = pgstat_get_beentry_by_backend_id(beid);
+
+    if (!ret)
+        return false;
+
+    return ret->st_backendType == B_AUTOVAC_WORKER;
+}
Can we have this function return the backend type so that we don't have to
create a new function for every possible type?  That might be handy in the
future.
I haven't looked too closely, but I'm pretty skeptical that the test suite
in your patch would be stable.  Unfortunately, I don't have any better
ideas at the moment besides not adding a test for this new role.
-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
			
		
> On 2 Apr 2024, at 01:21, Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> I haven't looked too closely, but I'm pretty skeptical that the test suite
> in your patch would be stable.  Unfortunately, I don't have any better
> ideas at the moment besides not adding a test for this new role.
We can add tests just like [0] with injection points.
I mean replace that "sleep 1" with something like "$node->wait_for_event('autovacuum worker', 'autocauum-runing');".
Currently we have no infrastructure to wait for autovacuum of particular table, but I think it's doable.
Also I do not like that test is changing system-wide autovac settings, AFAIR these settings can be set for particular
table.
Thanks!
Best regards, Andrey Borodin.
[0] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=eeefd4280f6
			
		> I don't think we should rely on !OidIsValid(proc->roleId) for signaling
> autovacuum workers.  That might not always be true, and I don't see any
> need to rely on that otherwise.  IMHO we should just add a few lines before
> the existing code, which doesn't need to be changed at all:
> 
>    if (pg_stat_is_backend_autovac_worker(proc->backendId) &&
>        !has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM))
>        return SIGNAL_BACKEND_NOAUTOVACUUM;
I tried to add them above the existing code. When I test it locally, a user without pg_signal_autovacuum will actually
failat this block because the user is not superuser and !OidIsValid(proc->roleId) is also true in the following:
 
    /*
     * Only allow superusers to signal superuser-owned backends.  Any process
     * not advertising a role might have the importance of a superuser-owned
     * backend, so treat it that way.
     */
    if ((!OidIsValid(proc->roleId) || superuser_arg(proc->roleId)) &&
        !superuser())
        return SIGNAL_BACKEND_NOSUPERUSER;
This is what Im planning to do - If the backend is autovacuum worker and the user is not superuser or has
pg_signal_autovacuumrole, we return the new value and provide the relevant error message    
 
              /*
     * If the backend is autovacuum worker, allow user with privileges of the 
               * pg_signal_autovacuum role to signal the backend.
     */
    if (pgstat_get_backend_type(proc->backendId) == B_AUTOVAC_WORKER)
    {
        if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM) || !superuser())
            return SIGNAL_BACKEND_NOAUTOVACUUM;
    }
    /*
     * Only allow superusers to signal superuser-owned backends.  Any process
     * not advertising a role might have the importance of a superuser-owned
     * backend, so treat it that way.
    */
    else if ((!OidIsValid(proc->roleId) || superuser_arg(proc->roleId)) &&
             !superuser())
    {
        return SIGNAL_BACKEND_NOSUPERUSER;
    }
    /* Users can signal backends they have role membership in. */
    else if (!has_privs_of_role(GetUserId(), proc->roleId) &&
             !has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_BACKEND))
    {
        return SIGNAL_BACKEND_NOPERMISSION;
    }
> We can add tests just like [0] with injection points.
> I mean replace that "sleep 1" with something like "$node->wait_for_event('autovacuum worker', 'autocauum-runing');".
> Currently we have no infrastructure to wait for autovacuum of particular table, but I think it's doable.
> Also I do not like that test is changing system-wide autovac settings, AFAIR these settings can be set for particular
table.
Thanks for the suggestion. I will take a look at this. Let me also separate the test into a different patch file.
--
Anthony Leung
Amazon Web Services: https://aws.amazon.com
			
		Update - the condition should be && 
    if (pgstat_get_backend_type(proc->backendId) == B_AUTOVAC_WORKER)
    {
        if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM) && !superuser())
            return SIGNAL_BACKEND_NOAUTOVACUUM;
    }
Thanks
--
Anthony Leung
Amazon Web Services: https://aws.amazon.com
			
		On Tue, Apr 02, 2024 at 04:35:28PM +0500, Andrey M. Borodin wrote:
> We can add tests just like [0] with injection points.
> I mean replace that "sleep 1" with something like
> "$node->wait_for_event('autovacuum worker', 'autocauum-runing');".
> Currently we have no infrastructure to wait for autovacuum of
> particular table, but I think it's doable.
> Also I do not like that test is changing system-wide autovac
> settings, AFAIR these settings can be set for particular table.
Yeah, hardcoded sleeps are not really acceptable.  On fast machines
they eat in global runtime making the whole slower, impacting the CI.
On slow machines, that's not going to be stable and we have a lot of
buildfarm animals starved on CPU, like the ones running valgrind or
just because their environment is slow (one of my animals runs on a
RPI, for example).  Note that slow machines have a lot of value
because they're usually better at catching race conditions.  Injection
points would indeed make the tests more deterministic by controlling
the waits and wakeups you'd like to have in the patch's tests.
eeefd4280f6e would be a good example of how to implement a test.
--
Michael
			
		Вложения
On Thu, Apr 04, 2024 at 12:30:51AM +0000, Leung, Anthony wrote:
>>    if (pg_stat_is_backend_autovac_worker(proc->backendId) &&
>>        !has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM))
>>        return SIGNAL_BACKEND_NOAUTOVACUUM;
> 
> I tried to add them above the existing code. When I test it locally, a
> user without pg_signal_autovacuum will actually fail at this block
> because the user is not superuser and !OidIsValid(proc->roleId) is also
> true in the following:
Good catch.
> This is what Im planning to do - If the backend is autovacuum worker and
> the user is not superuser or has pg_signal_autovacuum role, we return the
> new value and provide the relevant error message
> 
>               /*
>      * If the backend is autovacuum worker, allow user with privileges of the 
>                * pg_signal_autovacuum role to signal the backend.
>      */
>     if (pgstat_get_backend_type(proc->backendId) == B_AUTOVAC_WORKER)
>     {
>         if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM) || !superuser())
>             return SIGNAL_BACKEND_NOAUTOVACUUM;
>     }
>     /*
>      * Only allow superusers to signal superuser-owned backends.  Any process
>      * not advertising a role might have the importance of a superuser-owned
>      * backend, so treat it that way.
>     */
>     else if ((!OidIsValid(proc->roleId) || superuser_arg(proc->roleId)) &&
>              !superuser())
>     {
>         return SIGNAL_BACKEND_NOSUPERUSER;
>     }
>     /* Users can signal backends they have role membership in. */
>     else if (!has_privs_of_role(GetUserId(), proc->roleId) &&
>              !has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_BACKEND))
>     {
>         return SIGNAL_BACKEND_NOPERMISSION;
>     }
There's no need for the explicit superuser() check in the
pg_signal_autovacuum section.  That's built into has_privs_of_role()
already.
-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
			
		I made some updates based on the feedbacks in v2. This patch only contains the code change for allowing the signaling toav worker with pg_signal_autovacuum. I will send a separate patch for the tap test shortly. Thanks -- Anthony Leung Amazon Web Services: https://aws.amazon.com
Вложения
Adding tap test for pg_signal_autovacuum using injection points as a separate patch. I also made a minor change on the originalpatch. Thanks. -- Anthony Amazon Web Services: https://aws.amazon.com
Вложения
> On 5 Apr 2024, at 05:03, Leung, Anthony <antholeu@amazon.com> wrote: > > Adding tap test for pg_signal_autovacuum using injection points as a separate patch. I also made a minor change on theoriginal patch. The test looks good, but: 1. remove references to passcheck :) 2. detach injection point when it's not needed anymore Thanks! Best regards, Andrey Borodin.
On Fri, Apr 05, 2024 at 12:03:05AM +0000, Leung, Anthony wrote:
> Adding tap test for pg_signal_autovacuum using injection points as a
> separate patch. I also made a minor change on the original patch.
+    ret = pgstat_get_beentry_by_proc_number(procNumber);
+
+    if (!ret)
+        return false;
An invalid BackendType is not false, but B_INVALID.
+{ oid => '6312', oid_symbol => 'ROLE_PG_SIGNAL_AUTOVACUUM',
OIDs in patches under development should use a value in the range
8000-9999.  Newly-assigned OIDs are renumbered after the feature
freeze.
+    /*
+     * If the backend is autovacuum worker, allow user with the privileges of
+     * pg_signal_autovacuum role to signal the backend.
+     */
+    if (pgstat_get_backend_type(GetNumberFromPGProc(proc)) == B_AUTOVAC_WORKER)
+    {
+        if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM))
+            return SIGNAL_BACKEND_NOAUTOVACUUM;
+    }
I was wondering why this is not done after we've checked that we have
a superuser-owned backend, and this is giving me a pause.  @Nathan,
why do you think we should not rely on the roleId for an autovacuum
worker?  In core, do_autovacuum() is only called in a process without
a role specified, and I've noticed your remark here:
https://www.postgresql.org/message-id/20240401202146.GA2354284@nathanxps13
It's feeling more natural here to check that we have a superuser-owned
backend first, and then do a lookup of the process type.
One thing that we should definitely not do is letting any user calling
pg_signal_backend() know that a given PID maps to an autovacuum
worker.  This information is hidden in pg_stat_activity.  And
actually, doesn't the patch leak this information to all users when
calling pg_signal_backend with random PID numbers because of the fact
that SIGNAL_BACKEND_NOAUTOVACUUM exists?  Any role could guess which
PIDs are used by an autovacuum worker because of the granularity
required for the error related to pg_signal_autovacuum.
+    INJECTION_POINT("autovacuum-start");
Perhaps autovacuum-worker-start is more suited here.  I am not sure
that the beginning of do_autovacuum() is the optimal location, as what
matters is that we've done InitPostgres() to be able to grab the PID
from pg_stat_activity.  This location does the job.
+if ($ENV{enable_injection_points} ne 'yes')
+{
+    plan skip_all => 'Injection points not supported by this build';
+}
[...]
+$node->safe_psql('postgres',
+    "SELECT injection_points_attach('autovacuum-start', 'wait');");
[...]
+# Wait until the autovacuum worker starts
+$node->wait_for_event('autovacuum worker', 'autovacuum-start');
This integration with injection points looks correct to me.
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
[...]
+# Copyright (c) 2024-2024, PostgreSQL Global Development Group
These need to be cleaned up.
+# Makefile for src/test/recovery
+#
+# src/test/recovery/Makefile
This is incorrect, twice.  No problems for me with using a new path in
src/test/ for that kind of tests.  There are no similar locations.
+    INSERT INTO tab_int SELECT * FROM generate_series(1, 1000000);
A good chunk of the test would be spent on that, but you don't need
that many tuples to trigger an autovacuum worker as the short naptime
is able to do it.  I would recommend to reduce that to a minimum.
+# User with signal_backend_role cannot terminate autovacuum worker
Not sure that there is a huge point in checking after a role that
holds pg_signal_backend.  An autovacuum worker is not a backend.  Only
the case of a role not member of pg_signal_autovacuum should be
enough.
+# Test signaling for pg_signal_autovacuum role.
This needs a better documentation: the purpose of the test is to
signal an autovacuum worker, aka it uses an injection point to ensure
that the worker for the whole duration of the test.
It seems to me that it would be a better practice to wakeup the
injection point and detach it before being done with the worker.
That's not mandatory but it would encourage the correct flow if this
code is copy-pasted around to other tests.
+like($psql_err, qr/ERROR:  permission denied to terminate ...
Checking only the ERRROR, and not the DETAIL should be sufficient
here.
+# User with pg_signal_backend can terminate autovacuum worker
+my $terminate_with_pg_signal_av = $node->psql('postgres', qq(
+    SET ROLE signal_autovacuum_role;
+    SELECT pg_terminate_backend($av_pid);
+), stdout => \$psql_out, stderr => \$psql_err);
+
+ok($terminate_with_pg_signal_av == 0, "Terminating autovacuum worker should succeed with pg_signal_autovacuum role");
Is that enough for the validation?  How about checking some pattern in
the server logs from an offset before running this last query?
--
Michael
			
		Вложения
On Fri, Apr 05, 2024 at 02:39:05PM +0900, Michael Paquier wrote:
> +    /*
> +     * If the backend is autovacuum worker, allow user with the privileges of
> +     * pg_signal_autovacuum role to signal the backend.
> +     */
> +    if (pgstat_get_backend_type(GetNumberFromPGProc(proc)) == B_AUTOVAC_WORKER)
> +    {
> +        if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM))
> +            return SIGNAL_BACKEND_NOAUTOVACUUM;
> +    }
> 
> I was wondering why this is not done after we've checked that we have
> a superuser-owned backend, and this is giving me a pause.  @Nathan,
> why do you think we should not rely on the roleId for an autovacuum
> worker?  In core, do_autovacuum() is only called in a process without
> a role specified, and I've noticed your remark here:
> https://www.postgresql.org/message-id/20240401202146.GA2354284@nathanxps13
> It's feeling more natural here to check that we have a superuser-owned
> backend first, and then do a lookup of the process type.
I figured since there's no reason to rely on that behavior, we might as
well do a bit of future-proofing in case autovacuum workers are ever not
run as InvalidOid.  It'd be easy enough to fix this code if that ever
happened, so I'm not too worried about this.
> One thing that we should definitely not do is letting any user calling
> pg_signal_backend() know that a given PID maps to an autovacuum
> worker.  This information is hidden in pg_stat_activity.  And
> actually, doesn't the patch leak this information to all users when
> calling pg_signal_backend with random PID numbers because of the fact
> that SIGNAL_BACKEND_NOAUTOVACUUM exists?  Any role could guess which
> PIDs are used by an autovacuum worker because of the granularity
> required for the error related to pg_signal_autovacuum.
Hm.  I hadn't considered that angle.  IIUC right now they'll just get the
generic superuser error for autovacuum workers.  I don't know how concerned
to be about users distinguishing autovacuum workers from other superuser
backends, _but_ if roles with pg_signal_autovacuum can't even figure out
the PIDs for the autovacuum workers, then this feature seems kind-of
useless.  Perhaps we should allow roles with privileges of
pg_signal_autovacuum to see the autovacuum workers in pg_stat_activity.
-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
			
		On Fri, Apr 05, 2024 at 07:56:56AM -0500, Nathan Bossart wrote: > On Fri, Apr 05, 2024 at 02:39:05PM +0900, Michael Paquier wrote: >> One thing that we should definitely not do is letting any user calling >> pg_signal_backend() know that a given PID maps to an autovacuum >> worker. This information is hidden in pg_stat_activity. And >> actually, doesn't the patch leak this information to all users when >> calling pg_signal_backend with random PID numbers because of the fact >> that SIGNAL_BACKEND_NOAUTOVACUUM exists? Any role could guess which >> PIDs are used by an autovacuum worker because of the granularity >> required for the error related to pg_signal_autovacuum. > > Hm. I hadn't considered that angle. IIUC right now they'll just get the > generic superuser error for autovacuum workers. I don't know how concerned > to be about users distinguishing autovacuum workers from other superuser > backends, _but_ if roles with pg_signal_autovacuum can't even figure out > the PIDs for the autovacuum workers, then this feature seems kind-of > useless. Perhaps we should allow roles with privileges of > pg_signal_autovacuum to see the autovacuum workers in pg_stat_activity. There is pg_read_all_stats as well, so I don't see a big issue in requiring to be a member of this role as well for the sake of what's proposing here. I'd rather not leak any information at the end for anybody calling pg_signal_backend without access to the stats, so checking the backend type after the role sounds kind of a safer long-term approach for me. -- Michael
Вложения
On Sat, Apr 06, 2024 at 08:56:04AM +0900, Michael Paquier wrote: > There is pg_read_all_stats as well, so I don't see a big issue in > requiring to be a member of this role as well for the sake of what's > proposing here. Well, that tells you quite a bit more than just which PIDs correspond to autovacuum workers, but maybe that's good enough for now. > I'd rather not leak any information at the end for > anybody calling pg_signal_backend without access to the stats, so > checking the backend type after the role sounds kind of a safer > long-term approach for me. I'm not following what you mean by this. Are you suggesting that we should keep the existing superuser message for the autovacuum workers? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Fri, Apr 05, 2024 at 08:07:51PM -0500, Nathan Bossart wrote:
> On Sat, Apr 06, 2024 at 08:56:04AM +0900, Michael Paquier wrote:
>> There is pg_read_all_stats as well, so I don't see a big issue in
>> requiring to be a member of this role as well for the sake of what's
>> proposing here.
>
> Well, that tells you quite a bit more than just which PIDs correspond to
> autovacuum workers, but maybe that's good enough for now.
That may be a good initial compromise, for now.
>> I'd rather not leak any information at the end for
>> anybody calling pg_signal_backend without access to the stats, so
>> checking the backend type after the role sounds kind of a safer
>> long-term approach for me.
>
> I'm not following what you mean by this.  Are you suggesting that we should
> keep the existing superuser message for the autovacuum workers?
Mostly.  Just to be clear the patch has the following problem:
=# CREATE ROLE popo LOGIN;
CREATE ROLE
=# CREATE EXTENSION injection_points;
CREATE EXTENSION
=# select injection_points_attach('autovacuum-start', 'wait');
 injection_points_attach
-------------------------
(1 row)
=# select pid, backend_type from pg_stat_activity
     where wait_event = 'autovacuum-start' LIMIT 1;
  pid  |   backend_type
-------+-------------------
 14163 | autovacuum worker
(1 row)
=> \c postgres popo
You are now connected to database "postgres" as user "popo".
=> select pg_terminate_backend(14163);
ERROR:  42501: permission denied to terminate autovacuum worker backend
DETAIL:  Only roles with the SUPERUSER attribute or with privileges of
the "pg_signal_autovacuum" role may terminate autovacuum worker
backend
LOCATION:  pg_terminate_backend, signalfuncs.c:267
=> select backend_type from pg_stat_activity where pid = 14163;
 backend_type
--------------
 null
(1 row)
And we should try to reshape things so as we get an ERROR like
"permission denied to terminate process" or "permission denied to
cancel query" for all the error paths, including autovacuum workers
and backends, so as we never leak any information about the backend
types involved when a role has no permission to issue the signal.
Perhaps that's the most intuitive thing as well, because autovacuum
workers are backends.  One thing that we could do is to mention both
pg_signal_backend and pg_signal_autovacuum in the errdetail, and have
both cases be handled by SIGNAL_BACKEND_NOPERMISSION on failure.
--
Michael
			
		Вложения
>>> There is pg_read_all_stats as well, so I don't see a big issue in
>>> requiring to be a member of this role as well for the sake of what's
>>> proposing here.
>>
>> Well, that tells you quite a bit more than just which PIDs correspond to
>> autovacuum workers, but maybe that's good enough for now.
>
> That may be a good initial compromise, for now.
Sounds good to me. I will update the documentation.
> And we should try to reshape things so as we get an ERROR like
> "permission denied to terminate process" or "permission denied to
> cancel query" for all the error paths, including autovacuum workers 
> and backends, so as we never leak any information about the backend
> types involved when a role has no permission to issue the signal.
> Perhaps that's the most intuitive thing as well, because autovacuum
> workers are backends. One thing that we could do is to mention both
> pg_signal_backend and pg_signal_autovacuum in the errdetail, and have
> both cases be handled by SIGNAL_BACKEND_NOPERMISSION on failure.
I understand your concern that we should avoid exposing the fact that the backend which the user is attempting to
terminateis an AV worker unless the user has pg_signal_backend privileges and pg_signal_autovacuum privileges. 
 
But Im not following how we can re-use SIGNAL_BACKEND_NOPERMISSION for this. If we return SIGNAL_BACKEND_NOPERMISSION
hereas the following, it'll stay return the "permission denied to terminate / cancel query" errmsg and errdetail in
pg_cancel/terminate_backend.
    /*
     * If the backend is autovacuum worker, allow user with the privileges of
     * pg_signal_autovacuum role to signal the backend.
     */
    if (pgstat_get_backend_type(GetNumberFromPGProc(proc)) == B_AUTOVAC_WORKER)
    {
        if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM))
            return SIGNAL_BACKEND_NOPERMISSION;
    }
Are you suggesting that we check if the backend is B_AUTOVAC in pg_cancel/ terminate_backend? That seems a bit unclean
tome since pg_cancel_backend & pg_cancel_backend does not access to the procNumber to check the type of the backend.
 
IMHO, we can keep SIGNAL_BACKEND_NOAUTOVACUUM but just improve the errmsg / errdetail to not expose that the backend is
anAV worker. It'll also be helpful if you can suggest what errdetail we should use here.
 
Thanks
--
Anthony Leung
Amazon Web Services: https://aws.amazon.com
			
		On Mon, Apr 08, 2024 at 05:42:05PM +0000, Leung, Anthony wrote: > Are you suggesting that we check if the backend is B_AUTOVAC in > pg_cancel/ terminate_backend? That seems a bit unclean to me since > pg_cancel_backend & pg_cancel_backend does not access to the > procNumber to check the type of the backend. > > IMHO, we can keep SIGNAL_BACKEND_NOAUTOVACUUM but just improve the > errmsg / errdetail to not expose that the backend is an AV > worker. It'll also be helpful if you can suggest what errdetail we > should use here. The thing is that you cannot rely on a lookup of the backend type for the error information, or you open yourself to letting the caller of pg_cancel_backend or pg_terminate_backend know if a backend is controlled by a superuser or if a backend is an autovacuum worker. And they may have no access to this information by default, except if the role is a member of pg_read_all_stats able to scan pg_stat_activity. An option that I can think of, even if it is not the most elegant ever, would be list all the possible system users that can be used in the errdetail under a single SIGNAL_BACKEND_NO* state. In the case of your patch it would mean to mention both pg_signal_backend and pg_signal_autovacuum. The choice of pg_signal_autovacuum is a bit inconsistent, as well, because autovacuum workers operate like regular backends. This name can also be confused with the autovacuum launcher. -- Michael
Вложения
Hi, thanks for looking into this.
On Tue, 9 Apr 2024 at 08:53, Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Apr 08, 2024 at 05:42:05PM +0000, Leung, Anthony wrote:
> Are you suggesting that we check if the backend is B_AUTOVAC in
> pg_cancel/ terminate_backend? That seems a bit unclean to me since
> pg_cancel_backend & pg_cancel_backend does not access to the
> procNumber to check the type of the backend.
>
> IMHO, we can keep SIGNAL_BACKEND_NOAUTOVACUUM but just improve the
> errmsg / errdetail to not expose that the backend is an AV
> worker. It'll also be helpful if you can suggest what errdetail we
> should use here.
The thing is that you cannot rely on a lookup of the backend type for
the error information, or you open yourself to letting the caller of
pg_cancel_backend or pg_terminate_backend know if a backend is
controlled by a superuser or if a backend is an autovacuum worker.
Good catch. Thanks.  I think we need to update the error message to not leak backend type info.
> The choice of pg_signal_autovacuum is a bit inconsistent, as well,
> because autovacuum workers operate like regular backends. This name
> can also be confused with the autovacuum launcher.
> because autovacuum workers operate like regular backends. This name
> can also be confused with the autovacuum launcher.
Ok. What would be a good choice? Is `pg_signal_autovacuum_worker` good enough?
On Wed, Apr 10, 2024 at 12:52:19AM +0300, Kirill Reshke wrote: > On Tue, 9 Apr 2024 at 08:53, Michael Paquier <michael@paquier.xyz> wrote: >> The thing is that you cannot rely on a lookup of the backend type for >> the error information, or you open yourself to letting the caller of >> pg_cancel_backend or pg_terminate_backend know if a backend is >> controlled by a superuser or if a backend is an autovacuum worker. > > Good catch. Thanks. I think we need to update the error message to not > leak backend type info. Yep, that's necessary I am afraid. >> The choice of pg_signal_autovacuum is a bit inconsistent, as well, >> because autovacuum workers operate like regular backends. This name >> can also be confused with the autovacuum launcher. > > Ok. What would be a good choice? Is `pg_signal_autovacuum_worker` good > enough? Sounds fine to me. Perhaps others have an opinion about that? -- Michael
Вложения
On Wed, Apr 10, 2024 at 07:58:39AM +0900, Michael Paquier wrote:
> On Wed, Apr 10, 2024 at 12:52:19AM +0300, Kirill Reshke wrote:
>> On Tue, 9 Apr 2024 at 08:53, Michael Paquier <michael@paquier.xyz> wrote:
>>> The thing is that you cannot rely on a lookup of the backend type for
>>> the error information, or you open yourself to letting the caller of
>>> pg_cancel_backend or pg_terminate_backend know if a backend is
>>> controlled by a superuser or if a backend is an autovacuum worker.
>> 
>> Good catch. Thanks.  I think we need to update the error message to not
>> leak backend type info.
> 
> Yep, that's necessary I am afraid.
Isn't it relatively easy to discover this same information today via
pg_stat_progress_vacuum?  That has the following code:
        /* Value available to all callers */
        values[0] = Int32GetDatum(beentry->st_procpid);
        values[1] = ObjectIdGetDatum(beentry->st_databaseid);
I guess I'm not quite following why we are worried about leaking whether a
backend is an autovacuum worker.
>>> The choice of pg_signal_autovacuum is a bit inconsistent, as well,
>>> because autovacuum workers operate like regular backends.  This name
>>> can also be confused with the autovacuum launcher.
>> 
>> Ok. What would be a good choice? Is `pg_signal_autovacuum_worker` good
>> enough?
> 
> Sounds fine to me.  Perhaps others have an opinion about that?
WFM
-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
			
		On Wed, Apr 10, 2024 at 10:00:34AM -0500, Nathan Bossart wrote: > Isn't it relatively easy to discover this same information today via > pg_stat_progress_vacuum? That has the following code: > > /* Value available to all callers */ > values[0] = Int32GetDatum(beentry->st_procpid); > values[1] = ObjectIdGetDatum(beentry->st_databaseid); > > I guess I'm not quite following why we are worried about leaking whether a > backend is an autovacuum worker. Good point. I've missed that we make no effort currently to hide any PID information from the progress tables. And we can guess more context data because of the per-table split of the progress tables. This choice comes down to b6fb6471f6af that has introduced the progress report facility, so this ship has long sailed it seems. And it makes my argument kind of moot. -- Michael
Вложения
Posting updated version of this patch with comments above addressed.
1) pg_signal_autovacuum -> pg_signal_autovacuum_worker, as there seems
to be no objections to that.
2)
There are comments on how to write if statement:
> In core, do_autovacuum() is only called in a process without
> a role specified
> It's feeling more natural here to check that we have a superuser-owned
> backend first, and then do a lookup of the process type.
> I figured since there's no reason to rely on that behavior, we might as
> well do a bit of future-proofing in case autovacuum workers are ever not
> run as InvalidOid.
I have combined them into this:
if ((!OidIsValid(proc->roleId) || superuser_arg(proc->roleId))
&& pgstat_get_backend_type(GetNumberFromPGProc(proc)) == B_AUTOVAC_WORKER)
This is both future-proofing and natural, I suppose. Downside of this
is double checking condition (!OidIsValid(proc->roleId) ||
superuser_arg(proc->roleId)), but i think that is ok for the sake of
simplicity.
3) pg_signal_autovacuum_worker Oid changed to random one: 8916
4)
> An invalid BackendType is not false, but B_INVALID.
fixed, thanks
5)
>>>> There is pg_read_all_stats as well, so I don't see a big issue in
>>>> requiring to be a member of this role as well for the sake of what's
>>>> proposing here.
>>>
>>> Well, that tells you quite a bit more than just which PIDs correspond to
>>> autovacuum workers, but maybe that's good enough for now.
>>
>> That may be a good initial compromise, for now.
>Sounds good to me. I will update the documentation.
@Anthony if you feel that documentation update adds much value here,
please do. Given that we know autovacuum worker PIDs from
pg_stat_progress_vacuum, I don't know how to reflect something about
pg_stat_autovac_worker in doc, and if it is worth it.
6)
> + INJECTION_POINT("autovacuum-start");
> Perhaps autovacuum-worker-start is more suited here
fixed, thanks
7)
> +# Copyright (c) 2022-2024, PostgreSQL Global Development Group
> [...]
> +# Copyright (c) 2024-2024, PostgreSQL Global Development Group
> These need to be cleaned up.
> +# Makefile for src/test/recovery
> +#
> +# src/test/recovery/Makefile
> This is incorrect, twice.
Cleaned up, thanks!
8)
> Not sure that there is a huge point in checking after a role that
> holds pg_signal_backend.
Ok. Removed.
Then:
> +like($psql_err, qr/ERROR: permission denied to terminate ...
> Checking only the ERRROR, and not the DETAIL should be sufficient
> here.
After removing the pg_signal_backend test case we have only one place
where errors check is done. So, I think we should keep DETAIL here to
ensure detail is correct (it differs from regular backend case).
9)
> +# Test signaling for pg_signal_autovacuum role.
> This needs a better documentation:
Updated. Hope now the test documentation helps to understand it.
10)
> +ok($terminate_with_pg_signal_av == 0, "Terminating autovacuum worker should succeed with pg_signal_autovacuum
role");
> Is that enough for the validation?
Added:
ok($node->log_contains(qr/FATAL: terminating autovacuum process due to
administrator command/, $offset),
"Autovacuum terminates when role is granted with pg_signal_autovacuum_worker");
11) references to `passcheck` extension removed. errors messages rephrased.
12) injection_point_detach added.
13)
> + INSERT INTO tab_int SELECT * FROM generate_series(1, 1000000);
> A good chunk of the test would be spent on that, but you don't need
> that many tuples to trigger an autovacuum worker as the short naptime
> is able to do it. I would recommend to reduce that to a minimum.
+1
Single tuple works.
14)
v3 suffers from segfault:
2024-04-11 11:28:31.116 UTC [147437] 001_signal_autovacuum.pl LOG:
statement: SELECT pg_terminate_backend(147427);
2024-04-11 11:28:31.116 UTC [147427] FATAL:  terminating autovacuum
process due to administrator command
2024-04-11 11:28:31.116 UTC [147410] LOG:  server process (PID 147427)
was terminated by signal 11: Segmentation fault
2024-04-11 11:28:31.116 UTC [147410] LOG:  terminating any other
active server processes
2024-04-11 11:28:31.117 UTC [147410] LOG:  shutting down because
restart_after_crash is off
2024-04-11 11:28:31.121 UTC [147410] LOG:  database system is shut down
The test doesn't fail because pg_terminate_backend actually meets his
point: autovac is killed. But while dying, autovac also receives
segfault. Thats because of injections points:
(gdb) bt
#0  0x000056361c3379ea in tas (lock=0x7fbcb9632224 <error: Cannot
access memory at address 0x7fbcb9632224>) at
../../../../src/include/storage/s_lock.h:228
#1  ConditionVariableCancelSleep () at condition_variable.c:238
#2  0x000056361c337e4b in ConditionVariableBroadcast
(cv=0x7fbcb66f498c) at condition_variable.c:310
#3  0x000056361c330a40 in CleanupProcSignalState (status=<optimized
out>, arg=<optimized out>) at procsignal.c:240
#4  0x000056361c328801 in shmem_exit (code=code@entry=1) at ipc.c:276
#5  0x000056361c3288fc in proc_exit_prepare (code=code@entry=1) at ipc.c:198
#6  0x000056361c3289bf in proc_exit (code=code@entry=1) at ipc.c:111
#7  0x000056361c49ffa8 in errfinish (filename=<optimized out>,
lineno=<optimized out>, funcname=0x56361c654370 <__func__.16>
"ProcessInterrupts") at elog.c:592
#8  0x000056361bf7191e in ProcessInterrupts () at postgres.c:3264
#9  0x000056361c3378d7 in ConditionVariableTimedSleep
(cv=0x7fbcb9632224, timeout=timeout@entry=-1,
wait_event_info=117440513) at condition_variable.c:196
#10 0x000056361c337d0b in ConditionVariableTimedSleep
(wait_event_info=<optimized out>, timeout=-1, cv=<optimized out>) at
condition_variable.c:135
#11 ConditionVariableSleep (cv=<optimized out>,
wait_event_info=<optimized out>) at condition_variable.c:98
#12 0x00000000b96347d0 in ?? ()
#13 0x3a3f1d9baa4f5500 in ?? ()
#14 0x000056361cc6cbd0 in ?? ()
#15 0x000056361ccac300 in ?? ()
#16 0x000056361c62be63 in ?? ()
#17 0x00007fbcb96347d0 in ?? () at injection_points.c:201 from
/home/reshke/postgres/tmp_install/home/reshke/postgres/pgbin/lib/injection_points.so
#18 0x00007fffe4122b10 in ?? ()
#19 0x00007fffe4122b70 in ?? ()
#20 0x0000000000000000 in ?? ()
discovered because of
# Release injection point.
$node->safe_psql('postgres',
"SELECT injection_point_detach('autovacuum-worker-start');");
added
v4 also suffers from that. i will try to fix that.
			
		Вложения
On Thu, Apr 11, 2024 at 04:55:59PM +0500, Kirill Reshke wrote:
> Posting updated version of this patch with comments above addressed.
I look for a commitfest entry for this one, but couldn't find it.  Would
you mind either creating one or, if I've somehow missed it, pointing me to
the existing entry?
    https://commitfest.postgresql.org/48/
-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
			
		On Thu, Apr 11, 2024 at 04:55:59PM +0500, Kirill Reshke wrote:
>> It's feeling more natural here to check that we have a superuser-owned
>> backend first, and then do a lookup of the process type.
> 
>> I figured since there's no reason to rely on that behavior, we might as
>> well do a bit of future-proofing in case autovacuum workers are ever not
>> run as InvalidOid.
> 
> I have combined them into this:
> 
> if ((!OidIsValid(proc->roleId) || superuser_arg(proc->roleId))
> && pgstat_get_backend_type(GetNumberFromPGProc(proc)) == B_AUTOVAC_WORKER)
> 
> This is both future-proofing and natural, I suppose. Downside of this
> is double checking condition (!OidIsValid(proc->roleId) ||
> superuser_arg(proc->roleId)), but i think that is ok for the sake of
> simplicity.
If we want to retain the check, IMO we might as well combine the first two
blocks like Anthony proposed:
    if (!OidIsValid(proc->roleId) || superuser_arg(proc->roleId))
    {
        ProcNumber procNumber = GetNumberFromPGProc(proc);
        PGBackendStatus procStatus = pgstat_get_beentry_by_proc_number(procNumber);
        if (procStatus && procStatus->st_backendType == B_AUTOVAC_WORKER &&
            !has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVAC_WORKER))
            return SIGNAL_BACKEND_NOAUTOVAC;
        else if (!superuser())
            return SIGNAL_BACKEND_NOSUPERUSER;
    }
+      <row>
+       <entry>pg_signal_autovacuum_worker</entry>
+       <entry>Allow signaling autovacuum worker backend to cancel or terminate</entry>
+      </row>
I think we need to be more specific about what pg_cancel_backend() and
pg_terminate_backend() do for autovacuum workers.  The code offers some
clues:
    /*
     * SIGINT is used to signal canceling the current table's vacuum; SIGTERM
     * means abort and exit cleanly, and SIGQUIT means abandon ship.
     */
    pqsignal(SIGINT, StatementCancelHandler);
    pqsignal(SIGTERM, die);
+/* ----------
+ * pgstat_get_backend_type() -
+ *
+ * Return the backend type of the backend for the given proc number.
+ * ----------
+ */
+BackendType
+pgstat_get_backend_type(ProcNumber procNumber)
+{
+    PgBackendStatus *ret;
+
+    ret = pgstat_get_beentry_by_proc_number(procNumber);
+
+    if (!ret)
+        return B_INVALID;
+
+    return ret->st_backendType;
+}
I'm not sure we really need to introduce a new function for this.  I
avoided using it in my example snippet above.  But, maybe it'll come in
handy down the road...
-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
			
		On Thu, 11 Apr 2024 at 19:07, Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Thu, Apr 11, 2024 at 04:55:59PM +0500, Kirill Reshke wrote: > > Posting updated version of this patch with comments above addressed. > > I look for a commitfest entry for this one, but couldn't find it. Would > you mind either creating one Done: https://commitfest.postgresql.org/48/4922/
On Thu, Apr 11, 2024 at 04:55:59PM +0500, Kirill Reshke wrote:
> The test doesn't fail because pg_terminate_backend actually meets his
> point: autovac is killed. But while dying, autovac also receives
> segfault. Thats because of injections points:
>
> (gdb) bt
> #0  0x000056361c3379ea in tas (lock=0x7fbcb9632224 <error: Cannot
> access memory at address 0x7fbcb9632224>) at
> ../../../../src/include/storage/s_lock.h:228
> #1  ConditionVariableCancelSleep () at condition_variable.c:238
> #2  0x000056361c337e4b in ConditionVariableBroadcast
> (cv=0x7fbcb66f498c) at condition_variable.c:310
> #3  0x000056361c330a40 in CleanupProcSignalState (status=<optimized
> out>, arg=<optimized out>) at procsignal.c:240
> #4  0x000056361c328801 in shmem_exit (code=code@entry=1) at ipc.c:276
> #5  0x000056361c3288fc in proc_exit_prepare (code=code@entry=1) at ipc.c:198
> #6  0x000056361c3289bf in proc_exit (code=code@entry=1) at ipc.c:111
> #7  0x000056361c49ffa8 in errfinish (filename=<optimized out>,
> lineno=<optimized out>, funcname=0x56361c654370 <__func__.16>
> "ProcessInterrupts") at elog.c:592
> #8  0x000056361bf7191e in ProcessInterrupts () at postgres.c:3264
> #9  0x000056361c3378d7 in ConditionVariableTimedSleep
> (cv=0x7fbcb9632224, timeout=timeout@entry=-1,
> wait_event_info=117440513) at condition_variable.c:196
> #10 0x000056361c337d0b in ConditionVariableTimedSleep
> (wait_event_info=<optimized out>, timeout=-1, cv=<optimized out>) at
> condition_variable.c:135
> #11 ConditionVariableSleep (cv=<optimized out>,
> wait_event_info=<optimized out>) at condition_variable.c:98
>
> discovered because of
> # Release injection point.
> $node->safe_psql('postgres',
> "SELECT injection_point_detach('autovacuum-worker-start');");
> added
>
> v4 also suffers from that. i will try to fix that.
I can see this stack trace as well.  Capturing a bit more than your
own stack, this is crashing in the autovacuum worker while waiting on
a condition variable when processing a ProcessInterrupts().
That may point to a legit bug with condition variables in this
context, actually?  From what I can see, issuing a signal on a backend
process waiting with a condition variable is able to process the
interrupt correctly.
--
Michael
			
		Вложения
At Fri, 12 Apr 2024 09:10:35 +0900, Michael Paquier <michael@paquier.xyz> wrote in > On Thu, Apr 11, 2024 at 04:55:59PM +0500, Kirill Reshke wrote: > > The test doesn't fail because pg_terminate_backend actually meets his > > point: autovac is killed. But while dying, autovac also receives > > segfault. Thats because of injections points: > > > > (gdb) bt > > #0 0x000056361c3379ea in tas (lock=0x7fbcb9632224 <error: Cannot > > access memory at address 0x7fbcb9632224>) at > > ../../../../src/include/storage/s_lock.h:228 > > #1 ConditionVariableCancelSleep () at condition_variable.c:238 ... > > #3 0x000056361c330a40 in CleanupProcSignalState (status=<optimized out>, arg=<optimized out>) at procsignal.c:240 > > #4 0x000056361c328801 in shmem_exit (code=code@entry=1) at ipc.c:276 > > #9 0x000056361c3378d7 in ConditionVariableTimedSleep > > (cv=0x7fbcb9632224, timeout=timeout@entry=-1, ... > I can see this stack trace as well. Capturing a bit more than your > own stack, this is crashing in the autovacuum worker while waiting on > a condition variable when processing a ProcessInterrupts(). > > That may point to a legit bug with condition variables in this > context, actually? From what I can see, issuing a signal on a backend > process waiting with a condition variable is able to process the > interrupt correctly. ProcSignalInit sets up CleanupProcSignalState to be called via on_shmem_exit. If the CV is allocated in a dsm segment, shmem_exit should have detached the region for the CV. CV cleanup code should be invoked via before_shmem_exit. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Thu, 11 Apr 2024 at 16:55, Kirill Reshke <reshkekirill@gmail.com> wrote: > 7) > > > +# Copyright (c) 2022-2024, PostgreSQL Global Development Group > > [...] > > +# Copyright (c) 2024-2024, PostgreSQL Global Development Group > > > These need to be cleaned up. > > > +# Makefile for src/test/recovery > > +# > > +# src/test/recovery/Makefile > > > This is incorrect, twice. > > Cleaned up, thanks! Oh, wait, I did this wrong. Should i use +# Copyright (c) 2024-2024, PostgreSQL Global Development Group (Like in src/test/signals/meson.build & src/test/signals/t/001_signal_autovacuum.pl) or +# +# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group +# Portions Copyright (c) 1994, Regents of the University of California +# (Like in src/test/signals/Makefile) at the beginning of each added file?
On Fri, Apr 12, 2024 at 01:32:42PM +0500, Kirill Reshke wrote: > +# > +# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group > +# Portions Copyright (c) 1994, Regents of the University of California > +# > (Like in src/test/signals/Makefile) > > at the beginning of each added file? Assuming that these files are merged in 2024, you could just use: Copyright (c) 2024, PostgreSQL Global Development Group See for example slotsync.c introduced recently in commit ddd5f4f54a02. -- Michael
Вложения
I adjusted 0001 based on my upthread feedback. -- nathan
Вложения
> On 13 Jun 2024, at 02:04, Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> I adjusted 0001 based on my upthread feedback.
This patch looks good to me. Spellchecker is complaining about “signaling” instead of “signalling”, but ISTM it’s OK.
I’ve tried to dig into the test.
The problem is CV is allocated in
inj_state = GetNamedDSMSegment("injection_points”,
which seems to be destroyed in
shmem_exit() calling dsm_backend_shutdown()
This happens before we broadcast that sleep is over.
I think this might happen with any wait on injection point if it is pg_terminate_backend()ed.
Is there way to wake up from CV sleep before processing actual termination?
Thanks!
Best regards, Andrey Borodin.
			
		On Fri, Jun 14, 2024 at 12:06:36PM +0500, Andrey M. Borodin wrote: > This patch looks good to me. Thanks for looking. > Spellchecker is complaining about "signaling" instead of "signalling", > but ISTM it´s OK. I think this is an en-US versus en-GB thing. We've standardized on en-US for "cancel" (see commits 8c9da14, 21f1e15, and af26857), so IMO we might as well do so for "signal," too. -- nathan
On Fri, Jun 14, 2024 at 12:06:36PM +0500, Andrey M. Borodin wrote:
> I’ve tried to dig into the test.
> The problem is CV is allocated in
>
> inj_state = GetNamedDSMSegment("injection_points”,
>
> which seems to be destroyed in
>
> shmem_exit() calling dsm_backend_shutdown()
>
> This happens before we broadcast that sleep is over.
> I think this might happen with any wait on injection point if it is
> pg_terminate_backend()ed.
Except if I am missing something, this is not a problem for a normal
backend, for example with one using a `SELECT injection_points_run()`.
> Is there way to wake up from CV sleep before processing actual termination?
I am honestly not sure if this is worth complicating the sigjmp path
of the autovacuum worker just for the sake of this test.  It seems to
me that it would be simple enough to move the injection point
autovacuum-worker-start within the transaction block a few lines down
in do_autovacuum(), no?
--
Michael
			
		Вложения
> On 21 Jun 2024, at 09:01, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Jun 14, 2024 at 12:06:36PM +0500, Andrey M. Borodin wrote:
>> I’ve tried to dig into the test.
>> The problem is CV is allocated in
>>
>> inj_state = GetNamedDSMSegment("injection_points”,
>>
>> which seems to be destroyed in
>>
>> shmem_exit() calling dsm_backend_shutdown()
>>
>> This happens before we broadcast that sleep is over.
>> I think this might happen with any wait on injection point if it is
>> pg_terminate_backend()ed.
>
> Except if I am missing something, this is not a problem for a normal
> backend, for example with one using a `SELECT injection_points_run()`.
Yes, i’ve tried to get similar error in other CV-sleeps and in injection points of normal backend - everything works
justfine. The error is specific to just this test. 
>> Is there way to wake up from CV sleep before processing actual termination?
>
> I am honestly not sure if this is worth complicating the sigjmp path
> of the autovacuum worker just for the sake of this test.  It seems to
> me that it would be simple enough to move the injection point
> autovacuum-worker-start within the transaction block a few lines down
> in do_autovacuum(), no?
Thanks for the pointer, I’ll try this approach!
Best regards, Andrey Borodin,
			
		On Fri, Jun 14, 2024 at 03:12:50PM -0500, Nathan Bossart wrote: > On Fri, Jun 14, 2024 at 12:06:36PM +0500, Andrey M. Borodin wrote: > > This patch looks good to me. > > Thanks for looking. While double-checking the whole, where I don't have much to say about 0001, I have fixed a few issues with the test presented upthread and stabilized it (CI and my stuff are both OK). I'd suggest to move it to test_misc/, because there is no clear category where to put it, and we have another test with injection points there for timeouts so the module dependency with EXTRA_INSTALL is already cleared. What do you think? -- Michael
Вложения
On Fri, Jun 21, 2024 at 10:31:30AM +0500, Andrey M. Borodin wrote: > Thanks for the pointer, I’ll try this approach! Thanks. FWIW, I've put my mind into it, and fixed the thing a few minutes ago: https://www.postgresql.org/message-id/ZnURUaujl39wSoEW%40paquier.xyz -- Michael
Вложения
> On 21 Jun 2024, at 10:36, Michael Paquier <michael@paquier.xyz> wrote: > > On Fri, Jun 14, 2024 at 03:12:50PM -0500, Nathan Bossart wrote: >> On Fri, Jun 14, 2024 at 12:06:36PM +0500, Andrey M. Borodin wrote: >>> This patch looks good to me. >> >> Thanks for looking. > > While double-checking the whole, where I don't have much to say about > 0001, I have fixed a few issues with the test presented upthread and > stabilized it (CI and my stuff are both OK). I'd suggest to move it > to test_misc/, because there is no clear category where to put it, and > we have another test with injection points there for timeouts so the > module dependency with EXTRA_INSTALL is already cleared. > > What do you think? Thanks Michael! All changes look good to me. I just have one more concern: we do not wakeup() upon test end. I observed that there might happen more autovacuums and startsleeping in injection point. In every case I observed - these autovacuums quit gracefully. But is it guaranteed thattest will shut down node even if some of backends are waiting in injection points? Or, perhaps, should we always wakeup() after detaching? (in case when new point run might happen) Best regards, Andrey Borodin.
I've committed 0001. It looks like 0002 failed CI testing [0], but I haven't investigated why. [0] https://cirrus-ci.com/task/5668467599212544 -- nathan
Hi On Tue, 9 Jul 2024 at 23:13, Nathan Bossart <nathandbossart@gmail.com> wrote: > > I've committed 0001. It looks like 0002 failed CI testing [0], but I > haven't investigated why. > > [0] https://cirrus-ci.com/task/5668467599212544 > > -- > nathan The problem is the error message has been changed. # DETAIL: Only roles with privileges of the "pg_signal_autovacuum_worker" role may terminate autovacuum workers.' # doesn't match '(?^:ERROR: permission denied to terminate process\nDETAIL: Only roles with privileges of the "pg_signal_autovacuum_worker" role may terminate autovacuum worker processes.)' # Looks like you failed 1 test of 2. I changed the test to match the error message.
Вложения
On Tue, Jul 09, 2024 at 01:12:59PM -0500, Nathan Bossart wrote: > I've committed 0001. It looks like 0002 failed CI testing [0], but I > haven't investigated why. > > [0] https://cirrus-ci.com/task/5668467599212544 Nice catch by the CI. This looks like a race condition to me. I think that we should wait for the autovacuum worker to exit, and then scan the server logs we expect. For this failure, look at the timestamps of the server logs: 2024-07-08 12:48:23.271 UTC [32697][client backend] [006_signal_autovacuum.pl][11/3:0] LOG: statement: SELECT pg_terminate_backend(32672); 2024-07-08 12:48:23.275 UTC [32697][client backend] [006_signal_autovacuum.pl][:0] LOG: disconnection: session time: 0:00:00.018 user=postgres database=postgres host=[local] 2024-07-08 12:48:23.278 UTC [32672][autovacuum worker] FATAL: terminating autovacuum process due to administrator command And then the timestamp of the tests: [12:48:23.277](0.058s) not ok 2 - autovacuum worker signaled with pg_signal_autovacuum_worker granted We check for the contents of the logs 1ms before they are generated, hence failing the lookup check because the test is faster than the backend. Like what we are doing in 010_pg_basebackup.pl, we could do a poll_query_until() until the PID of the autovacuum worker is gone from pg_stat_activity before fetching the logs as ProcessInterrupts() stuff would be logged before the process exits, say: +# Wait for the autovacuum worker to exit before scanning the logs. +$node->poll_query_until('postgres', + "SELECT count(*) = 0 FROM pg_stat_activity " + . "WHERE pid = $av_pid AND backend_type = 'autovacuum worker';"); That gives something like the attached. Does that look correct to you? -- Michael
Вложения
On Wed, Jul 10, 2024 at 10:03:04AM +0500, Kirill Reshke wrote: > The problem is the error message has been changed. > > # DETAIL: Only roles with privileges of the > "pg_signal_autovacuum_worker" role may terminate autovacuum workers.' > # doesn't match '(?^:ERROR: permission denied to terminate > process\nDETAIL: Only roles with privileges of the > "pg_signal_autovacuum_worker" role may terminate autovacuum worker > processes.)' > # Looks like you failed 1 test of 2. > > I changed the test to match the error message. The script has two tests, and the CI is failing for the second test where we expect the signal to be processed: [12:48:23.370] # Failed test 'autovacuum worker signaled with pg_signal_autovacuum_worker granted' [12:48:23.370] # at t/006_signal_autovacuum.pl line 90. It is true that the first test where we expect the signal to not go through also failed as the DETAIL string has been changed, which is what you've fixed :) -- Michael
Вложения
> The script has two tests, and the CI is failing for the second test > where we expect the signal to be processed: > [12:48:23.370] # Failed test 'autovacuum worker signaled with > pg_signal_autovacuum_worker granted' > [12:48:23.370] # at t/006_signal_autovacuum.pl line 90. > -- > Michael That's very strange, because the test works fine on my virtual machine. Also, it seems that it works in Cirrus [0], as there is this line: [autovacuum worker] FATAL: terminating autovacuum process due to administrator command after `SET ROLE signal_autovacuum_worker_role;` and `SELECT pg_terminate_backend` in the log file. Somehow the `node->log_contains` check does not catch that. Maybe there is some issue with `$offset`? Will try to investigate [0] https://api.cirrus-ci.com/v1/artifact/task/5668467599212544/log/src/test/modules/test_misc/tmp_check/log/006_signal_autovacuum_node.log
> On 10 Jul 2024, at 11:27, Kirill Reshke <reshkekirill@gmail.com> wrote: > > That's very strange, because the test works fine on my virtual > machine. Also, it seems that it works in Cirrus [0], as there is this > line: So far I could not reproduce that failure. I’ve checkouted 6edec53 from CFbot repository, but it works fine in both Cirrus[0,1,2] and my machines… It seems like we have to rely on intuition to know what happened. Best regards, Andrey Borodin. [0] https://github.com/x4m/postgres_g/runs/27266322657 [1] https://github.com/x4m/postgres_g/runs/27266278325 [2] https://github.com/x4m/postgres_g/runs/27266052318
Hi, that's for digging into this. Turns out I completely missed one of
your emails today morning.
On Wed, 10 Jul 2024 at 10:15, Michael Paquier <michael@paquier.xyz> wrote:
> And then the timestamp of the tests:
> [12:48:23.277](0.058s) not ok 2 - autovacuum worker signaled with
> pg_signal_autovacuum_worker granted
>
> We check for the contents of the logs 1ms before they are generated,
> hence failing the lookup check because the test is faster than the
> backend.
>
> Like what we are doing in 010_pg_basebackup.pl, we could do a
> poll_query_until() until the PID of the autovacuum worker is gone from
> pg_stat_activity before fetching the logs as ProcessInterrupts() stuff
> would be logged before the process exits, say:
> +# Wait for the autovacuum worker to exit before scanning the logs.
> +$node->poll_query_until('postgres',
> +   "SELECT count(*) = 0 FROM pg_stat_activity "
> +   . "WHERE pid = $av_pid AND backend_type = 'autovacuum worker';");
>
> That gives something like the attached.  Does that look correct to
> you?
> --
> Michael
+1.
			
		On Wed, Jul 10, 2024 at 10:57:45PM +0500, Kirill Reshke wrote: > Hi, that's for digging into this. Turns out I completely missed one of > your emails today morning. Don't worry. Using this domain tends to put my emails in one's spam folder. -- Michael
Вложения
On Wed, Jul 10, 2024 at 02:14:55PM +0900, Michael Paquier wrote:
> +# Only non-superuser roles granted pg_signal_autovacuum_worker are allowed
> +# to signal autovacuum workers.  This test uses an injection point located
> +# at the beginning of the autovacuum worker startup.
nitpick: Superuser roles are also allowed to signal autovacuum workers.
Maybe this should read "Only roles with privileges of..."
> +# Create some content and set an aggressive autovacuum.
> +$node->safe_psql(
> +    'postgres', qq(
> +    CREATE TABLE tab_int(i int);
> +    ALTER TABLE tab_int SET (autovacuum_vacuum_cost_limit = 1);
> +    ALTER TABLE tab_int SET (autovacuum_vacuum_cost_delay = 100);
> +));
> +
> +$node->safe_psql(
> +    'postgres', qq(
> +    INSERT INTO tab_int VALUES(1);
> +));
> +
> +# Wait until an autovacuum worker starts.
> +$node->wait_for_event('autovacuum worker', 'autovacuum-worker-start');
I'm not following how this is guaranteed to trigger an autovacuum quickly.
Shouldn't we set autovacuum_vacuum_insert_threshold to 1 so that it is
eligible for autovacuum?
> +# Wait for the autovacuum worker to exit before scanning the logs.
> +$node->poll_query_until('postgres',
> +    "SELECT count(*) = 0 FROM pg_stat_activity "
> +    . "WHERE pid = $av_pid AND backend_type = 'autovacuum worker';");
WFM.  Even if the PID is quickly reused, this should work.  We just might
end up waiting a little longer.
Is it worth testing cancellation, too?
-- 
nathan
			
		On Thu, Jul 11, 2024 at 08:50:57PM -0500, Nathan Bossart wrote:
> I'm not following how this is guaranteed to trigger an autovacuum quickly.
> Shouldn't we set autovacuum_vacuum_insert_threshold to 1 so that it is
> eligible for autovacuum?
You are right, this is not going to influence a faster startup of a
worker, so we could remove that entirely.  On closer look, the main
bottlebeck is that the test is spending a lot of time waiting on the
naptime even if we are using the minimum value of 1s, as the scan of
pg_stat_activity checking for workers keeps looping.
[ ..thinks.. ]
I have one trick in my sleeve for this one to make the launcher more
responsive in checking the timestamps of the database list.  That's
not perfect, but it reduces the wait behind the worker startups by
400ms (1s previously as of the naptime, 600ms now) with a reload to
set the launcher's latch after the injection point has been attached.
The difference in runtime is noticeable.
>> +# Wait for the autovacuum worker to exit before scanning the logs.
>> +$node->poll_query_until('postgres',
>> +    "SELECT count(*) = 0 FROM pg_stat_activity "
>> +    . "WHERE pid = $av_pid AND backend_type = 'autovacuum worker';");
>
> WFM.  Even if the PID is quickly reused, this should work.  We just might
> end up waiting a little longer.
>
> Is it worth testing cancellation, too?
The point is to check after pg_signal_backend, so I am not sure it is
worth the extra cycles for the cancellation.
Attaching the idea, with a fix for the comment you have mentioned and
appending "regress_" the role names for the warnings generated by
-DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS, while on it.
What do you think?
--
Michael
			
		Вложения
On Fri, Jul 12, 2024 at 02:21:09PM +0900, Michael Paquier wrote: > On Thu, Jul 11, 2024 at 08:50:57PM -0500, Nathan Bossart wrote: >> I'm not following how this is guaranteed to trigger an autovacuum quickly. >> Shouldn't we set autovacuum_vacuum_insert_threshold to 1 so that it is >> eligible for autovacuum? > > You are right, this is not going to influence a faster startup of a > worker, so we could remove that entirely. On closer look, the main > bottlebeck is that the test is spending a lot of time waiting on the > naptime even if we are using the minimum value of 1s, as the scan of > pg_stat_activity checking for workers keeps looping. I suppose it would be silly to allow even lower values for autovacuum_naptime (e.g., by moving it to ConfigureNamesReal and setting the minimum to 0.1). > I have one trick in my sleeve for this one to make the launcher more > responsive in checking the timestamps of the database list. That's > not perfect, but it reduces the wait behind the worker startups by > 400ms (1s previously as of the naptime, 600ms now) with a reload to > set the launcher's latch after the injection point has been attached. > The difference in runtime is noticeable. That's a neat trick. I was confused why this test generates an autovacuum worker at all, but I now see that you are pausing it before we even gather the list of tables that need to be vacuumed. >> Is it worth testing cancellation, too? > > The point is to check after pg_signal_backend, so I am not sure it is > worth the extra cycles for the cancellation. Agreed. > What do you think? Looks reasonable to me. -- nathan
On Fri, Jul 12, 2024 at 11:19:05AM -0500, Nathan Bossart wrote: > I suppose it would be silly to allow even lower values for > autovacuum_naptime (e.g., by moving it to ConfigureNamesReal and setting > the minimum to 0.1). I've thought about that as well, and did not mention it as this would encourage insanely low naptime values resulting in fork() bursts. > That's a neat trick. I was confused why this test generates an autovacuum > worker at all, but I now see that you are pausing it before we even gather > the list of tables that need to be vacuumed. Yep. More aggressive signals aren't going to help. One thing I also considered here is to manipulate the db list timestamps inside a USE_INJECTION_POINTS block in the launcher to make the spawn more aggressive. Anyway, with 600ms in detection where I've tested it, I can live with the responsiveness of the patch as proposed. > Looks reasonable to me. Thanks. I'll see about stressing the buildfarm tomorrow or so, after looking at how the CI reacts. -- Michael
Вложения
On Mon, Jul 15, 2024 at 09:54:43AM +0900, Michael Paquier wrote: > Thanks. I'll see about stressing the buildfarm tomorrow or so, after > looking at how the CI reacts. There were a few more things here: 1) The new test was missing from test_misc/meson.build. 2) With 1) fixed, the CI has been complaining about the test stability, when retrieving the PID of a worker with this query: SELECT pid FROM pg_stat_activity WHERE backend_type = 'autovacuum worker' And it's annoying to have missed what's wrong here: - We don't check that the PID comes from a worker waiting on an injection point, so it could be a PID of something running, still gone once the signals are processed. - No limit check, so we could finish with a list of PIDs while only one is necessary. Windows was slow enough to spot that, spawning multiple autovacuum workers waiting on the injection point. After improving all that, I have checked again the CI and it was happy, so applied on HEAD. Let's see now how the buildfarm reacts. -- Michael
Вложения
Hi, On 2024-07-09 13:12:59 -0500, Nathan Bossart wrote: > I've committed 0001. I justed ended up looking at this code for some boring reason. One thing that has me worried a bit is that pg_signal_backend() now does pgstat_get_beentry_by_proc_number(), triggering a pgstat_read_current_status() further down. pgstat_read_current_status() can be fairly expensive, both in CPU and in memory. It copies each proc's activity strings, which each can be 1kB by default! The "saving grace" is that most of the time the pid will be sourced from pg_stat_activity, which already will will have done pgstat_read_current_status(). But I don't think that's always the case. Another concern is that pgstat_read_current_status() won't actually refresh the information if already cached, which might cause us to operate on outdated information. A malicious user could start a transaction, cause pgstat_read_current_status() to be called, and wait for the PID to be reused for some interesting process, and then signal, which the outdated information from the prior pgstat_read_current_status() would allow. The threat of this isn't huge, there's some fundamental raciness around pg_signal_backend() but this does open the window a lot wider. And all of this just because we need the target proc's BackendType, a single 4 byte value. I think it'd be better to introduce something that fetches a live BackendType. We have such functionality already, see pgstat_get_backend_current_activity(). Or we could have a copy of the backend type in PGPROC. Greetings, Andres Freund
On Fri, 22 Nov 2024 at 22:13, Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2024-07-09 13:12:59 -0500, Nathan Bossart wrote: > > I've committed 0001. > > I justed ended up looking at this code for some boring reason. One thing that > has me worried a bit is that pg_signal_backend() now does > pgstat_get_beentry_by_proc_number(), triggering a pgstat_read_current_status() > further down. > > pgstat_read_current_status() can be fairly expensive, both in CPU and in > memory. It copies each proc's activity strings, which each can be 1kB by > default! > > The "saving grace" is that most of the time the pid will be sourced from > pg_stat_activity, which already will will have done > pgstat_read_current_status(). But I don't think that's always the case. > > > Another concern is that pgstat_read_current_status() won't actually refresh > the information if already cached, which might cause us to operate on outdated > information. A malicious user could start a transaction, cause > pgstat_read_current_status() to be called, and wait for the PID to be reused > for some interesting process, and then signal, which the outdated information > from the prior pgstat_read_current_status() would allow. > > The threat of this isn't huge, there's some fundamental raciness around > pg_signal_backend() but this does open the window a lot wider. > > > And all of this just because we need the target proc's BackendType, a single 4 > byte value. > > I think it'd be better to introduce something that fetches a live > BackendType. We have such functionality already, see > pgstat_get_backend_current_activity(). > > Or we could have a copy of the backend type in PGPROC. > > Greetings, > > Andres Freund Hi, thanks for taking care of this. I agree with this analysis, and it appears that the worries are legitimate. For the fix, I believe that copy-pasting `pgstat_get_backend_current_activity` to get the backend type should solve the issue. Not sure if this is the correct way of doing this. Enlarging PGPROC somehow feels worse. -- Best regards, Kirill Reshke
Hi,
On 2024-11-22 13:21:53 -0600, Nathan Bossart wrote:
> Here is a draft-grade patch for this one.  It seems pretty
> straightforward...
Thanks for looking into it quickly.
> diff --git a/src/backend/storage/ipc/signalfuncs.c b/src/backend/storage/ipc/signalfuncs.c
> index aa729a36e3..8e165e9729 100644
> --- a/src/backend/storage/ipc/signalfuncs.c
> +++ b/src/backend/storage/ipc/signalfuncs.c
> @@ -87,10 +87,7 @@ pg_signal_backend(int pid, int sig)
>       */
>      if (!OidIsValid(proc->roleId) || superuser_arg(proc->roleId))
>      {
> -        ProcNumber    procNumber = GetNumberFromPGProc(proc);
> -        PgBackendStatus *procStatus = pgstat_get_beentry_by_proc_number(procNumber);
> -
> -        if (procStatus && procStatus->st_backendType == B_AUTOVAC_WORKER)
> +        if (pgstat_get_backend_type(pid) == B_AUTOVAC_WORKER)
>          {
>              if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM_WORKER))
>                  return SIGNAL_BACKEND_NOAUTOVAC;
Because we already mapped the pid to a ProcNumber, it'd be cheaper to access
the backend status via procnumber.
> +BackendType
> +pgstat_get_backend_type(int pid)
> +{
> +    PgBackendStatus *beentry = BackendStatusArray;
> +
> +    for (int i = 1; i <= MaxBackends; i++)
> +    {
> +        volatile PgBackendStatus *vbeentry = beentry;
> +        bool        found;
> +
> +        for (;;)
> +        {
> +            int            before_changecount;
> +            int            after_changecount;
> +
> +            pgstat_begin_read_activity(vbeentry, before_changecount);
> +
> +            found = (vbeentry->st_procpid == pid);
> +
> +            pgstat_end_read_activity(vbeentry, after_changecount);
We don't need the pgstat_begin_read_activity() protocol when just accessing a
single 4 byte value - we assume in lots of places that can be read in a
non-tearable way.
> +            if (pgstat_read_activity_complete(before_changecount,
> +                                              after_changecount))
> +                break;
> +
> +            CHECK_FOR_INTERRUPTS();
> +        }
> +
> +        if (found)
> +            return beentry->st_backendType;
But if we were to follow it, we'd certainly need to use it here too.
Greetings,
Andres Freund
			
		On Sat, 23 Nov 2024, 07:44 Nathan Bossart, <nathandbossart@gmail.com> wrote:
On Fri, Nov 22, 2024 at 06:13:16PM -0500, Andres Freund wrote:
>> - if (procStatus && procStatus->st_backendType == B_AUTOVAC_WORKER)
>> + if (pgstat_get_backend_type(pid) == B_AUTOVAC_WORKER)
>
> Because we already mapped the pid to a ProcNumber, it'd be cheaper to access
> the backend status via procnumber.
D'oh, I missed that ProcNumber could be used as an index for the
BackendStatusArray. Is the attached more like what you are imagining?
> We don't need the pgstat_begin_read_activity() protocol when just accessing a
> single 4 byte value - we assume in lots of places that can be read in a
> non-tearable way.
>
>> + if (pgstat_read_activity_complete(before_changecount,
>> + after_changecount))
>> + break;
>> +
>> + CHECK_FOR_INTERRUPTS();
>> + }
>> +
>> + if (found)
>> + return beentry->st_backendType;
>
> But if we were to follow it, we'd certainly need to use it here too.
I see.
--
nathan
Hi!
LGTM
Hi, On 2024-11-22 20:44:34 -0600, Nathan Bossart wrote: > On Fri, Nov 22, 2024 at 06:13:16PM -0500, Andres Freund wrote: > >> - if (procStatus && procStatus->st_backendType == B_AUTOVAC_WORKER) > >> + if (pgstat_get_backend_type(pid) == B_AUTOVAC_WORKER) > > > > Because we already mapped the pid to a ProcNumber, it'd be cheaper to access > > the backend status via procnumber. > > D'oh, I missed that ProcNumber could be used as an index for the > BackendStatusArray. Is the attached more like what you are imagining? Yes. I'd probably add two function header comments: 1) explicit caution that this is fetching information not from the snapshot but "live" data 2) the return value might be out of date, that the procnumber needs to be valid and that the caller is responsible for permission checking I'd also add a comment do the code saying that it's fine to bypass the changecount mechanism, because we're reading a single 4 byte integer. Greetings, Andres Freund
On 2024-11-26 15:07:24 -0600, Nathan Bossart wrote: > I've attempted to add all these details in v3. LGTM!
On Tue, Nov 26, 2024 at 04:47:07PM -0500, Andres Freund wrote: > LGTM! Committed. -- nathan