Обсуждение: Proposal: two new role attributes and/or capabilities?

Поиск
Список
Период
Сортировка

Proposal: two new role attributes and/or capabilities?

От
José Luis Tallón
Дата:
Hello,
    I've found myself needing two role capabilities? as of lately, when 
thinking about restricting some roles to the barely minimum allowed 
permissions needed to perform their duties ... as opposed to having a 
superuser role devoted to these task.
    The "capabilities" would be:
* MAINTENANCE --- Ability to run    VACUUM [ANALYZE | FREEZE] (but not VACUUM FULL),    ANALYZE (including SET LOCAL
statistics_targetTO 10000),    REINDEX CONCURRENTLY  (but not the blocking, regular, one)    REFRESH MATERIALIZED VIEW
CONCURRENTLY(but not the blocking one)    COPY ???
 
    Rationale: delegate the routine maintenance tasks to a low 
privilege role, which can't do harm (apart from some performance 
degradation) --- hence the "no exclusive locking operations" requirement.

* IMPERSONATE --- Ability to do "SET AUTHORIZATION TO some_role;" and 
"RESET AUTHORIZATION"    This might be further refined to provide a way to say "This role is 
authorized to impersonate role1 but no other"
    Rationale: for use by connection poolers (esp. pgBouncer), where 
the role used for connection would only have the LOGIN and IMPERSONATE 
privileges. The remaining operations would be authorized against the 
supplanted role (i.e. ability to create tables/indexes or views, perform 
DML and/or DDL, etc)    AFAIK, a superuser role is needed for this purpose currently.

    The relevant code is quite simple and looks like it could be very 
useful. Any suggestions / input on this?    I can certainly prepare a patch for this (bear with me, It'll be my 
first here), and I'm willing to include more features if deemed useful.



Regards,
    / J.L.




Re: Proposal: two new role attributes and/or capabilities?

От
Stephen Frost
Дата:
* José Luis Tallón (jltallon@adv-solutions.net) wrote:
>     I've found myself needing two role capabilities? as of lately,
> when thinking about restricting some roles to the barely minimum
> allowed permissions needed to perform their duties ... as opposed to
> having a superuser role devoted to these task.

Excellent.  We've been looking at the same considerations.

>     The "capabilities" would be:
> * MAINTENANCE --- Ability to run
>     VACUUM [ANALYZE | FREEZE] (but not VACUUM FULL),
>     ANALYZE (including SET LOCAL statistics_target TO 10000),

There's likely to be discussion about these from the perspective that
you really shouldn't need to run them all that much.  Why isn't
autovacuum able to handle this?

>     REINDEX CONCURRENTLY  (but not the blocking, regular, one)
>     REFRESH MATERIALIZED VIEW CONCURRENTLY (but not the blocking one)

These are interesting, but would these make sense at the role level?
Both of these commands explicitly take specific relations to operate
against, after all.

>     COPY ???

The question around this one goes back to the "CREATE DIRECTORY"
discussion that happened this fall.  I'm still hopeful that we can do
*something* in this area, but I'm not sure what that's going to end up
looking like.  The problem with COPY is that it's either trivial to use
it to become a superuser, or insanely difficult to secure sufficiently.

>     Rationale: delegate the routine maintenance tasks to a low
> privilege role, which can't do harm (apart from some performance
> degradation) --- hence the "no exclusive locking operations"
> requirement.

This makes sense for the reindex/refresh cases, though "no harm" might
be over-stating it.

> * IMPERSONATE --- Ability to do "SET AUTHORIZATION TO some_role;"
> and "RESET AUTHORIZATION"
>     This might be further refined to provide a way to say "This role
> is authorized to impersonate role1 but no other"
>     Rationale: for use by connection poolers (esp. pgBouncer), where
> the role used for connection would only have the LOGIN and
> IMPERSONATE privileges. The remaining operations would be authorized
> against the supplanted role (i.e. ability to create tables/indexes
> or views, perform DML and/or DDL, etc)
>     AFAIK, a superuser role is needed for this purpose currently.

No..  You can have 'no-inherit' roles which you can use for exactly this
purpose.  The initial login role can have no rights on the database,
except to SET ROLE to other roles which have been granted to it.

You should never have your pgBouncer or other pooling connection logging
in as a superuser.
Thanks,
    Stephen

Re: Proposal: two new role attributes and/or capabilities?

От
José Luis Tallón
Дата:
On 12/23/2014 05:29 PM, Stephen Frost wrote:
> * José Luis Tallón (jltallon@adv-solutions.net) wrote:
>>      I've found myself needing two role capabilities? as of lately,
>> when thinking about restricting some roles to the barely minimum
>> allowed permissions needed to perform their duties ... as opposed to
>> having a superuser role devoted to these tasks.
> Excellent.  We've been looking at the same considerations.
>
>>      The "capabilities" would be:
>> * MAINTENANCE --- Ability to run
>>      VACUUM [ANALYZE | FREEZE] (but not VACUUM FULL),
>>      ANALYZE (including SET LOCAL statistics_target TO 10000),
> There's likely to be discussion about these from the perspective that
> you really shouldn't need to run them all that much.  Why isn't
> autovacuum able to handle this?

For some (arguably, ill-devised) use cases of INSERT - SELECT aggregate 
- DELETE (third party, closed-source app, massive insert rate) at the 
very least, autovacuum can't possibly cope with the change rate in some 
tables, given that there are quite many other interactive queries running.

Manually performing VACUUM / VACUUM ANALYZE on the (few) affected tables 
every 12h or so fixes the performance problem for the particular queries 
without impacting the other users too much --- the tables and indexes in 
question have been moved to a separate tablespace/disk volume of their own.


In short, this addresses situations where some tables have a much higher 
update rate than the rest of the database so that performance degrades 
with time --- the application became unusable after about 6 days' worth 
of updates until the manual vacuums were setup

>>      REINDEX CONCURRENTLY  (but not the blocking, regular, one)
>>      REFRESH MATERIALIZED VIEW CONCURRENTLY (but not the blocking one)
> These are interesting, but would these make sense at the role level?
> Both of these commands explicitly take specific relations to operate
> against, after all.

Yup. Let's imagine a cron job invoking psql in order to perform 
maintenance routine.
The particular command(s) can be generated on-the-fly by querying the 
catalog and then send them in one go to be run sequentially by the one 
backend.... as a crude form of rate limiting/quality-of-service of sorts 
("renice -p" or even "ionice -p" seems quite inadequate).

This automation becomes impossible to do if the object owners differ 
(only the owner or a superuser can perform these operations AFAICS -- 
there is no mention of it in the current documentation) unless the DBA 
makes the maintenance role a member of every other role ... which 
quickly becomes a problem.

>>      COPY ???
> The question around this one goes back to the "CREATE DIRECTORY"
> discussion that happened this fall.  I'm still hopeful that we can do
> *something* in this area, but I'm not sure what that's going to end up
> looking like.  The problem with COPY is that it's either trivial to use
> it to become a superuser, or insanely difficult to secure sufficiently.

Yes. That's the reason for the question marks  :-\
Some "dump to csv then load somewhere else" kind of jobs might benefit 
from this feature, but I'm not sure the convenience is worth the risk.

>>      Rationale: delegate the routine maintenance tasks to a low
>> privilege role, which can't do harm (apart from some performance
>> degradation) --- hence the "no exclusive locking operations"
>> requirement.
> This makes sense for the reindex/refresh cases, though "no harm" might
> be over-stating it.

Well.... it's performance degradation vs DoS due to massive (exclusive) 
locking  :S
At least restricting it to one backend (connection_limit=1) allows quite 
some rate limit.

>> * IMPERSONATE --- Ability to do "SET AUTHORIZATION TO some_role;"
>> and "RESET AUTHORIZATION"
>>      This might be further refined to provide a way to say "This role
>> is authorized to impersonate role1 but no other"
>>      Rationale: for use by connection poolers (esp. pgBouncer), where
>> the role used for connection would only have the LOGIN and
>> IMPERSONATE privileges. The remaining operations would be authorized
>> against the supplanted role (i.e. ability to create tables/indexes
>> or views, perform DML and/or DDL, etc)
>>      AFAIK, a superuser role is needed for this purpose currently.
> No..  You can have 'no-inherit' roles which you can use for exactly this
> purpose.  The initial login role can have no rights on the database,
> except to SET ROLE to other roles which have been granted to it.

Hmm.... the current documentation states that: "The specified role_name 
must be a role that the current session user is a member of".
I can see use cases where making the login role a member of every other 
used role quickly becomes a burden, and that's the main driver for this 
feature (I'm thinking about multiple app servers running several 
applications each, minimum two roles per application)

> You should never have your pgBouncer or other pooling connection logging
> in as a superuser.

At least the default pgBouncer config explicitly says (albeit for 8.2)
doc/faq.txt:server_reset_query = RESET ALL; SET SESSION AUTHORIZATION 
DEFAULT; ... so at least some people (including me) had assumed that SET 
SESSION AUTHORIZATION (with force_user) is being used, and this is 
described as superuser only.
However, a quick look at pgbouncer's sources shows we were wrong.

Thank you for the clarification, Stephen.

    / J.L.




Re: Proposal: two new role attributes and/or capabilities?

От
David G Johnston
Дата:
José Luis Tallón-2 wrote
> On 12/23/2014 05:29 PM, Stephen Frost wrote:
>> * José Luis Tallón (

> jltallon@

> ) wrote:
>>> * IMPERSONATE --- Ability to do "SET AUTHORIZATION TO some_role;"
>>> and "RESET AUTHORIZATION"
>>>      This might be further refined to provide a way to say "This role
>>> is authorized to impersonate role1 but no other"
>>>      Rationale: for use by connection poolers (esp. pgBouncer), where
>>> the role used for connection would only have the LOGIN and
>>> IMPERSONATE privileges. The remaining operations would be authorized
>>> against the supplanted role (i.e. ability to create tables/indexes
>>> or views, perform DML and/or DDL, etc)
>>>      AFAIK, a superuser role is needed for this purpose currently.
>> No..  You can have 'no-inherit' roles which you can use for exactly this
>> purpose.  The initial login role can have no rights on the database,
>> except to SET ROLE to other roles which have been granted to it.
>
> Hmm.... the current documentation states that: "The specified role_name
> must be a role that the current session user is a member of".
> I can see use cases where making the login role a member of every other
> used role quickly becomes a burden, and that's the main driver for this
> feature (I'm thinking about multiple app servers running several
> applications each, minimum two roles per application)

So you want to say:

GRANT IMPERSONATE TO bouncer; --covers the "ALL" requirement

instead of

GRANT victim1 TO bouncer;
GRANT victim2 TO bouncer;
etc...

-- these would still be used to cover the "limited users" requirement
?

Seems contrary to the principle of least privilege goal...

I'd rather there be better, more user friendly, SQL-based APIs to the
permissions system that would facilitate performing and reviewing grants.

If something like IMPERSONATE was added I would strongly suggest a
corresponding "[NO]IMPERSONATE" for CREATE USER so that the admin can make
specific roles unimpersonable - and also make SUPERUSER roles unimpersonable
by rule.

David J.




--
View this message in context:
http://postgresql.nabble.com/Proposal-two-new-role-attributes-and-or-capabilities-tp5831859p5831868.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: Proposal: two new role attributes and/or capabilities?

От
José Luis Tallón
Дата:
On 12/23/2014 07:01 PM, David G Johnston wrote:
> Hmm.... the current documentation states that: "The specified 
> role_name must be a role that the current session user is a member 
> of". I can see use cases where making the login role a member of every 
> other used role quickly becomes a burden, and that's the main driver 
> for this feature (I'm thinking about multiple app servers running 
> several applications each, minimum two roles per application)
> So you want to say:
>
> GRANT IMPERSONATE TO bouncer; --covers the "ALL" requirement

Yes, and exclusively for this purpose.

> instead of
>
> GRANT victim1 TO bouncer;
> GRANT victim2 TO bouncer;
> etc...
>
> -- these would still be used to cover the "limited users" requirement
> ?

Yup.

> Seems contrary to the principle of least privilege goal...

We still wouldn't grant any CREATE DATABASE, CREATE TABLESPACE, 
CREATE/LOAD EXTENSION, CREATE LANGUAGE, etc (and the ability to create/use/manipulate data within the database 
will still be constrained by the impersonated login)

> I'd rather there be better, more user friendly, SQL-based APIs to the
> permissions system that would facilitate performing and reviewing grants.
+1. All suggestions welcome.
> If something like IMPERSONATE was added I would strongly suggest a
> corresponding "[NO]IMPERSONATE" for CREATE USER so that the admin can make
> specific roles unimpersonable
Indeed, I had thought about this too.

> - and also make SUPERUSER roles unimpersonable by rule.

Yes, of course. Otherwise, the distinction would not have any sense.


Thanks,
    J.L.




Re: Proposal: two new role attributes and/or capabilities?

От
Stephen Frost
Дата:
* José Luis Tallón (jltallon@adv-solutions.net) wrote:
> On 12/23/2014 05:29 PM, Stephen Frost wrote:
> >>     The "capabilities" would be:
> >>* MAINTENANCE --- Ability to run
> >>     VACUUM [ANALYZE | FREEZE] (but not VACUUM FULL),
> >>     ANALYZE (including SET LOCAL statistics_target TO 10000),
> >There's likely to be discussion about these from the perspective that
> >you really shouldn't need to run them all that much.  Why isn't
> >autovacuum able to handle this?
>
> For some (arguably, ill-devised) use cases of INSERT - SELECT
> aggregate - DELETE (third party, closed-source app, massive insert
> rate) at the very least, autovacuum can't possibly cope with the
> change rate in some tables, given that there are quite many other
> interactive queries running.
>
> Manually performing VACUUM / VACUUM ANALYZE on the (few) affected
> tables every 12h or so fixes the performance problem for the
> particular queries without impacting the other users too much ---
> the tables and indexes in question have been moved to a separate
> tablespace/disk volume of their own.

Autovacuum can certainly run vacuum/analyze on a few tables every 12
hours, so I'm not really following where you see autovacuum being unable
to cope.  I agree that there *are* such cases, but getting more
information about those cases and exactly what solution *does* work
would really help us improve autovacuum to address those use-cases.

> In short, this addresses situations where some tables have a much
> higher update rate than the rest of the database so that performance
> degrades with time --- the application became unusable after about 6
> days' worth of updates until the manual vacuums were setup

This really looks like a configuration issue with autovacuum..  Perhaps
you need to make it more aggressive than the default and have it run
more threads?  Have you turned the autovacuum logging up all the way?
Is autovacuum giving up due to locking?

> >>     REINDEX CONCURRENTLY  (but not the blocking, regular, one)
> >>     REFRESH MATERIALIZED VIEW CONCURRENTLY (but not the blocking one)
> >These are interesting, but would these make sense at the role level?
> >Both of these commands explicitly take specific relations to operate
> >against, after all.
>
> Yup. Let's imagine a cron job invoking psql in order to perform
> maintenance routine.

If they make sense at a relation level then they should be
relation-level GRANT'd permissions, not role-level attributes.

> The particular command(s) can be generated on-the-fly by querying
> the catalog and then send them in one go to be run sequentially by
> the one backend.... as a crude form of rate
> limiting/quality-of-service of sorts ("renice -p" or even "ionice
> -p" seems quite inadequate).

This sounds like it's something that we might want an autovacuum-like
background process to handle..  Some kind of auto-reindex-concurrently.
There are already plans to deal with updating of materialized views, as
I understand it.

> This automation becomes impossible to do if the object owners differ
> (only the owner or a superuser can perform these operations AFAICS
> -- there is no mention of it in the current documentation) unless
> the DBA makes the maintenance role a member of every other role ...
> which quickly becomes a problem.

I agree that having a maintenance role which is a member of every other
role isn't a very good solution.

> >>     COPY ???
> >The question around this one goes back to the "CREATE DIRECTORY"
> >discussion that happened this fall.  I'm still hopeful that we can do
> >*something* in this area, but I'm not sure what that's going to end up
> >looking like.  The problem with COPY is that it's either trivial to use
> >it to become a superuser, or insanely difficult to secure sufficiently.
>
> Yes. That's the reason for the question marks  :-\
> Some "dump to csv then load somewhere else" kind of jobs might
> benefit from this feature, but I'm not sure the convenience is worth
> the risk.

I've run into quite a few processes which would really benefit from
this, and would even be safe to use (the processes running the COPY
commands don't have any rights on the directories except through PG),
but it's not clear if that use-case is sufficiently broad for the
feature to be worthwhile..  At least, some feel it isn't.  Can you
describe your use-case more and perhaps the needle will move on that
point?

> >>* IMPERSONATE --- Ability to do "SET AUTHORIZATION TO some_role;"
> >>and "RESET AUTHORIZATION"
> >>     This might be further refined to provide a way to say "This role
> >>is authorized to impersonate role1 but no other"
> >>     Rationale: for use by connection poolers (esp. pgBouncer), where
> >>the role used for connection would only have the LOGIN and
> >>IMPERSONATE privileges. The remaining operations would be authorized
> >>against the supplanted role (i.e. ability to create tables/indexes
> >>or views, perform DML and/or DDL, etc)
> >>     AFAIK, a superuser role is needed for this purpose currently.
> >No..  You can have 'no-inherit' roles which you can use for exactly this
> >purpose.  The initial login role can have no rights on the database,
> >except to SET ROLE to other roles which have been granted to it.
>
> Hmm.... the current documentation states that: "The specified
> role_name must be a role that the current session user is a member
> of".

That's correct.

> I can see use cases where making the login role a member of every
> other used role quickly becomes a burden, and that's the main driver
> for this feature (I'm thinking about multiple app servers running
> several applications each, minimum two roles per application)

I suppose, but I think you're over-simplying your use-case.  Consider
this- do you want this login role to be a member of superuser roles?
What about DBA or other privilege-user roles?

> However, a quick look at pgbouncer's sources shows we were wrong.
>
> Thank you for the clarification, Stephen.

Happy to help!
Thanks,
    Stephen

Re: Proposal: two new role attributes and/or capabilities?

От
José Luis Tallón
Дата:
<div class="moz-cite-prefix">On 12/23/2014 07:01 PM, David G Johnston wrote:<br /></div><blockquote
cite="mid:1419357715258-5831868.post@n5.nabble.com"type="cite"><pre wrap="">[snip]
 
So you want to say:

GRANT IMPERSONATE TO bouncer; --covers the "ALL" requirement

instead of

GRANT victim1 TO bouncer;
GRANT victim2 TO bouncer;
etc...

-- these would still be used to cover the "limited users" requirement
?</pre></blockquote><br /><code>GRANT IMPERSONATE ON actual_role TO login_role</code><br /><br />     would actually
getus closer to how some other databases do, now that I think of it. This could be just some syntactic sugar.<br />    
Mightdefinitively ease migrations, if nothing else.<br /><br /><br /> I appreciate the feedback. Thanks!<br /><br /><br
/>    / J.L.<br /><br /> 

Re: Proposal: two new role attributes and/or capabilities?

От
Stephen Frost
Дата:
* David G Johnston (david.g.johnston@gmail.com) wrote:
> I'd rather there be better, more user friendly, SQL-based APIs to the
> permissions system that would facilitate performing and reviewing grants.

This would be *really* nice, I agree.  I've heard tale of people writing
functions that go through the catalog based on a given user and spit
back everything that they have permissions to.  Would be really nice if
we had those kinds of functions built-in.

> If something like IMPERSONATE was added I would strongly suggest a
> corresponding "[NO]IMPERSONATE" for CREATE USER so that the admin can make
> specific roles unimpersonable - and also make SUPERUSER roles unimpersonable
> by rule.

I agree that this would be necessary..  but strikes me as less of a
complete solution than what the existing pg_auth_members approach grants
you.

Perhaps a better idea would be to simply make the bouncer unnecessary by
having a in-PG connection pooler type of system.  That's been discussed
previously and shot down but it's still one of those things that's on my
wish-list for PG.
Thanks,
    Stephen

Re: Proposal: two new role attributes and/or capabilities?

От
Stephen Frost
Дата:
* José Luis Tallón (jltallon@adv-solutions.net) wrote:
> On 12/23/2014 07:01 PM, David G Johnston wrote:
> >[snip]
> >So you want to say:
> >
> >GRANT IMPERSONATE TO bouncer; --covers the "ALL" requirement
> >
> >instead of
> >
> >GRANT victim1 TO bouncer;
> >GRANT victim2 TO bouncer;
> >etc...
> >
> >-- these would still be used to cover the "limited users" requirement
> >?
>
> |GRANT IMPERSONATE ON actual_role TO login_role|
>
>     would actually get us closer to how some other databases do, now
> that I think of it. This could be just some syntactic sugar.
>     Might definitively ease migrations, if nothing else.

Uh, how is this different from GRANT actual_role TO login_role, with use
of noinherit..?
THanks,
    Stephen

Re: Proposal: two new role attributes and/or capabilities?

От
José Luis Tallón
Дата:
On 12/23/2014 07:52 PM, Stephen Frost wrote:
> [snip]
> Manually performing VACUUM / VACUUM ANALYZE on the (few) affected
> tables every 12h or so fixes the performance problem for the
> particular queries without impacting the other users too much ---
> the tables and indexes in question have been moved to a separate
> tablespace/disk volume of their own.
> Autovacuum can certainly run vacuum/analyze on a few tables every 12
> hours, so I'm not really following where you see autovacuum being unable
> to cope.  I agree that there *are* such cases, but getting more
> information about those cases and exactly what solution *does* work
> would really help us improve autovacuum to address those use-cases.

I'll try to. I don't have direct access, and the use case is quite edgy 
to be fair.
Plus, the configuration and hardware leaves quite a bit to be desired...

...but it's a real use case and the solution (even if only treating the 
symptoms) is quite straight-forward and easy.
>> In short, this addresses situations where some tables have a much
>> higher update rate than the rest of the database so that performance
>> degrades with time --- the application became unusable after about 6
>> days' worth of updates until the manual vacuums were setup
> This really looks like a configuration issue with autovacuum..  Perhaps
> you need to make it more aggressive than the default and have it run
> more threads?

Yes to both. Up to something which actually affected performance a bit.
But basically only a few tables exhibited this behaviour among several 
hundreds in this particular situation.

> Have you turned the autovacuum logging up all the way?
> Is autovacuum giving up due to locking?
Not one of my systems, and I don't have access to it anymore, but I 
don't think this was the reason.

However, having some hundred million deleted rows piling every few hours 
quite increases the load. For the record, the (closed-source) 
application did issue the DELETEs on the table, so partitioning + 
TRUNCATE child_part was not applicable.


In any case, I was aiming at making this kind of operations possible and 
easier --- regardless of whether they are solving the right problem or 
not, or whether there exists an optimal solution --- since I have seen 
some real life solutions that could benefit from it.
I agree that routine index maintenance is a better match for this 
feature, though :)
>>>>      REINDEX CONCURRENTLY  (but not the blocking, regular, one)
>>>>      REFRESH MATERIALIZED VIEW CONCURRENTLY (but not the blocking one)
>>> These are interesting, but would these make sense at the role level?
>>> Both of these commands explicitly take specific relations to operate
>>> against, after all.
>> Yup. Let's imagine a cron job invoking psql in order to perform
>> maintenance routine.
> If they make sense at a relation level then they should be
> relation-level GRANT'd permissions, not role-level attributes.

Same as before.
Let's imagine this coupled with REINDEX SCHEMA CONCURRENTLY ... or 
simply when constructing the list of tables dynamically and there is no 
other use for such a grant.
Arguably, this isn't that much of a problem if there exists a way to 
easily revoke all such permissions from all objects in one go (just like 
recently discussed in another thread)

>> The particular command(s) can be generated on-the-fly by querying
>> the catalog and then send them in one go to be run sequentially by
>> the one backend.... as a crude form of rate
>> limiting/quality-of-service of sorts ("renice -p" or even "ionice
>> -p" seems quite inadequate).
> This sounds like it's something that we might want an autovacuum-like
> background process to handle..  Some kind of auto-reindex-concurrently.
> There are already plans to deal with updating of materialized views, as
> I understand it.

While I can definitively see it for materialized views (they *are* 
views, after all), this pattern potentially gets us adding everything 
but the kitchen sink inside the database.
FWIW, it's only a matter of providing a mechanism for maintenance 
routines to use very unprivileged users to perform their duties on the 
whole cluster without having to explicitly grant permissions and/or 
include these into another, regular, role.
Please keep in mind that these  roles [having only LOGIN and 
MAINTENANCE] would NOT be able to perform any DML or DDL whatsoever, nor 
any queries (unless explicitly granted permission for SELECTs).

> [snip]
>> Yes. That's the reason for the question marks  :-\
>> Some "dump to csv then load somewhere else" kind of jobs might
>> benefit from this feature, but I'm not sure the convenience is worth
>> the risk.
> I've run into quite a few processes which would really benefit from
> this, and would even be safe to use (the processes running the COPY
> commands don't have any rights on the directories except through PG),
> but it's not clear if that use-case is sufficiently broad for the
> feature to be worthwhile..  At least, some feel it isn't.  Can you
> describe your use-case more and perhaps the needle will move on that
> point?

I don't have any concrete example for this right now, but a "psql -f 
query_file.sql | processing_script | load_into_somewhere" scheduled to 
run nightly (every few hours, even) is definitively quite common.
The COPY argument (which actually reads and outputs information) 
actually falsifies the above assumptions, I'm afraid. So this task would 
need explicit permissions anyway.

You seem to have better arguments in favor of this last feature, though.

> [snip]
> I suppose, but I think you're over-simplying your use-case.  Consider
> this- do you want this login role to be a member of superuser roles?
> What about DBA or other privilege-user roles?

As pointed out by David Johnston, impersonating superusers shall be 
explicitly forbidden.
The "noimpersonate" attribute (actually, we might need a better name to 
avoid the implication that it means "can not impersonate others" but 
"can not be impersonated") could easily block this loophole.

This feature is definitely orthogonal to the proposed "magical audit 
role" proposal, which I heartily agree with.
We might need to tweak logging a bit under this circumstances to reveal 
the login user vs the authorization user (quite like UID vs EUID for 
processess), though.



Since the changes to pg_authid seem to be about to be reverted, I'll 
wait a bit for a WIP patch --- this is definitively a lot more effort 
without the bitmask and helper funcs :(
There existed another suggestion ---in the form of text-based 
"capabilities" IIRC--- which might be used for this purpose, however.

Thanks,
    J.L.




Re: Proposal: two new role attributes and/or capabilities?

От
Robert Haas
Дата:
On Tue, Dec 23, 2014 at 11:20 AM, José Luis Tallón
<jltallon@adv-solutions.net> wrote:
>     I've found myself needing two role capabilities? as of lately, when
> thinking about restricting some roles to the barely minimum allowed
> permissions needed to perform their duties ... as opposed to having a
> superuser role devoted to these task.
>
>     The "capabilities" would be:
> * MAINTENANCE --- Ability to run
>     VACUUM [ANALYZE | FREEZE] (but not VACUUM FULL),
>     ANALYZE (including SET LOCAL statistics_target TO 10000),
>     REINDEX CONCURRENTLY  (but not the blocking, regular, one)
>     REFRESH MATERIALIZED VIEW CONCURRENTLY (but not the blocking one)
>     COPY ???
>
>     Rationale: delegate the routine maintenance tasks to a low privilege
> role, which can't do harm (apart from some performance degradation) ---
> hence the "no exclusive locking operations" requirement.

I think the problem here is that, while almost everybody would
probably agree that something like this is useful, three hackers in a
room will have four or five different opinions on how to set the
boundaries around it.  I for example wouldn't feel too bad about
grouping VACUUM and ANALYZE under the same umbrella, but certainly
would be surprised to see all of the other stuff included.  But you've
got a different idea that is clearly valid, and somebody else might
want yet another thing.  We can avoid those problems by making the
capabilities finer-grained, but of course then you end up with lots
and lots of them, which is annoying too.

> * IMPERSONATE --- Ability to do "SET AUTHORIZATION TO some_role;" and "RESET
> AUTHORIZATION"
>     This might be further refined to provide a way to say "This role is
> authorized to impersonate role1 but no other"

I can't see this providing any meaningful security improvement.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Proposal: two new role attributes and/or capabilities?

От
Jim Nasby
Дата:
On 12/23/14 12:52 PM, Stephen Frost wrote:
> * José Luis Tallón (jltallon@adv-solutions.net) wrote:
>> >On 12/23/2014 05:29 PM, Stephen Frost wrote:
>>>> > >>     The "capabilities" would be:
>>>> > >>* MAINTENANCE --- Ability to run
>>>> > >>     VACUUM [ANALYZE | FREEZE] (but not VACUUM FULL),
>>>> > >>     ANALYZE (including SET LOCAL statistics_target TO 10000),
>>> > >There's likely to be discussion about these from the perspective that
>>> > >you really shouldn't need to run them all that much.  Why isn't
>>> > >autovacuum able to handle this?
>> >
>> >For some (arguably, ill-devised) use cases of INSERT - SELECT
>> >aggregate - DELETE (third party, closed-source app, massive insert
>> >rate) at the very least, autovacuum can't possibly cope with the
>> >change rate in some tables, given that there are quite many other
>> >interactive queries running.
>> >
>> >Manually performing VACUUM / VACUUM ANALYZE on the (few) affected
>> >tables every 12h or so fixes the performance problem for the
>> >particular queries without impacting the other users too much ---
>> >the tables and indexes in question have been moved to a separate
>> >tablespace/disk volume of their own.
> Autovacuum can certainly run vacuum/analyze on a few tables every 12
> hours, so I'm not really following where you see autovacuum being unable
> to cope.  I agree that there*are*  such cases, but getting more
> information about those cases and exactly what solution*does*  work
> would really help us improve autovacuum to address those use-cases.

(going through some old email...)

The two cases I've dealt with recently are:

- Tables with a fair update/delete rate that should always stay small

The problem with these tables is if anything happens to upset vacuuming you can end up with a significantly larger than
expectedtable that's now essentially impossible to shrink. This could be caused by a single long-running transaction
thathappens to be in play when autovac kicks off, or for other reasons. Even once you manage to get all the tuples off
theend of the heap it can still be extremely difficult to grab the lock you need to truncate it. Running a vacuum every
minutefrom cron seems to help control this. Sadly, your indexes still get bloated, so occasionally you want to
re-clustertoo.
 

- Preemptively vacuuming during off-hours

Many sites have either nightly or weekend periods of reduced load. Such sites can gain a great benefit from scheduling
preemptivevacuums to reduce the odds of disruptive vacuuming activity during heavy activity periods. This is especially
truewhen it comes to a scan_all vacuum of a large table; having autovac do one of those at a peak period can really
hosethings.
 
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: Proposal: two new role attributes and/or capabilities?

От
Stephen Frost
Дата:
Jim,

* Jim Nasby (Jim.Nasby@BlueTreble.com) wrote:
> On 12/23/14 12:52 PM, Stephen Frost wrote:
> >Autovacuum can certainly run vacuum/analyze on a few tables every 12
> >hours, so I'm not really following where you see autovacuum being unable
> >to cope.  I agree that there*are*  such cases, but getting more
> >information about those cases and exactly what solution*does*  work
> >would really help us improve autovacuum to address those use-cases.
>
> (going through some old email...)
>
> The two cases I've dealt with recently are:
>
> - Tables with a fair update/delete rate that should always stay small
>
> The problem with these tables is if anything happens to upset vacuuming you can end up with a significantly larger
thanexpected table that's now essentially impossible to shrink. This could be caused by a single long-running
transactionthat happens to be in play when autovac kicks off, or for other reasons. Even once you manage to get all the
tuplesoff the end of the heap it can still be extremely difficult to grab the lock you need to truncate it. Running a
vacuumevery minute from cron seems to help control this. Sadly, your indexes still get bloated, so occasionally you
wantto re-cluster too. 

The difference between the autovacuum-run vacuum and the cron-run vacuum
is that the one running out of cron will just keep holding the lock
until it's actually able to truncate the end of the relation, no?  I
recall discussion previously that we need a way to either support that
in autovacuum for (a configurable set of) regular relations or come up
with a solution that doesn't require that lock.

> - Preemptively vacuuming during off-hours
>
> Many sites have either nightly or weekend periods of reduced load. Such sites can gain a great benefit from
schedulingpreemptive vacuums to reduce the odds of disruptive vacuuming activity during heavy activity periods. This is
especiallytrue when it comes to a scan_all vacuum of a large table; having autovac do one of those at a peak period can
reallyhose things. 

Having preferrable times for autovacuum to run vacuums would certainly
be nice to support this use-case.

All that said, I'm not against a role attribute which allows the user to
vacuum/analyze anything.  I do think that's a bit different from the
existing effort to reduce the activities which require superuser as with
the vacuum/analyze case you *could* have a single role that's a member
of every role that owns the relations which you want to vacuum/analyze.
I grant that it's a bit awkward though.
Thanks,
    Stephen

Re: Proposal: two new role attributes and/or capabilities?

От
Jim Nasby
Дата:
On 1/28/15 7:45 PM, Stephen Frost wrote:
> Jim,
>
> * Jim Nasby (Jim.Nasby@BlueTreble.com) wrote:
>> On 12/23/14 12:52 PM, Stephen Frost wrote:
>>> Autovacuum can certainly run vacuum/analyze on a few tables every 12
>>> hours, so I'm not really following where you see autovacuum being unable
>>> to cope.  I agree that there*are*  such cases, but getting more
>>> information about those cases and exactly what solution*does*  work
>>> would really help us improve autovacuum to address those use-cases.
>>
>> (going through some old email...)
>>
>> The two cases I've dealt with recently are:
>>
>> - Tables with a fair update/delete rate that should always stay small
>>
>> The problem with these tables is if anything happens to upset vacuuming you can end up with a significantly larger
thanexpected table that's now essentially impossible to shrink. This could be caused by a single long-running
transactionthat happens to be in play when autovac kicks off, or for other reasons. Even once you manage to get all the
tuplesoff the end of the heap it can still be extremely difficult to grab the lock you need to truncate it. Running a
vacuumevery minute from cron seems to help control this. Sadly, your indexes still get bloated, so occasionally you
wantto re-cluster too.
 
>
> The difference between the autovacuum-run vacuum and the cron-run vacuum
> is that the one running out of cron will just keep holding the lock
> until it's actually able to truncate the end of the relation, no?  I
> recall discussion previously that we need a way to either support that
> in autovacuum for (a configurable set of) regular relations or come up
> with a solution that doesn't require that lock.

AFAICT, in master, there is no difference in truncation between auto and manual vacuum. What we do is attempt to
acquirethe truncation lock for up to 5 seconds, giving up after that. Once we do have the lock, we check to see how
manypages we can actually truncate. During that check, we test every ~20ms or so to see if someone else is waiting on
ourexclusive lock; if they are we stop counting and will only truncate the relation up to that point.
 

So what this boils down to is that it's very hard to truncate a busy relation and your best bet of doing so is by
repeatedlytrying to.
 

>> - Preemptively vacuuming during off-hours
>>
>> Many sites have either nightly or weekend periods of reduced load. Such sites can gain a great benefit from
schedulingpreemptive vacuums to reduce the odds of disruptive vacuuming activity during heavy activity periods. This is
especiallytrue when it comes to a scan_all vacuum of a large table; having autovac do one of those at a peak period can
reallyhose things.
 
>
> Having preferrable times for autovacuum to run vacuums would certainly
> be nice to support this use-case.
>
> All that said, I'm not against a role attribute which allows the user to
> vacuum/analyze anything.  I do think that's a bit different from the
> existing effort to reduce the activities which require superuser as with
> the vacuum/analyze case you *could* have a single role that's a member
> of every role that owns the relations which you want to vacuum/analyze.
> I grant that it's a bit awkward though.

Yeah, I was mostly just providing some use cases. I'm not opposed to a separate vacuum/analyze permission, but don't
seea huge need for it either. Typically I set this stuff up as a cron on the server itself, utilizing an account that
doesident authentication. I figure if someone manages to compromise that then they probably have root on the box
anyway,which is obviously game over.
 
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: Proposal: two new role attributes and/or capabilities?

От
Robert Haas
Дата:
On Thu, Jan 29, 2015 at 4:09 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>> The difference between the autovacuum-run vacuum and the cron-run vacuum
>> is that the one running out of cron will just keep holding the lock
>> until it's actually able to truncate the end of the relation, no?  I
>> recall discussion previously that we need a way to either support that
>> in autovacuum for (a configurable set of) regular relations or come up
>> with a solution that doesn't require that lock.
>
> AFAICT, in master, there is no difference in truncation between auto and
> manual vacuum. What we do is attempt to acquire the truncation lock for up
> to 5 seconds, giving up after that. Once we do have the lock, we check to
> see how many pages we can actually truncate. During that check, we test
> every ~20ms or so to see if someone else is waiting on our exclusive lock;
> if they are we stop counting and will only truncate the relation up to that
> point.

I don't think this is true, and I don't think it's been true for a
long time, if ever.  The difference between a manual vacuum and
autovacuum is that autovacuum commits suicide when it conflicts with
somebody else's lock request, and a manual vacuum doesn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Proposal: two new role attributes and/or capabilities?

От
Jim Nasby
Дата:
On 1/29/15 4:02 PM, Robert Haas wrote:
> On Thu, Jan 29, 2015 at 4:09 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>>> The difference between the autovacuum-run vacuum and the cron-run vacuum
>>> is that the one running out of cron will just keep holding the lock
>>> until it's actually able to truncate the end of the relation, no?  I
>>> recall discussion previously that we need a way to either support that
>>> in autovacuum for (a configurable set of) regular relations or come up
>>> with a solution that doesn't require that lock.
>>
>> AFAICT, in master, there is no difference in truncation between auto and
>> manual vacuum. What we do is attempt to acquire the truncation lock for up
>> to 5 seconds, giving up after that. Once we do have the lock, we check to
>> see how many pages we can actually truncate. During that check, we test
>> every ~20ms or so to see if someone else is waiting on our exclusive lock;
>> if they are we stop counting and will only truncate the relation up to that
>> point.
>
> I don't think this is true, and I don't think it's been true for a
> long time, if ever.  The difference between a manual vacuum and
> autovacuum is that autovacuum commits suicide when it conflicts with
> somebody else's lock request, and a manual vacuum doesn't.

Any idea where we set that up? The call stack is (note I'm ignoring vacuum full) autovacuum_do_vac_analyze() ->
vacuum()-> vacuum_rel() -> lazy_vacuum_rel() -> lazy_truncate_heap() (which also calls count_nondeletable_pages()), and
Idon't see any IsAutoVacuumWorkerProcess() calls in lazy_truncate_heap or count_nondeletable_pages(). So AFAICT
truncationoperates the same regardless of how the vacuum was started. We also explicitly set things like
statement_timeoutto 0 in autovac, so I don't think that would be responsible for this...
 
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: Proposal: two new role attributes and/or capabilities?

От
Tom Lane
Дата:
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
> On 1/29/15 4:02 PM, Robert Haas wrote:
>> I don't think this is true, and I don't think it's been true for a
>> long time, if ever.  The difference between a manual vacuum and
>> autovacuum is that autovacuum commits suicide when it conflicts with
>> somebody else's lock request, and a manual vacuum doesn't.

> Any idea where we set that up?

The relevant logic is in the deadlock detector, which will cancel an
autovacuum transaction if it is blocking somebody else's lock request.
        regards, tom lane