Обсуждение: Broken lock management in policy.c.

Поиск
Список
Период
Сортировка

Broken lock management in policy.c.

От
Tom Lane
Дата:
CREATE POLICY takes AccessExclusiveLock on the table a policy is being
added to -- good -- and then releases it when done -- bad.  Correct
behavior is to hold the lock till commit, because otherwise there is
no guarantee that other backends will see the updated catalog rows in
time, or indeed at all.

The same goes for ALTER POLICY, and possibly DROP POLICY (I've not
found the implementation of that ...)

If we fix this, I believe we could also remove the weasel wording that was
added to create_policy.sgml in commit 43cd468cf01007f3 about how the
system might transiently fail to enforce row security correctly.
        regards, tom lane



Re: Broken lock management in policy.c.

От
Stephen Frost
Дата:
Tom,

On Sunday, January 3, 2016, Tom Lane <tgl@sss.pgh.pa.us> wrote:
CREATE POLICY takes AccessExclusiveLock on the table a policy is being
added to -- good -- and then releases it when done -- bad.  Correct
behavior is to hold the lock till commit, because otherwise there is
no guarantee that other backends will see the updated catalog rows in
time, or indeed at all.

Agreed.  
 
The same goes for ALTER POLICY, and possibly DROP POLICY (I've not
found the implementation of that ...)

DROP POLICY is handled through the generalized drop command and likely doesn't have a problem due to that. 
 
If we fix this, I believe we could also remove the weasel wording that was
added to create_policy.sgml in commit 43cd468cf01007f3 about how the
system might transiently fail to enforce row security correctly.

The issue addressed there is with row locking, not table level locks. The concern is that a user could acquire a lock on a row to which their access to is then removed due to changes in rows which are used by the policy on the table- not any changes to the policy definition itself. The row that is locked might then be updated by the user who removed access to the row and the results of that update seen by the original user via RETURNING.

Peter provided a test case which illustrated the concern. 


Apologies if the above isn't entirely accurate, on my phone at the moment. 

Thanks!

Stephen

Re: Broken lock management in policy.c.

От
Peter Geoghegan
Дата:
On Sun, Jan 3, 2016 at 4:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> CREATE POLICY takes AccessExclusiveLock on the table a policy is being
> added to -- good -- and then releases it when done -- bad.  Correct
> behavior is to hold the lock till commit, because otherwise there is
> no guarantee that other backends will see the updated catalog rows in
> time, or indeed at all.
>
> The same goes for ALTER POLICY, and possibly DROP POLICY (I've not
> found the implementation of that ...)

Right.

> If we fix this, I believe we could also remove the weasel wording that was
> added to create_policy.sgml in commit 43cd468cf01007f3 about how the
> system might transiently fail to enforce row security correctly.

IIUC, then what you say here isn't true, because that issue was about
a transient failure without the involvement of *any* DDL from start to
finish. CREATE POLICY accepts subqueries referencing other relations
in its USING quals. This seems to be idiomatic usage of CREATE POLICY,
in fact.

See my original isolation tester test case, where only the setup involves DDL:

http://www.postgresql.org/message-id/attachment/38467/0001-RLS-isolation-test.patch

-- 
Peter Geoghegan



Re: Broken lock management in policy.c.

От
Stephen Frost
Дата:
Tom, Peter,

On Sunday, January 3, 2016, Peter Geoghegan <pg@heroku.com> wrote:
On Sun, Jan 3, 2016 at 4:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> CREATE POLICY takes AccessExclusiveLock on the table a policy is being
> added to -- good -- and then releases it when done -- bad.  Correct
> behavior is to hold the lock till commit, because otherwise there is
> no guarantee that other backends will see the updated catalog rows in
> time, or indeed at all.
>
> The same goes for ALTER POLICY, and possibly DROP POLICY (I've not
> found the implementation of that ...)

Right.

> If we fix this, I believe we could also remove the weasel wording that was
> added to create_policy.sgml in commit 43cd468cf01007f3 about how the
> system might transiently fail to enforce row security correctly.

IIUC, then what you say here isn't true, because that issue was about
a transient failure without the involvement of *any* DDL from start to
finish. CREATE POLICY accepts subqueries referencing other relations
in its USING quals. This seems to be idiomatic usage of CREATE POLICY,
in fact.

See my original isolation tester test case, where only the setup involves DDL:

http://www.postgresql.org/message-id/attachment/38467/0001-RLS-isolation-test.patch

Further, as mentioned in the discussion of that issue- it also can apply to updatable security barrier views. It's not specific to RLS. 

Thanks,

Stephen 

Re: Broken lock management in policy.c.

От
Tom Lane
Дата:
Peter Geoghegan <pg@heroku.com> writes:
> On Sun, Jan 3, 2016 at 4:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If we fix this, I believe we could also remove the weasel wording that was
>> added to create_policy.sgml in commit 43cd468cf01007f3 about how the
>> system might transiently fail to enforce row security correctly.

> IIUC, then what you say here isn't true, because that issue was about
> a transient failure without the involvement of *any* DDL from start to
> finish. CREATE POLICY accepts subqueries referencing other relations
> in its USING quals. This seems to be idiomatic usage of CREATE POLICY,
> in fact.

> See my original isolation tester test case, where only the setup involves DDL:

> http://www.postgresql.org/message-id/attachment/38467/0001-RLS-isolation-test.patch

Hmm.  I agree that this test case's behavior does not depend on CREATE
POLICY's lock mismanagement.  I think what is going on here is that the
RLS quals are being checked with an older snapshot than what controls
the output of the UPDATE RETURNING.  Even the EPQ recheck that's done
after getting update lock on the "information" row doesn't fix it,
because we *don't* insist on taking an update lock on the "users" table,
so we don't see the new value of that row.

If that diagnosis is correct, you could fix it by changing the RLS
policies' sub-selects to use SELECT FOR UPDATE, though the loss of
concurrency might well be unacceptable.

In any case, the text in create_policy.sgml seems to be a misleading
description of the problem, as it's talking about DDL modifications
which are *not* what's happening here.
        regards, tom lane



Re: Broken lock management in policy.c.

От
Stephen Frost
Дата:
Tom,

On Sunday, January 3, 2016, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Peter Geoghegan <pg@heroku.com> writes:
> On Sun, Jan 3, 2016 at 4:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If we fix this, I believe we could also remove the weasel wording that was
>> added to create_policy.sgml in commit 43cd468cf01007f3 about how the
>> system might transiently fail to enforce row security correctly.

> IIUC, then what you say here isn't true, because that issue was about
> a transient failure without the involvement of *any* DDL from start to
> finish. CREATE POLICY accepts subqueries referencing other relations
> in its USING quals. This seems to be idiomatic usage of CREATE POLICY,
> in fact.

> See my original isolation tester test case, where only the setup involves DDL:

> http://www.postgresql.org/message-id/attachment/38467/0001-RLS-isolation-test.patch

Hmm.  I agree that this test case's behavior does not depend on CREATE
POLICY's lock mismanagement.  I think what is going on here is that the
RLS quals are being checked with an older snapshot than what controls
the output of the UPDATE RETURNING.  Even the EPQ recheck that's done
after getting update lock on the "information" row doesn't fix it,
because we *don't* insist on taking an update lock on the "users" table,
so we don't see the new value of that row.

If that diagnosis is correct, you could fix it by changing the RLS
policies' sub-selects to use SELECT FOR UPDATE, though the loss of
concurrency might well be unacceptable.

In any case, the text in create_policy.sgml seems to be a misleading
description of the problem, as it's talking about DDL modifications
which are *not* what's happening here.                

There was some debate about the right place for that discussion as there didn't seem to be any particularly ideal location. I had been intending to have a locking section in the RLS section rework. 

Thanks,

Stephen

Re: Broken lock management in policy.c.

От
Tom Lane
Дата:
Stephen Frost <sfrost@snowman.net> writes:
> On Sunday, January 3, 2016, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> CREATE POLICY takes AccessExclusiveLock on the table a policy is being
>> added to -- good -- and then releases it when done -- bad.  Correct
>> behavior is to hold the lock till commit, because otherwise there is
>> no guarantee that other backends will see the updated catalog rows in
>> time, or indeed at all.

> Agreed.

On closer inspection, I'd misidentified the functions containing the
bad code --- it was really RemovePolicyById and RemoveRoleFromObjectPolicy
that were wrong.  Fix pushed.
        regards, tom lane



Re: Broken lock management in policy.c.

От
Peter Geoghegan
Дата:
On Sun, Jan 3, 2016 at 5:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmm.  I agree that this test case's behavior does not depend on CREATE
> POLICY's lock mismanagement.  I think what is going on here is that the
> RLS quals are being checked with an older snapshot than what controls
> the output of the UPDATE RETURNING.  Even the EPQ recheck that's done
> after getting update lock on the "information" row doesn't fix it,
> because we *don't* insist on taking an update lock on the "users" table,
> so we don't see the new value of that row.

That's clearly what's going on in the example scenario: the EPQ
recheck walks the update chain, and uses a combination of a dirty
snapshot (for the target's scan tuple), and the MVCC snapshot (for
everything else). The scenario involves an attacker exploiting the
inconsistency, where the admin did not anticipate such an
inconsistency.

This seemed reasonably plausible to me, not least because READ
COMMITTED conflict handling is a mystery to the vast majority of
users.

> If that diagnosis is correct, you could fix it by changing the RLS
> policies' sub-selects to use SELECT FOR UPDATE, though the loss of
> concurrency might well be unacceptable.

That was one of the things that we talked about doing. Of course, that
would work because we walk the UPDATE chain to do EPQ recheck for
SELECT FOR UPDATE, just as with an UPDATE or a DELETE.

> In any case, the text in create_policy.sgml seems to be a misleading
> description of the problem, as it's talking about DDL modifications
> which are *not* what's happening here.

I'm not sure what you mean. The CREATE POLICY changes in commit
43cd468cf01007f3 specifically call out the issue illustrated in my
example test case. There are some other changes made in that commit,
but they don't seem to be attempting to address this specific problem.
They also seem fine.

-- 
Peter Geoghegan



Re: Broken lock management in policy.c.

От
Stephen Frost
Дата:
Peter,

On Sunday, January 3, 2016, Peter Geoghegan <pg@heroku.com> wrote:
On Sun, Jan 3, 2016 at 5:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> In any case, the text in create_policy.sgml seems to be a misleading
> description of the problem, as it's talking about DDL modifications
> which are *not* what's happening here.

I'm not sure what you mean. The CREATE POLICY changes in commit
43cd468cf01007f3 specifically call out the issue illustrated in my
example test case. There are some other changes made in that commit,
but they don't seem to be attempting to address this specific problem.
They also seem fine.

I believe Tom's complaint was that the overall page is about CREATE POLICY, technically, and that the text in attempting to address the concern might be taken under the context of being a CREATE POLICY issue rather than a general RLS issue with row locking. 

Thanks!

Stephen 

Re: Broken lock management in policy.c.

От
Tom Lane
Дата:
Peter Geoghegan <pg@heroku.com> writes:
> On Sun, Jan 3, 2016 at 5:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> In any case, the text in create_policy.sgml seems to be a misleading
>> description of the problem, as it's talking about DDL modifications
>> which are *not* what's happening here.

> I'm not sure what you mean. The CREATE POLICY changes in commit
> 43cd468cf01007f3 specifically call out the issue illustrated in my
> example test case. There are some other changes made in that commit,
> but they don't seem to be attempting to address this specific problem.
> They also seem fine.

I'm going to state it as an incontrovertible fact that that paragraph
is so vague as to be useless, because I sure misunderstood it, and in
fact just copy-edited it to talk about changes to RLS policies.  I now
see that it did say "relations referenced by RLS policies", but that
distinction is going to escape just about everybody who does not already
know what effects are being obliquely referred to.

I think we should get rid of it altogether (since I also agree with the
upthread comment that it's in the wrong place) and instead put an example
into section 5.7 saying directly that sub-selects, or in general
references to rows other than the one under test, are risky in RLS
policies.  We could also show the FOR UPDATE workaround there.
        regards, tom lane



Re: Broken lock management in policy.c.

От
Peter Geoghegan
Дата:
On Sun, Jan 3, 2016 at 6:00 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> I'm not sure what you mean. The CREATE POLICY changes in commit
>> 43cd468cf01007f3 specifically call out the issue illustrated in my
>> example test case. There are some other changes made in that commit,
>> but they don't seem to be attempting to address this specific problem.
>> They also seem fine.
>
>
> I believe Tom's complaint was that the overall page is about CREATE POLICY,
> technically, and that the text in attempting to address the concern might be
> taken under the context of being a CREATE POLICY issue rather than a general
> RLS issue with row locking.

Really? But the problem happens as a consequence of having a
subqueries within CREATE POLICY's USING quals (as well as a
misunderstanding made by the admin about just what is possible).

-- 
Peter Geoghegan



Re: Broken lock management in policy.c.

От
Peter Geoghegan
Дата:
On Sun, Jan 3, 2016 at 6:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm going to state it as an incontrovertible fact that that paragraph
> is so vague as to be useless, because I sure misunderstood it, and in
> fact just copy-edited it to talk about changes to RLS policies.  I now
> see that it did say "relations referenced by RLS policies", but that
> distinction is going to escape just about everybody who does not already
> know what effects are being obliquely referred to.

That's fair.

> I think we should get rid of it altogether (since I also agree with the
> upthread comment that it's in the wrong place) and instead put an example
> into section 5.7 saying directly that sub-selects, or in general
> references to rows other than the one under test, are risky in RLS
> policies.  We could also show the FOR UPDATE workaround there.

I agree that there should be a worked out example. After all, EPQ is a
behavior that is peculiar to Postgres, and the problem hinges on EPQ
being how READ COMMITTED conflicts are handled. An equivalent RLS
feature in any other database system (including other MVCC systems)
would naturally not have this problem, for reasons peculiar to the
other systems.

-- 
Peter Geoghegan



Re: Broken lock management in policy.c.

От
Tom Lane
Дата:
Peter Geoghegan <pg@heroku.com> writes:
> On Sun, Jan 3, 2016 at 6:00 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> I believe Tom's complaint was that the overall page is about CREATE POLICY,
>> technically, and that the text in attempting to address the concern might be
>> taken under the context of being a CREATE POLICY issue rather than a general
>> RLS issue with row locking.

> Really? But the problem happens as a consequence of having a
> subqueries within CREATE POLICY's USING quals

If that's what we're talking about, let's say it in precisely that many
words.  With an example.  The current text is 100% useless.
        regards, tom lane



Re: Broken lock management in policy.c.

От
Peter Geoghegan
Дата:
On Sun, Jan 3, 2016 at 6:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Really? But the problem happens as a consequence of having a
>> subqueries within CREATE POLICY's USING quals
>
> If that's what we're talking about, let's say it in precisely that many
> words.  With an example.  The current text is 100% useless.

I agree that the text was unclear, and that that should be fixed,
because it's too complicated to expect anyone to understand this
without an example (indeed, that's why I used isolationtester to
explain the issue). My confusion was only about whether it was
understood that Stephen had fulfilled my request to document this
behavior.

-- 
Peter Geoghegan



Re: Broken lock management in policy.c.

От
Peter Geoghegan
Дата:
On Sun, Jan 3, 2016 at 6:09 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> I think we should get rid of it altogether (since I also agree with the
>> upthread comment that it's in the wrong place) and instead put an example
>> into section 5.7 saying directly that sub-selects, or in general
>> references to rows other than the one under test, are risky in RLS
>> policies.  We could also show the FOR UPDATE workaround there.
>
> I agree that there should be a worked out example.

I should remind you that SELECT FOR UPDATE requires conventional
UPDATE privilege (at least on columns appearing in the SELECT list).
So, among SELECT commands, SELECT FOR UPDATE is special in that it
requires UPDATE privilege. This is not true of equivalent RLS
policies, though.

So, as part of documenting the SELECT FOR UPDATE workaround, you're
going to have to advise admins that they need to have (lesser
privileged) user accounts with conventional UPDATE privilege on a
(USING qual referenced) table that they're most certainly not supposed
to be able to UPDATE at all (since they can totally undermine RLS with
such an UPDATE). RLS can make it impossible to actually proceed with
such an UPDATE, which makes the SELECT FOR UPDATE workaround possible,
but it's all pretty messy.

-- 
Peter Geoghegan



Re: Broken lock management in policy.c.

От
Stephen Frost
Дата:
Peter,

On Sunday, January 3, 2016, Peter Geoghegan <pg@heroku.com> wrote:
On Sun, Jan 3, 2016 at 6:09 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> I think we should get rid of it altogether (since I also agree with the
>> upthread comment that it's in the wrong place) and instead put an example
>> into section 5.7 saying directly that sub-selects, or in general
>> references to rows other than the one under test, are risky in RLS
>> policies.  We could also show the FOR UPDATE workaround there.
>
> I agree that there should be a worked out example.

I should remind you that SELECT FOR UPDATE requires conventional
UPDATE privilege (at least on columns appearing in the SELECT list).
So, among SELECT commands, SELECT FOR UPDATE is special in that it
requires UPDATE privilege. This is not true of equivalent RLS
policies, though.

So, as part of documenting the SELECT FOR UPDATE workaround, you're
going to have to advise admins that they need to have (lesser
privileged) user accounts with conventional UPDATE privilege on a
(USING qual referenced) table that they're most certainly not supposed
to be able to UPDATE at all (since they can totally undermine RLS with
such an UPDATE). RLS can make it impossible to actually proceed with
such an UPDATE, which makes the SELECT FOR UPDATE workaround possible,
but it's all pretty messy.

A security defined function could be used to address that, of course. That could have performance implications, naturally. 

Thanks,

Stephen

Re: Broken lock management in policy.c.

От
Peter Geoghegan
Дата:
On Sun, Jan 3, 2016 at 6:41 PM, Stephen Frost <sfrost@snowman.net> wrote:
> A security defined function could be used to address that, of course. That
> could have performance implications, naturally.

True.

I would also advise only referencing a single relation within the
SELECT FOR UPDATE.

-- 
Peter Geoghegan



Re: Broken lock management in policy.c.

От
Peter Geoghegan
Дата:
On Sun, Jan 3, 2016 at 7:01 PM, Peter Geoghegan <pg@heroku.com> wrote:
> I would also advise only referencing a single relation within the
> SELECT FOR UPDATE.

To state what may be obvious: We should recommend that SELECT FOR
SHARE appear in the CREATE POLICY USING qual as part of this
workaround (not SELECT FOR UPDATE), because there is no need for
anything stronger than that. We only need to prevent the admin
updating a referenced-in-using-qual tuple in a way that allows a
malicious user to exploit an inconsistency in tuple visibility during
EPQ rechec. (Using SELECT FOR KEY SHARE would not reliably workaround
the underlying issue, though.)

-- 
Peter Geoghegan



Re: Broken lock management in policy.c.

От
Tom Lane
Дата:
[ getting back to this now that there's a little time ]

Peter Geoghegan <pg@heroku.com> writes:
> On Sun, Jan 3, 2016 at 7:01 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> I would also advise only referencing a single relation within the
>> SELECT FOR UPDATE.

> To state what may be obvious: We should recommend that SELECT FOR
> SHARE appear in the CREATE POLICY USING qual as part of this
> workaround (not SELECT FOR UPDATE), because there is no need for
> anything stronger than that. We only need to prevent the admin
> updating a referenced-in-using-qual tuple in a way that allows a
> malicious user to exploit an inconsistency in tuple visibility during
> EPQ rechec. (Using SELECT FOR KEY SHARE would not reliably workaround
> the underlying issue, though.)

Right, SELECT FOR SHARE would be sufficient and would reduce the
concurrency penalty a bit.

It might be possible to use SELECT FOR KEY SHARE if you knew that
the column you needed to check was a unique-key column, but that
seems unlikely to be common, so I think we can omit the point from
our example.

I'll go draft something up ...
        regards, tom lane



Re: Broken lock management in policy.c.

От
Tom Lane
Дата:
I wrote:
> I'll go draft something up ...

I've pushed an example based on your original test case.  Feel free
to suggest improvements, although at this point they'll probably
land in 9.5.1.
        regards, tom lane



Re: Broken lock management in policy.c.

От
Peter Geoghegan
Дата:
On Mon, Jan 4, 2016 at 12:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I've pushed an example based on your original test case.  Feel free
> to suggest improvements, although at this point they'll probably
> land in 9.5.1.

I think that that's a vast improvement. I probably should have pushed
for a worked out example myself, but there are only so many hours in
the day.

I've reviewed your changes, and have nothing further to add.

Thanks
-- 
Peter Geoghegan



Re: Broken lock management in policy.c.

От
Stephen Frost
Дата:
* Peter Geoghegan (pg@heroku.com) wrote:
> On Mon, Jan 4, 2016 at 12:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I've pushed an example based on your original test case.  Feel free
> > to suggest improvements, although at this point they'll probably
> > land in 9.5.1.
>
> I think that that's a vast improvement. I probably should have pushed
> for a worked out example myself, but there are only so many hours in
> the day.

Agreed on it being a vast improvement and that there are only so many
hours in the day.

> I've reviewed your changes, and have nothing further to add.

Was just doing that myself and I agree that it looks good.

Thanks!

Stephen