Re: [RFC] Interface of Row Level Security

Поиск
Список
Период
Сортировка
От Florian Pflug
Тема Re: [RFC] Interface of Row Level Security
Дата
Msg-id 2345DB68-3046-4EC4-B228-E9EE2408FCD8@phlo.org
обсуждение исходный текст
Ответ на Re: [RFC] Interface of Row Level Security  (Kohei KaiGai <kaigai@kaigai.gr.jp>)
Ответы Re: [RFC] Interface of Row Level Security  (Kohei KaiGai <kaigai@kaigai.gr.jp>)
Список pgsql-hackers
On Jun5, 2012, at 22:33 , Kohei KaiGai wrote:
> 2012/6/5 Florian Pflug <fgp@phlo.org>:
>> I can live with any behaviour, as long as it doesn't depends on details
>> of the query plan. My vote would be for always using the role which was
>> active at statement creation time (i.e. at PREPARE/DECLARE time) for
>> RLS purposes though. That way, we could treat the role as being IMMUTABLE
>> I think.

> I have same opinion at the point where the "active role" should be consistent,
> but a bit different in details. I think, it should be a timing at
> beginning of execution,
> not plan creation time, like as existing permission checks are applied
> on InitPlan()
> or its callee.

From a consistency POV, I'd agree. But binding the active role after
planning means giving up a lot of query optimization opportunities. It
will prevent the role from participating in constant folding, which
translates to potentially enormous loss of performance. Take the following
policy function as an example
 (role = 'some_role' AND expensive_function(row)) OR (role = 'other_role' AND row.field = 'some_value')

and assume that there's an index on 'field'. Now, if role is known to be
'other_row', the planner can eliminate the expensive_function() from the
plan, and satisfy row.field = 'some_value' with an index scan. Without
that knowledge, you'll probably get a sequential scan.

I do value consistency, but in this case the benefit of not being
consistent with other privilege checks outweighs the drawbacks I think.

>>> It seems to me, caution of the problem is current_user is performing out
>>> of the snapshot controls. According to the definition of stable function,
>>> its result should not be changed during a particular scan.
>>
>> Yeah, well, ISTM that our definition of STABLE doesn't really take functions
>> whose return value depends on GUCs into account.

> Probably, has_XXX_privilege() should work with reliable data source being
> safe from user-id switching. But it is a tough work to enhance whole of the
> GUC stuff for snapshot aware…

We don't need that for every GUC, though. I'd say its sufficient to somehow
provide the policy function with a stable value of the active role (whatever
"active" ends up meaning). If we do that, we can simply document that making
the policy function depend on other GUCs is a seriously bad idea.

We *do* need that stable "active role" value though, since we cannot very
well advise against using current_role in the policy function without providing
an alternativ…

>>> At least, it is not a fundamental problem of RLS implementation, although
>>> it needs to take an enhancement something like effective user-id per
>>> snapshot basis.
>>
>> Right. We need something like "statement_role" which returns the role the
>> statement is to be executed as (However we define that, though as I said
>> above my vote is for it being the role which created the statement). To make
>> that work, however, the plan cache needs to store that role, since the
>> statement might be re-planned after its initial creation, but before it is
>> executed. At which point the amount of complexity saved by not implementing
>> RLSBYPASS is very nearly zero.

> I'm not sure why the plan cache needs to be stored with a particular user-id.
> As long as we check permissions at executor stage, all we need to do is
> implementation of a function to check permission based on a particular
> user-id at the timing of executor initialization.

Hm, yeah, if you defer everything to execution time, that's true.

>> To reiterate, my point is that we won't get away with zero planner changes
>> even without RLSBYPASS. In fact, we'll be paying *most* of the complexity
>> costs of RLSBYPASS whether we actually add that feature or not. Which makes
>> not adding it look like a pretty bad deal to me.

> It seems to me, 95% of our opinion is common, except for a few detailed stuff.
> I never dislike the idea of RLSBYPASS permission, but I'd like to investigate
> this idea with more time, until v9.3 being closed.
> At least, nobody opposed to allow superusers to bypass RLS policy.
> So, it can be a minimum startup point, isn't it? RLS patch will takes hundreds
> of lines due to syntax support or query rewriting stuff. It is always good idea
> to focus on the core functionality at the starting point.

Absolutely. But at the same time, it's important to check that the design allows
the additional features to be added later easily. In the case of RLS, I'm
worried that decreeing that it's the role at execution time, not planning
time, that counts, we're painting ourselves into a corner. I view RLSBYPASS
as a good sanity check for that. Another one is that GRANT-with-filter-function
idea. Both seem to fall into place quite naturally if handled while planning -
you simply have to add decide which additional WHERE clause to add, if any.
Without any negative performance effects from policies which don't apply to
the current role, if the optimizer does it's job. Getting the same level of
performance when policies are added unconditionally seems quite impossible.
You might be able to make up for some losses by caching STABLE values (which
amounts to another constant folding pass), but you won't ever be able to make up
for a lost index scan opportunity or other optimizations which change the structure
of the whole query plan.

best regards,
Florian Pflug



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Re: [COMMITTERS] pgsql: Send new protocol keepalive messages to standby servers.
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: Re: [COMMITTERS] pgsql: Send new protocol keepalive messages to standby servers.