Обсуждение: GIST/GIN index not used with Row Level Security
When using row level security, GIN and GIST indexes appear to get ignored. Is this expected behavior? Can I change the query to get PostgreSQL using the index? For example, with RLS enabled, this query:
select * from search where search like '%yo'
Creates this query plan:
"Seq Scan on search  (cost=0.00..245.46 rows=1 width=163)"
" Filter: (((tenant_name)::name = CURRENT_USER) AND (search ~~ '%yo'::text))"
" Filter: (((tenant_name)::name = CURRENT_USER) AND (search ~~ '%yo'::text))"
Running this same query with the owner of the table, thereby disabling RLS, the index gets used as expected: 
"Bitmap Heap Scan on search  (cost=4.49..96.33 rows=44 width=163)"
" Recheck Cond: (search ~~ '%yo'::text)"
" -> Bitmap Index Scan on search__gist (cost=0.00..4.48 rows=44 width=0)"
" Index Cond: (search ~~ '%yo'::text)"
" Recheck Cond: (search ~~ '%yo'::text)"
" -> Bitmap Index Scan on search__gist (cost=0.00..4.48 rows=44 width=0)"
" Index Cond: (search ~~ '%yo'::text)"
I see the same behavior with more complex queries, switching to GIN index, more complex RLS rules, using word_similarity instead of like, using full text search and larger data sets (e.g. 100k rows). This is on PostgreSQL v11.1 on Windows 10. 
-- 
Derek
Derek Hans <derek.hans@gmail.com> writes:
> When using row level security, GIN and GIST indexes appear to get ignored.
> Is this expected behavior? Can I change the query to get PostgreSQL using
> the index? For example, with RLS enabled, this query:
Your example is obscuring the issue by incorporating a tenant_name
condition (where did that come from, anyway?) in one case and not
the other.  Without knowing how selective that is, it's hard to
compare the EXPLAIN results.
However, wild-guess time: it might be that without access to the
table statistics, the "search like '%yo'" condition is estimated
to be too unselective to make an indexscan profitable.  And putting
RLS in the way would disable that access if the ~~ operator is not
marked leakproof, which it isn't.
I'm not sure that you should get too excited about this, however.
You're evidently testing on a toy-size table, else the seqscan
cost estimate would be a lot higher.  With a table large enough
to make it really important to guess right, even the default
selectivity estimate might be enough to get an indexscan.
            regards, tom lane
			
		On 8/13/19 11:57 AM, Derek Hans wrote: > When using row level security, GIN and GIST indexes appear to get > ignored. Is this expected behavior? Can I change the query to get > PostgreSQL using the index? For example, with RLS enabled, this query: What are the RLS policies on the table? What is the definition of the GIN index? Best guess is the RLS is preventing access to the field needed by the index. > > select * from search where search like '%yo' > > Creates this query plan: > "Seq Scan on search (cost=0.00..245.46 rows=1 width=163)" > " Filter: (((tenant_name)::name = CURRENT_USER) AND (search ~~ > '%yo'::text))" > > Running this same query with the owner of the table, thereby disabling > RLS, the index gets used as expected: > "Bitmap Heap Scan on search (cost=4.49..96.33 rows=44 width=163)" > " Recheck Cond: (search ~~ '%yo'::text)" > " -> Bitmap Index Scan on search__gist (cost=0.00..4.48 rows=44 width=0)" > " Index Cond: (search ~~ '%yo'::text)" > > I see the same behavior with more complex queries, switching to GIN > index, more complex RLS rules, using word_similarity instead of like, > using full text search and larger data sets (e.g. 100k rows). This is on > PostgreSQL v11.1 on Windows 10. > > -- > *Derek* > +1 (415) 754-0519 |derek.hans@gmail.com <mailto:derek.hans@gmail.com> | > Skype: derek.hans -- Adrian Klaver adrian.klaver@aklaver.com
What are the RLS policies on the table?
From select * from pg_policies:
"((tenant_name)::name = CURRENT_USER)"
What is the definition of the GIN index?
CREATE INDEX search__gist
ON public.search USING gist
(search COLLATE pg_catalog."default" gist_trgm_ops)
TABLESPACE pg_default;
ON public.search USING gist
(search COLLATE pg_catalog."default" gist_trgm_ops)
TABLESPACE pg_default;
Best guess is the RLS is preventing access to the field needed by the index.
I didn't realize RLS can limit access to a specific field/index - my understanding was that it only affects what rows get returned/can be update/inserted.
>
> select * from search where search like '%yo'
>
> Creates this query plan:
> "Seq Scan on search (cost=0.00..245.46 rows=1 width=163)"
> " Filter: (((tenant_name)::name = CURRENT_USER) AND (search ~~
> '%yo'::text))"
>
> Running this same query with the owner of the table, thereby disabling
> RLS, the index gets used as expected:
> "Bitmap Heap Scan on search (cost=4.49..96.33 rows=44 width=163)"
> " Recheck Cond: (search ~~ '%yo'::text)"
> " -> Bitmap Index Scan on search__gist (cost=0.00..4.48 rows=44 width=0)"
> " Index Cond: (search ~~ '%yo'::text)"
>
> I see the same behavior with more complex queries, switching to GIN
> index, more complex RLS rules, using word_similarity instead of like,
> using full text search and larger data sets (e.g. 100k rows). This is on
> PostgreSQL v11.1 on Windows 10.
>
> --
> *Derek*
> +1 (415) 754-0519 |derek.hans@gmail.com <mailto:derek.hans@gmail.com> |
> Skype: derek.hans
--
Adrian Klaver
adrian.klaver@aklaver.com
Derek
Your example is obscuring the issue by incorporating a tenant_name
condition (where did that come from, anyway?) in one case and not
the other. Without knowing how selective that is, it's hard to
compare the EXPLAIN results.
That's RLS kicking in - RLS condition is defined as 
((tenant_name)::name = CURRENT_USER) 
However, wild-guess time: it might be that without access to the
table statistics, the "search like '%yo'" condition is estimated
to be too unselective to make an indexscan profitable. And putting
RLS in the way would disable that access if the ~~ operator is not
marked leakproof, which it isn't.
I didn't realize you could set access to table statistics. How do I enable this access for this user? If that's not possible, it sounds like it effectively blocks the use of GIN/GIST indexes when RLS is in use.
I'm not sure that you should get too excited about this, however.
You're evidently testing on a toy-size table, else the seqscan
cost estimate would be a lot higher. With a table large enough
to make it really important to guess right, even the default
selectivity estimate might be enough to get an indexscan.
I've tried this with larger data sets, with the same results. I discovered this problem because the select was taking 10-30 seconds instead of the expected sub-second, when using larger data sets and more fields getting searched. The example is the simplest repro case I could create. 
regards, tom lane
Derek
Derek Hans <derek.hans@gmail.com> writes:
>> However, wild-guess time: it might be that without access to the
>> table statistics, the "search like '%yo'" condition is estimated
>> to be too unselective to make an indexscan profitable.  And putting
>> RLS in the way would disable that access if the ~~ operator is not
>> marked leakproof, which it isn't.
> I didn't realize you could set access to table statistics. How do I enable
> this access for this user?
You don't --- there's a hard-wired restriction against applying leaky
operators to the table statistics, because they might leak values that
the current user isn't supposed to be able to see.  Overriding that
would pretty much destroy the value of RLS.
It's possible you could get away with marking ~~ (textlike) as leakproof.
The default marking is based on the assumption that textlike might throw
errors based on its RHS input not being a valid pattern ... but I think
the only such error is "LIKE pattern must not end with escape character",
and maybe that's not a big enough leak to concern you.
> If that's not possible, it sounds like it
> effectively blocks the use of GIN/GIST indexes when RLS is in use.
There's a whole lot of daylight between "it doesn't pick an indexscan in
this one example" and "it effectively blocks the use of GIN/GIST".
            regards, tom lane
			
		Thanks for the pointer for marking functions as leakproof, I was unaware of that whole concept. 
Unfortunately only "alter function" supports "leakproof" - "alter operator" does not. Is there a function-equivalent for marking operators as leakproof? Is there any documentation for which operators/functions are leakproof?
In my particular case, RLS is still useful even if operators are leaky as I control the application code and therefore can ensure leaky errors are handled. If it's possible to disable all checking for "leakproof", that would work for me. 
> If that's not possible, it sounds like it
> effectively blocks the use of GIN/GIST indexes when RLS is in use.
There's a whole lot of daylight between "it doesn't pick an indexscan in
this one example" and "it effectively blocks the use of GIN/GIST".
True indeed :). Would you have a working example of using a GIN/GIST index with RLS? All the attempts I've made have ended in seq scans. In practice, I'm looking to implement fuzzy search using trigrams, so % and %> operators are what matter to me. ~~ also happens to fail. Should I expect to be able to use any of these with RLS, large amounts of data and reasonable performance? 
Your description of leakproof (and the documentation I've found) makes it sound like I'm not just hitting an isolated problem, but a general problem with RLS that represents a substantial limitation and is likely worth documenting. 
Derek
Derek Hans <derek.hans@gmail.com> writes:
> Unfortunately only "alter function" supports "leakproof" - "alter operator"
> does not. Is there a function-equivalent for marking operators as
> leakproof?
Leakproofness is a property of the underlying function, not the operator,
so that's where you change it.
> Is there any documentation for which operators/functions are
> leakproof?
select oid::regprocedure from pg_proc where proleakproof;
            regards, tom lane
			
		Greetings, * Derek Hans (derek.hans@gmail.com) wrote: > Unfortunately only "alter function" supports "leakproof" - "alter operator" > does not. Is there a function-equivalent for marking operators as > leakproof? Is there any documentation for which operators/functions are > leakproof? Tom's query downthread provides the complete list. Note that the list is not completely static- it's entirely possible that additional functions can be made leak-proof, what's needed is a careful review of the function code to ensure that it can't leak information about the data (or, if it does today, a patch which removes that). If you have an interest in that then I'd encourage you to dig into the code and look for possible leaks (Tom's already hinted in the direction you'd want to go in) and then propose a patch to address those cases and to mark the function(s) as leakproof. > In my particular case, RLS is still useful even if operators are leaky as I > control the application code and therefore can ensure leaky errors are > handled. If it's possible to disable all checking for "leakproof", that > would work for me. There isn't a way to disable the leakproof-checking system. Certainly in the general case that wouldn't be acceptable and I'm not entirely convinced by your argument that such an option should exist, though you could go through and set all of the functions to be leakproof if you really wish to. > > If that's not possible, it sounds like it > > > effectively blocks the use of GIN/GIST indexes when RLS is in use. > > > > There's a whole lot of daylight between "it doesn't pick an indexscan in > > this one example" and "it effectively blocks the use of GIN/GIST". > > True indeed :). Would you have a working example of using a GIN/GIST index > with RLS? All the attempts I've made have ended in seq scans. In practice, > I'm looking to implement fuzzy search using trigrams, so % and %> operators > are what matter to me. ~~ also happens to fail. Should I expect to be able > to use any of these with RLS, large amounts of data and reasonable > performance? Functions that aren't marked leakproof aren't going to be able to be pushed down. > Your description of leakproof (and the documentation I've found) makes it > sound like I'm not just hitting an isolated problem, but a general problem > with RLS that represents a substantial limitation and is likely worth > documenting. There's some documentation regarding leakproof functions here: https://www.postgresql.org/docs/current/ddl-rowsecurity.html and here: https://www.postgresql.org/docs/11/sql-createfunction.html Of course, patches are welcome to improve on our documentation. One thing that it sounds like you're not quite appreciating is that in the general case, verifying that a function is leakproof isn't optional. Without such a check, any user could create a function and then get PG to push that function down below the RLS checks and therefore gain access to the data that they aren't supposed to be able to see. All that said, there's quite a few functions that *are* marked as leakproof already and they're quite handy and work well with RLS already, as I expect you'll see when you go querying pg_proc. Thanks, Stephen
Вложения
Thanks for the detailed response, super helpful in understanding what's happening, in particular understanding the risk of not marking functions as leakproof. I'll take a look at the underlying code to understand what's involved in getting a function to be leakproof. 
That said, it does seem like it should be possible and reasonable to specify that a user should have access to the table stats so that the query planner works as expected. Maybe it comes down to the fact that RLS is still a work in progress, and I shouldn't be relying on it unless I'm really certain it supports the functionality I need. 
I've updated word_similarity_op(text,text) to be leakproof, and  pg_proc agrees it is. I'm assuming word_similarity_op() is equivalent to <%, though I haven't found explicit confirmation. However, using word_similarity() instead of <% on a 100k row table, without any RLS involved, doesn't make use of the index, while using <% does. Obviously, adding the RLS doesn't make that any better. Any idea what might be the cause?
On Tue, Aug 13, 2019 at 5:39 PM Stephen Frost <sfrost@snowman.net> wrote:
Greetings,
* Derek Hans (derek.hans@gmail.com) wrote:
> Unfortunately only "alter function" supports "leakproof" - "alter operator"
> does not. Is there a function-equivalent for marking operators as
> leakproof? Is there any documentation for which operators/functions are
> leakproof?
Tom's query downthread provides the complete list.
Note that the list is not completely static- it's entirely possible that
additional functions can be made leak-proof, what's needed is a careful
review of the function code to ensure that it can't leak information
about the data (or, if it does today, a patch which removes that). If
you have an interest in that then I'd encourage you to dig into the code
and look for possible leaks (Tom's already hinted in the direction you'd
want to go in) and then propose a patch to address those cases and to
mark the function(s) as leakproof.
> In my particular case, RLS is still useful even if operators are leaky as I
> control the application code and therefore can ensure leaky errors are
> handled. If it's possible to disable all checking for "leakproof", that
> would work for me.
There isn't a way to disable the leakproof-checking system. Certainly
in the general case that wouldn't be acceptable and I'm not entirely
convinced by your argument that such an option should exist, though you
could go through and set all of the functions to be leakproof if you
really wish to.
> > If that's not possible, it sounds like it
> > > effectively blocks the use of GIN/GIST indexes when RLS is in use.
> >
> > There's a whole lot of daylight between "it doesn't pick an indexscan in
> > this one example" and "it effectively blocks the use of GIN/GIST".
>
> True indeed :). Would you have a working example of using a GIN/GIST index
> with RLS? All the attempts I've made have ended in seq scans. In practice,
> I'm looking to implement fuzzy search using trigrams, so % and %> operators
> are what matter to me. ~~ also happens to fail. Should I expect to be able
> to use any of these with RLS, large amounts of data and reasonable
> performance?
Functions that aren't marked leakproof aren't going to be able to be
pushed down.
> Your description of leakproof (and the documentation I've found) makes it
> sound like I'm not just hitting an isolated problem, but a general problem
> with RLS that represents a substantial limitation and is likely worth
> documenting.
There's some documentation regarding leakproof functions here:
https://www.postgresql.org/docs/current/ddl-rowsecurity.html
and here:
https://www.postgresql.org/docs/11/sql-createfunction.html
Of course, patches are welcome to improve on our documentation.
One thing that it sounds like you're not quite appreciating is that in
the general case, verifying that a function is leakproof isn't optional.
Without such a check, any user could create a function and then get PG
to push that function down below the RLS checks and therefore gain
access to the data that they aren't supposed to be able to see.
All that said, there's quite a few functions that *are* marked as
leakproof already and they're quite handy and work well with RLS
already, as I expect you'll see when you go querying pg_proc.
Thanks,
Stephen
Derek
Greetings, Please don't top-post on these lists. * Derek Hans (derek.hans@gmail.com) wrote: > Thanks for the detailed response, super helpful in understanding what's > happening, in particular understanding the risk of not marking functions as > leakproof. I'll take a look at the underlying code to understand what's > involved in getting a function to be leakproof. Great. > That said, it does seem like it should be possible and reasonable to > specify that a user should have access to the table stats so that the query > planner works as expected. Maybe it comes down to the fact that RLS is > still a work in progress, and I shouldn't be relying on it unless I'm > really certain it supports the functionality I need. PostgreSQL is still very much a work in progress. :) > I've updated word_similarity_op(text,text) to be leakproof, and > pg_proc agrees it is. I'm assuming word_similarity_op() is equivalent to > <%, though I haven't found explicit confirmation. However, using > word_similarity() instead of <% on a 100k row table, without any RLS > involved, doesn't make use of the index, while using <% does. Obviously, > adding the RLS doesn't make that any better. Any idea what might be the > cause? Just to be clear, you should be looking at pg_operator (oprcode) to determine the function that is under the operator that you wish to change to being leakproof. Note that the selectivity functions are associated with the operator, not the function itself. Thanks, Stephen
Вложения
> I've updated word_similarity_op(text,text) to be leakproof, and
> pg_proc agrees it is. I'm assuming word_similarity_op() is equivalent to
> <%, though I haven't found explicit confirmation. However, using
> word_similarity() instead of <% on a 100k row table, without any RLS
> involved, doesn't make use of the index, while using <% does. Obviously,
> adding the RLS doesn't make that any better. Any idea what might be the
> cause?
Just to be clear, you should be looking at pg_operator (oprcode) to
determine the function that is under the operator that you wish to
change to being leakproof.
Thanks for that pointer. 
Note that the selectivity functions are associated with the operator,
not the function itself.
That was the missing piece, thanks. How come operators get optimized but functions don't?
Quick summary:
The text similarity/full text search/like operators are not marked as leakproof, which stops them from having access to table statistics. When combined with row level security, operators that aren't leakproof can't get pushed down and therefore happen after the RLS check, preventing use of GIN/GIST indexes. A workaround is marking the underlying function as leakproof but that is only reasonable because our particular setup makes it acceptable if information leaks via database error messages.  
To resolve:
- Lookup function associated with operator being used via the pg_operator table
- Check if that function is leakproof based on info in pg_proc table
- ALTER FUNCTION func LEAKPROOF
- Use original operator in code - the underlying function doesn't get optimized and bypasses the index
While those steps work on my local machine, unfortunately we're deployed on AWS Aurora which doesn't allow marking functions as leakproof. Functions are owned by the rdsadmin user and controlled by AWS. In practice, that appears to mean that fuzzy search/full text search with reasonable performance isn't compatible with RLS on Amazon Aurora. We may end up setting up Elasticsearch to support text search. In any case, we need to separate search from checking who is allowed to see the results.
Thanks for the help from everyone!