Обсуждение: Index selection issues with RLS using expressions

Поиск
Список
Период
Сортировка

Index selection issues with RLS using expressions

От
Alastair McKinley
Дата:
Hi all,

I am running in to an issue with RLS and index selection in my queries.  I created a toy example to try to illustrate the issue below.  Postgres version is PostgreSQL 12.2 (Debian 12.2-2.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit.

Is there some subtle reason as to why the role "new_user" cannot seem to generate a query plan that uses the gin index?

Best regards,

Alastair

-- Example below

create table test as
select array_agg(v order by v) a,
    (random()*4)::bigint as n
from (
    select (random()*250)::bigint as v ,
        (random()*100000)::bigint as g
    from generate_series(1,1000000)
) s group by g;

create index on test using gin(a);

create or replace function has_permission(n bigint) returns boolean as
$$
    select n in (1,2);
$$ language sql stable leakproof;

alter table test enable row level security;

create role new_user;
grant select on test to new_user;

grant execute on function has_permission(bigint) to new_user;

create policy new_user_select on test for select to new_user using ( has_permission(test.n) );

set role new_user;

explain select count(*) from test where a && array[100::bigint];
-- Aggregate  (cost=3233.94..3233.95 rows=1 width=8)
--   ->  Seq Scan on test  (cost=0.00..3228.93 rows=2005 width=0)
--         Filter: ((n = ANY ('{1,2}'::bigint[])) AND (a && '{100}'::bigint[]))


set role postgres;

explain select count(*) from test where a && array[100::bigint];
-- Aggregate  (cost=1833.21..1833.22 rows=1 width=8)
--   ->  Bitmap Heap Scan on test  (cost=43.41..1823.07 rows=4053 width=0)
--         Recheck Cond: (a && '{100}'::bigint[])
--         ->  Bitmap Index Scan on test_a_idx  (cost=0.00..42.40 rows=4053 width=0)
--               Index Cond: (a && '{100}'::bigint[])


-- even with the has_permission() function the postgres user gets a bitmap index scan

explain select count(*) from test where a && array[100::bigint] and has_permission(test.n);
                                    QUERY PLAN                                    
-- -----------------------------------------------------------------------------------
--  Aggregate  (cost=1837.71..1837.72 rows=1 width=8)
--    ->  Bitmap Heap Scan on test  (cost=42.90..1832.69 rows=2005 width=0)
--          Recheck Cond: (a && '{100}'::bigint[])
--          Filter: (n = ANY ('{1,2}'::bigint[]))
--          ->  Bitmap Index Scan on test_a_idx  (cost=0.00..42.40 rows=4053 width=0)
--                Index Cond: (a && '{100}'::bigint[])

Re: Index selection issues with RLS using expressions

От
Tom Lane
Дата:
Alastair McKinley <a.mckinley@analyticsengines.com> writes:
> I am running in to an issue with RLS and index selection in my queries.  I created a toy example to try to illustrate
theissue below.  Postgres version is PostgreSQL 12.2 (Debian 12.2-2.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc
(Debian8.3.0-6) 8.3.0, 64-bit. 

> Is there some subtle reason as to why the role "new_user" cannot seem to generate a query plan that uses the gin
index?

The && operator is not marked leakproof, so it can't be applied till
after the RLS filter, making an indexscan with it impossible when
RLS is active.

Perhaps arrayoverlap() itself could be proven leakproof, but the
underlying type-specific equality operator might or might not be.
We don't have enough infrastructure to handle indirect leakproofness
requirements like that, so you lose :-(

            regards, tom lane



Re: Index selection issues with RLS using expressions

От
Alastair McKinley
Дата:
Hi Tom,

Thanks for looking at this!  It seems like there are quite a few performance gotchas around leaky operators and RLS, this is my second encounter with this issue in the last few weeks.

What would you recommend as a reasonable workaround? 

I have a large table with a gin index that I would like to use RLS on and use the @@ text search operator.  My initial thought is to use a security definer set-returning function that implements the RLS policy explicitly.  Would a security barrier view also potentially work?

Best regards and thanks again,

Alastair



From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: 31 March 2020 20:18
To: Alastair McKinley <a.mckinley@analyticsengines.com>
Cc: pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Subject: Re: Index selection issues with RLS using expressions
 
Alastair McKinley <a.mckinley@analyticsengines.com> writes:
> I am running in to an issue with RLS and index selection in my queries.  I created a toy example to try to illustrate the issue below.  Postgres version is PostgreSQL 12.2 (Debian 12.2-2.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit.

> Is there some subtle reason as to why the role "new_user" cannot seem to generate a query plan that uses the gin index?

The && operator is not marked leakproof, so it can't be applied till
after the RLS filter, making an indexscan with it impossible when
RLS is active.

Perhaps arrayoverlap() itself could be proven leakproof, but the
underlying type-specific equality operator might or might not be.
We don't have enough infrastructure to handle indirect leakproofness
requirements like that, so you lose :-(

                        regards, tom lane

Re: Index selection issues with RLS using expressions

От
Alastair McKinley
Дата:
Hi Tom,

This is the solution I went with.

create policy X on tableX for select to new_role using ( has_table_read_permission(tableX.column) );

This covers all usage of the table and then for APIs that utilise leaky operators:

create function with_leaky_operator(args) returns setof tableX as
$$
    select * from tableX where column @@ $1 and has_table_read_permission(tableX.column);
$$ language sql security definer;

Best regards,

Alastair



From: Alastair McKinley <a.mckinley@analyticsengines.com>
Sent: 31 March 2020 22:09
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Subject: Re: Index selection issues with RLS using expressions
 
Hi Tom,

Thanks for looking at this!  It seems like there are quite a few performance gotchas around leaky operators and RLS, this is my second encounter with this issue in the last few weeks.

What would you recommend as a reasonable workaround? 

I have a large table with a gin index that I would like to use RLS on and use the @@ text search operator.  My initial thought is to use a security definer set-returning function that implements the RLS policy explicitly.  Would a security barrier view also potentially work?

Best regards and thanks again,

Alastair



From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: 31 March 2020 20:18
To: Alastair McKinley <a.mckinley@analyticsengines.com>
Cc: pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Subject: Re: Index selection issues with RLS using expressions
 
Alastair McKinley <a.mckinley@analyticsengines.com> writes:
> I am running in to an issue with RLS and index selection in my queries.  I created a toy example to try to illustrate the issue below.  Postgres version is PostgreSQL 12.2 (Debian 12.2-2.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit.

> Is there some subtle reason as to why the role "new_user" cannot seem to generate a query plan that uses the gin index?

The && operator is not marked leakproof, so it can't be applied till
after the RLS filter, making an indexscan with it impossible when
RLS is active.

Perhaps arrayoverlap() itself could be proven leakproof, but the
underlying type-specific equality operator might or might not be.
We don't have enough infrastructure to handle indirect leakproofness
requirements like that, so you lose :-(

                        regards, tom lane

Re: Index selection issues with RLS using expressions

От
Stephen Frost
Дата:
Greetings,

We prefer to use in-line responses, please don't top-post on these
lists.

* Alastair McKinley (a.mckinley@analyticsengines.com) wrote:
> Thanks for looking at this!  It seems like there are quite a few performance gotchas around leaky operators and RLS,
thisis my second encounter with this issue in the last few weeks. 

It ends up being a choice between being correct (as in, implementing the
security that RLS is intended to provide) vs. being fast (better
performance).  If you aren't concerned with the security risk posed by
leaky functions, you can use regular views, but you need to realize that
users who can execute arbitrary SQL on the database system would very
likely be able to then circumvent your views and get access to all of
the data in the table.

> What would you recommend as a reasonable workaround?

The right solution is really to hack on PG to make it support having
the && operator (and equality ones, as needed, though many already are)
be marked leakproof, with enough intelligence to also pass that
information through to the parts that need to know, as Tom mentions.
That's not going to happen for v13 at this point as I don't think
anyone's actively working on that, but it could possibly be worked on
for v14.

> I have a large table with a gin index that I would like to use RLS on and use the @@ text search operator.  My
initialthought is to use a security definer set-returning function that implements the RLS policy explicitly.  Would a
securitybarrier view also potentially work? 

A security barrier view is also going to prevent non-leakproof functions
from being pushed down (that's what being a security barrier view
more-or-less exactly means).

Using a security-definer function could possible work, but you can't
just mark a function as leakproof that actually isn't leakproof or, as
Tom said before, you're creating a security hole.

Thanks,

Stephen

Вложения