Re: row filtering for logical replication

Поиск
Список
Период
Сортировка
От Stephen Frost
Тема Re: row filtering for logical replication
Дата
Msg-id 20181227190511.GI2528@tamriel.snowman.net
обсуждение исходный текст
Ответ на Re: row filtering for logical replication  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Ответы Re: row filtering for logical replication  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Список pgsql-hackers
Greetings,

* Petr Jelinek (petr.jelinek@2ndquadrant.com) wrote:
> On 14/12/2018 16:38, Stephen Frost wrote:
> > * Petr Jelinek (petr.jelinek@2ndquadrant.com) wrote:
> >> I do see the appeal here, if you consider logical replication to be a
> >> streaming select it probably applies well.
> >>
> >> But given that this is happening inside output plugin which does not
> >> have full executor setup and has catalog-only snapshot I am not sure how
> >> feasible it is to try to merge these two things. As per my previous
> >> email it's possible that we'll have to be stricter about what we allow
> >> in expressions here.
> >
> > I can certainly understand the concern about trying to combine the
> > implementation of this with that of RLS; perhaps that isn't a good fit
> > due to the additional constraints put on logical decoding.
> >
> > That said, I still think it might make sense to consider these filters
> > for logical decoding to be policies and, ideally, to allow users to use
> > the same policy for both.
>
> I am not against that as long as it's possible to have policy for
> logical replication without having it for RLS and vice versa.

RLS already is able to be enabled/disabled on a per-table basis.  I
could see how we might want to extend the existing policy system to have
a way to enable/disable individual policies for RLS but that should be
reasonably straight-forward to do, I would think.

> I also wonder if policies are flexible enough to allow for specifying
> OLD and NEW - the replication filtering deals with DML, not with what's
> visible, it might very well depend on differences between these (that's
> something the current patch is missing as well BTW).

The policy system already has the notion of a 'visible' check and a
'does the new row match this' check (USING vs. WITH CHECK policies).
Perhaps if you could outline the specific use-cases that you're thinking
about, we could discuss them and make sure that they fit within those
mechanisms- or, if not, discuss if such a use-case would make sense for
RLS as well and, if so, figure out a way to support that for both.

> > In the end, the idea of having to build a single large and complex
> > 'create publication' command which has a bunch of tables, each with
> > their own filter clauses, just strikes me as pretty painful.
> >
> >> The other issue with merging this is that the use-case for filtering out
> >> the data in logical replication is not necessarily about security, but
> >> often about sending only relevant data. So it makes sense to have filter
> >> on publication without RLS enabled on table and if we'd force that, we'd
> >> limit usefulness of this feature.
> >
> > I definitely have a serious problem if we are going to say that you
> > can't use this filtering for security-sensitive cases.
>
> I am saying it should not be tied to only security sensitive cases,
> because it has use cases that have nothing to do with security (ie, I
> don't want this to depend on RLS being enabled for a table).

I'm fine with this being able to be independently enabled/disabled,
apart from RLS.

> >> We definitely want to eventually create subscriptions as non-superuser
> >> but that has zero effect on this as everything here is happening on
> >> different server than where subscription lives (we already allow
> >> creation of publications with just CREATE privilege on database and
> >> ownership of the table).
> >
> > What I wasn't clear about above was the idea that we might allow a user
> > other than the table owner to publish a given table, but that such a
> > publication should certanily only be allowed to include the rows which
> > that user has access to- as regulated by RLS.  If the RLS policy is too
> > complex to allow that then I would think we'd simply throw an error at
> > the create publication time and the would-be publisher would need to
> > figure that out with the table owner.
>
> My opinion is that this is useful, but not necessarily something v1
> patch needs to solve. Having too many publications and subscriptions to
> various places is not currently practical anyway due to decoding
> duplicating all the work for every connection.

I agree that supporting this could be done in a later patch, however, I
do feel that when we go to add support for non-owners to create
publications then RLS needs to be supported at that point (and by more
than just 'throw an error').  I can agree with incremental improvements
but I don't want to get to a point where we've got a bunch of
independent things only half of which work with other parts of the
system.

> > * Euler Taveira (euler@timbira.com.br) wrote:
> >> Em sex, 23 de nov de 2018 às 11:40, Petr Jelinek
> >> <petr.jelinek@2ndquadrant.com> escreveu:
> >
> >>> The other issue with merging this is that the use-case for filtering out
> >>> the data in logical replication is not necessarily about security, but
> >>> often about sending only relevant data. So it makes sense to have filter
> >>> on publication without RLS enabled on table and if we'd force that, we'd
> >>> limit usefulness of this feature.
> >>
> >> Use the same infrastructure as RLS could be a good idea but use RLS
> >> for row filtering is not. RLS is complex.
> >
> > Right, this was along the lines I was thinking of- using the
> > infrastructure and the policy system, in particular.
>
> Yeah that part is definitely worth investigating.

Glad to hear that.

Thanks!

Stephen

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Следующее
От: Stephen Frost
Дата:
Сообщение: Re: row filtering for logical replication