Re: row filtering for logical replication

Поиск
Список
Период
Сортировка
От Petr Jelinek
Тема Re: row filtering for logical replication
Дата
Msg-id 70ccb3a2-48d2-c923-71c3-7bd5346a64aa@2ndquadrant.com
обсуждение исходный текст
Ответ на Re: row filtering for logical replication  (Stephen Frost <sfrost@snowman.net>)
Список pgsql-hackers
Hi,

On 27/12/2018 20:05, Stephen Frost wrote:
> Greetings,
> 
> * Petr Jelinek (petr.jelinek@2ndquadrant.com) wrote:
>> On 14/12/2018 16:38, Stephen Frost wrote:
>>> * Petr Jelinek (petr.jelinek@2ndquadrant.com) wrote:
>>>> I do see the appeal here, if you consider logical replication to be a
>>>> streaming select it probably applies well.
>>>>
>>>> But given that this is happening inside output plugin which does not
>>>> have full executor setup and has catalog-only snapshot I am not sure how
>>>> feasible it is to try to merge these two things. As per my previous
>>>> email it's possible that we'll have to be stricter about what we allow
>>>> in expressions here.
>>>
>>> I can certainly understand the concern about trying to combine the
>>> implementation of this with that of RLS; perhaps that isn't a good fit
>>> due to the additional constraints put on logical decoding.
>>>
>>> That said, I still think it might make sense to consider these filters
>>> for logical decoding to be policies and, ideally, to allow users to use
>>> the same policy for both.
>>
>> I am not against that as long as it's possible to have policy for
>> logical replication without having it for RLS and vice versa.
> 
> RLS already is able to be enabled/disabled on a per-table basis.  I
> could see how we might want to extend the existing policy system to have
> a way to enable/disable individual policies for RLS but that should be
> reasonably straight-forward to do, I would think.

Sure, I was mostly referring to having ability of enable/disable this
independently of enabling/disabling RLS which you are okay with based on
bellow so no issue there from my side.

> 
>> I also wonder if policies are flexible enough to allow for specifying
>> OLD and NEW - the replication filtering deals with DML, not with what's
>> visible, it might very well depend on differences between these (that's
>> something the current patch is missing as well BTW).
> 
> The policy system already has the notion of a 'visible' check and a
> 'does the new row match this' check (USING vs. WITH CHECK policies).
> Perhaps if you could outline the specific use-cases that you're thinking
> about, we could discuss them and make sure that they fit within those
> mechanisms- or, if not, discuss if such a use-case would make sense for
> RLS as well and, if so, figure out a way to support that for both.

So we'd use USING for old row images (UPDATE/DELETE) and WITH CHECK for
new ones (UPDATE/INSERT)? I think OLD/NEW is somewhat more natural
naming of this as there is no "SELECT" part of operation here, but as
long as the functionality is there I don't mind syntax that much.

> 
>>> In the end, the idea of having to build a single large and complex
>>> 'create publication' command which has a bunch of tables, each with
>>> their own filter clauses, just strikes me as pretty painful.
>>>
>>>> The other issue with merging this is that the use-case for filtering out
>>>> the data in logical replication is not necessarily about security, but
>>>> often about sending only relevant data. So it makes sense to have filter
>>>> on publication without RLS enabled on table and if we'd force that, we'd
>>>> limit usefulness of this feature.
>>>
>>> I definitely have a serious problem if we are going to say that you
>>> can't use this filtering for security-sensitive cases.
>>
>> I am saying it should not be tied to only security sensitive cases,
>> because it has use cases that have nothing to do with security (ie, I
>> don't want this to depend on RLS being enabled for a table).
> 
> I'm fine with this being able to be independently enabled/disabled,
> apart from RLS.
> 

Cool.

>>>> We definitely want to eventually create subscriptions as non-superuser
>>>> but that has zero effect on this as everything here is happening on
>>>> different server than where subscription lives (we already allow
>>>> creation of publications with just CREATE privilege on database and
>>>> ownership of the table).
>>>
>>> What I wasn't clear about above was the idea that we might allow a user
>>> other than the table owner to publish a given table, but that such a
>>> publication should certanily only be allowed to include the rows which
>>> that user has access to- as regulated by RLS.  If the RLS policy is too
>>> complex to allow that then I would think we'd simply throw an error at
>>> the create publication time and the would-be publisher would need to
>>> figure that out with the table owner.
>>
>> My opinion is that this is useful, but not necessarily something v1
>> patch needs to solve. Having too many publications and subscriptions to
>> various places is not currently practical anyway due to decoding
>> duplicating all the work for every connection.
> 
> I agree that supporting this could be done in a later patch, however, I
> do feel that when we go to add support for non-owners to create
> publications then RLS needs to be supported at that point (and by more
> than just 'throw an error').  I can agree with incremental improvements
> but I don't want to get to a point where we've got a bunch of
> independent things only half of which work with other parts of the
> system.

Yes, using RLS infrastructure now will make it easier to add support for
publishing without being owner at some later point, just let's please
not make publishing without being owner part of requirements for this.


-- 
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Poor buildfarm coverage of strong-random alternatives
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: could recovery_target_timeline=latest be the default in standbymode?