Re: bogus: logical replication rows/cols combinations

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: bogus: logical replication rows/cols combinations
Дата
Msg-id 454b8599-1a50-f1f2-eea1-8ef00025c63c@enterprisedb.com
обсуждение исходный текст
Ответ на Re: bogus: logical replication rows/cols combinations  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers

On 5/6/22 15:40, Amit Kapila wrote:
> On Fri, May 6, 2022 at 5:56 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>>>>
>>>> I could think of below two options:
>>>> 1. Forbid any case where column list is different for the same table
>>>> when combining publications.
>>>> 2. Forbid if the column list and row filters for a table are different
>>>> in the set of publications we are planning to combine. This means we
>>>> will allow combining column lists when row filters are not present or
>>>> when column list is the same (we don't get anything additional by
>>>> combining but the idea is we won't forbid such cases) and row filters
>>>> are different.
>>>>
>>>> Now, I think the points in favor of (1) are that the main purpose of
>>>> introducing a column list are: (a) the structure/schema of the
>>>> subscriber is different from the publisher, (b) want to hide sensitive
>>>> columns data. In both cases, it should be fine if we follow (1) and
>>>> from Peter E.'s latest email [1] he also seems to be indicating the
>>>> same. If we want to be slightly more relaxed then we can probably (2).
>>>> We can decide on something else as well but I feel it should be such
>>>> that it is easy to explain.
>>>
>>> I also think it makes sense to add a restriction like (1). I am planning to
>>> implement the restriction if no one objects.
>>>
>>
>> I'm not going to block that approach if that's the consensus here,
>> though I'm not convinced.
>>
>> Let me point out (1) does *not* work for data redaction use case,
>> certainly not the example Alvaro and me presented, because that relies
>> on a combination of row filters and column filters.
>>
> 
> This should just forbid the case presented by Alvaro in his first
> email in this thread [1].
> 
>> Requiring all column
>> lists to be the same (and not specific to row filter) prevents that
>> example from working. Yes, you can create multiple subscriptions, but
>> that brings it's own set of challenges too.
>>
>> I doubt forcing users to use the more complex setup is good idea, and
>> combining the column lists per [1] seems sound to me.
>>
>> That being said, the good thing is this restriction seems it might be
>> relaxed in the future to work per [1], without causing any backwards
>> compatibility issues.
>>
> 
> These are my thoughts as well. Even, if we decide to go via the column
> list merging approach (in selective cases), we need to do some
> performance testing of that approach as it does much more work per
> tuple. It is possible that the impact is not much but still worth
> evaluating, so let's try to see the patch to prohibit combining the
> column lists then we can decide.
> 

Surely we could do some performance testing now. I doubt it's very
expensive - sure, you can construct cases with many row filters / column
lists, but how likely is that in practice?

Moreover, it's not like this would affect existing setups, so even if
it's a bit expensive, we may interpret that as cost of the feature.

>> Should we do something similar for row filters, though? It seems quite
>> weird we're so concerned about unexpected behavior due to combining
>> column lists (despite having a patch that makes it behave sanely), and
>> at the same time wave off similarly strange behavior due to combining
>> row filters because "that's what you get if you define the publications
>> in a strange way".
>>
> 
> During development, we found that we can't combine the row-filters for
> 'insert' and 'update'/'delete' because of replica identity
> restrictions, so we have kept them separate. But if we came across
> other such things then we can either try to fix those or forbid them.
> 

I understand how we got to the current state. I'm just saying that this
allows defining separate publications for insert, update and delete
actions, and set different row filters for each of them. Which results
in behavior that is hard to explain/understand, especially when it comes
to tablesync.

It seems quite strange to prohibit merging column lists because there
might be some strange behavior that no one described, and allow setups
with different row filters that definitely have strange behavior.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: make MaxBackends available in _PG_init
Следующее
От: Tom Lane
Дата:
Сообщение: Re: make MaxBackends available in _PG_init