Re: row filtering for logical replication

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: row filtering for logical replication
Дата
Msg-id 86dcff6e-9b5f-0f9a-8602-69a5602a30c9@enterprisedb.com
обсуждение исходный текст
Ответ на Re: row filtering for logical replication  (Jeff Davis <pgsql@j-davis.com>)
Ответы Re: row filtering for logical replication
Re: row filtering for logical replication
Re: row filtering for logical replication
Список pgsql-hackers
On 7/13/21 5:44 PM, Jeff Davis wrote:
> On Tue, 2021-07-13 at 10:24 +0530, Amit Kapila wrote:
>> to do. AFAIU, the main things we want to prohibit in the filter are:
>> (a) it doesn't refer to any relation other than catalog in where
>> clause,
> 
> Right, because the walsender is using a historical snapshot.
> 
>> (b) it doesn't use UDFs in any way (in expressions, in
>> user-defined operators, user-defined types, etc.),
> 
> Is this a reasonable requirement? Postgres has a long history of
> allowing UDFs nearly everywhere that a built-in is allowed. It feels
> wrong to make built-ins special for this feature.
> 

Well, we can either prohibit UDF or introduce a massive foot-gun.

The problem with functions in general (let's ignore SQL functions) is 
that they're black boxes, so we don't know what's inside. And if the 
function gets broken after an object gets dropped, the replication is 
broken and the only way to fix it is to recover the subscription.

And this is not hypothetical issue, we've seen this repeatedly :-(

So as much as I'd like to see support for UDFs here, I think it's better 
to disallow them - at least for now. And maybe relax that restriction 
later, if possible.

>> (c) the columns
>> referred to in the filter should be part of PK or Replica Identity.
> 
> Why?
> 

I'm not sure either.

> 
> Also:
> 
> * Andres also mentioned that the function should not leak memory.
> * One use case for this feature is when sharding a table, so the
> expression should allow things like "hashint8(x) between ...". I'd
> really like to see this problem solved, as well.
> 

I think built-in functions should be fine, because generally don't get 
dropped etc. (And if you drop built-in function, well - sorry.)

Not sure about the memory leaks - I suppose we'd free memory for each 
row, so this shouldn't be an issue I guess ...

>> I think in the long run one idea to allow UDFs is probably by
>> explicitly allowing users to specify whether the function is
>> publication predicate safe and if so, then we can allow such
>> functions
>> in the filter clause.
> 
> This sounds like a better direction. We probably need some kind of
> catalog information here to say what functions/operators are "safe" for
> this kind of purpose. There are a couple questions:
> 

Not sure. It's true it's a bit like volatile/stable/immutable categories 
where we can't guarantee those labels are correct, and it's up to the 
user to keep the pieces if they pick the wrong category.

But we can achieve the same goal by introducing a simple GUC called 
dangerous_allow_udf_in_decoding, I think.



regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ranier Vilela
Дата:
Сообщение: Re: printf %s with NULL pointer (was Re: BUG #17098: Assert failed on composing an error message when adding a type to an extension being dropped)
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: row filtering for logical replication