Re: Column Filtering in Logical Replication

From Tomas Vondra
Subject Re: Column Filtering in Logical Replication
Date
Msg-id 773e96e5-77ef-ad26-ad37-03ca41a0ad25@enterprisedb.com
In response to Re: Column Filtering in Logical Replication  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: Column Filtering in Logical Replication  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers

On 12/17/21 22:07, Alvaro Herrera wrote:
> So I've been thinking about this as a "security" item (you can see my
> comments to that effect sprinkled all over this thread), in the sense
> that if a publication "hides" some column, then the replica just won't
> get access to it.  But in reality that's mistaken: the filtering that
> this patch implements is done based on the queries that *the replica*
> executes at its own volition; if the replica decides to ignore the list
> of columns, it'll be able to get all columns.  All it takes is an
> uncooperative replica in order for the lot of data to be exposed anyway.
> 

Interesting, I haven't really looked at this as a security feature. And 
in my experience if something is not carefully designed to be secure 
from the get go, it's really hard to add that bit later ...

You say it's the replica making the decisions, but my mental model is 
it's the publisher decoding the data for a given list of publications 
(which indeed is specified by the subscriber). But the subscriber can't 
tweak the definition of publications, right? Or what do you mean by 
queries executed by the replica? What is the gap?

> If the server has a *separate* security mechanism to hide the columns
> (per-column privs), it is that feature that will protect the data, not
> the logical-replication-feature to filter out columns.
> 

Right. Although I haven't thought about how logical decoding interacts 
with column privileges. I don't think logical decoding actually checks 
column privileges - I certainly don't recall any ACL checks in 
src/backend/replication ...

AFAIK we only really check privileges during initial sync (when creating 
the slot and copying data), but then we keep replicating data even if 
the privilege gets revoked for the table/column. In principle the 
replication role is pretty close to superuser.

> 
> This led me to realize that the replica-side code in tablesync.c is
> totally oblivious to what's the publication through which a table is
> being received from in the replica.  So we're not aware of a replica
> being exposed only a subset of columns through some specific
> publication; and a lot more hacking is needed than this patch does, in
> order to be aware of which publications are being used.
> 
> I'm going to have a deeper look at this whole thing.
> 

Does that mean we currently sync all the columns in the initial sync, 
and only start filtering columns later while decoding transactions?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


