Re: Column Filtering in Logical Replication

Поиск
Список
Период
Сортировка
От Alvaro Herrera
Тема Re: Column Filtering in Logical Replication
Дата
Msg-id 202201141338.je5bniwn7oay@alvherre.pgsql
обсуждение исходный текст
Ответ на Re: Column Filtering in Logical Replication  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: Column Filtering in Logical Replication  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On 2022-Jan-14, Amit Kapila wrote:

> 1. Replica Identity handling: Currently the column filter patch gives
> an error during create/alter subscription if the specified column list
> is invalid (Replica Identity columns are missing). It also gives an
> error if the user tries to change the replica identity. However, it
> doesn't deal with cases where the user drops and adds a different
> primary key that has a different set of columns which can lead to
> failure during apply on the subscriber.

Hmm, yeah, I suppose we should check that the primary key is compatible
with the column list in all publications.  (I wonder what happens in the
interim, that is, what happens to tuples modified after the initial PK
is dropped and before the new PK is installed.  Are these considered to
have "replica identiy nothing"?)

> I think another issue w.r.t column filter patch is that even while
> creating publication (even for 'insert' publications) it should check
> that all primary key columns must be part of published columns,
> otherwise, it can fail while applying on subscriber as it will try to
> insert NULL for the primary key column.

I'm not so sure about the primary key aspects, actually; keep in mind
that the replica can have a different table definition, and it might
have even a completely different primary key.  I think this part is up
to the user to set up correctly; we have enough with just trying to make
the replica identity correct.

> 2. Handling of partitioned tables vs. Replica Identity (RI): When
> adding a partitioned table with a column list to the publication (with
> publish_via_partition_root = false), we should check the Replica
> Identity of all its leaf partition as the RI on the partition is the
> one actually takes effect when publishing DML changes. We need to
> check RI while attaching the partition as well, as the newly added
> partitions will automatically become part of publication if the
> partitioned table is part of the publication. If we don't do this the
> later deletes/updates can fail.

Hmm, yeah.

> 3. Tablesync.c handling: Ideally, it would be good if we have a single
> query to fetch both row filters and column filters but even if that is
> not possible in the first version, the behavior should be same for
> both queries w.r.t partitioned tables, For ALL Tables and For All
> Tables In Schema cases.
> 
> Currently, the column filter patch doesn't seem to respect For ALL
> Tables and For All Tables In Schema cases, basically, it just copies
> the columns it finds through some of the publications even if one of
> the publications is defined as For All Tables. The row filter patch
> ignores the row filters if one of the publications is defined as For
> ALL Tables and For All Tables In Schema.

Oh, yeah, if a table appears in two publications and one of them is ALL
TABLES [IN SCHEMA], then we don't consider it as an all-columns
publication.  You're right, that should be corrected.

> We have done some testing w.r.t above cases with both patches and my
> colleague will share the results.

Great, thanks.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: A test for replay of regression tests
Следующее
От: Amit Langote
Дата:
Сообщение: Re: generic plans and "initial" pruning