Re: Column Filtering in Logical Replication
От | Amit Kapila |
---|---|
Тема | Re: Column Filtering in Logical Replication |
Дата | |
Msg-id | CAA4eK1+1DMkCip9SB3B0_u0Q6fGf-D3vgqQodkLfur0qkL482g@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Column Filtering in Logical Replication (Justin Pryzby <pryzby@telsasoft.com>) |
Ответы |
RE: Column Filtering in Logical Replication
("tanghy.fnst@fujitsu.com" <tanghy.fnst@fujitsu.com>)
Re: Column Filtering in Logical Replication (Alvaro Herrera <alvherre@alvh.no-ip.org>) |
Список | pgsql-hackers |
On Wed, Jan 12, 2022 at 2:40 AM Justin Pryzby <pryzby@telsasoft.com> wrote: > > Is there any coordination between the "column filter" patch and the "row > filter" patch ? Are they both on track for PG15 ? Has anybody run them > together ? > The few things where I think we might need to define some common behavior are as follows: 1. Replica Identity handling: Currently the column filter patch gives an error during create/alter subscription if the specified column list is invalid (Replica Identity columns are missing). It also gives an error if the user tries to change the replica identity. However, it doesn't deal with cases where the user drops and adds a different primary key that has a different set of columns which can lead to failure during apply on the subscriber. I think another issue w.r.t column filter patch is that even while creating publication (even for 'insert' publications) it should check that all primary key columns must be part of published columns, otherwise, it can fail while applying on subscriber as it will try to insert NULL for the primary key column. 2. Handling of partitioned tables vs. Replica Identity (RI): When adding a partitioned table with a column list to the publication (with publish_via_partition_root = false), we should check the Replica Identity of all its leaf partition as the RI on the partition is the one actually takes effect when publishing DML changes. We need to check RI while attaching the partition as well, as the newly added partitions will automatically become part of publication if the partitioned table is part of the publication. If we don't do this the later deletes/updates can fail. All these cases are dealt with in row filter patch because of the on-the-fly check which means we check the validation of columns in row filters while actual operation update/delete via CheckCmdReplicaIdentity and cache the result of same for future use. This is inline with existing checks of RI vs. operations on tables. The primary reason for this was we didn't want to handle validation of row filters at so many places. 3. Tablesync.c handling: Ideally, it would be good if we have a single query to fetch both row filters and column filters but even if that is not possible in the first version, the behavior should be same for both queries w.r.t partitioned tables, For ALL Tables and For All Tables In Schema cases. Currently, the column filter patch doesn't seem to respect For ALL Tables and For All Tables In Schema cases, basically, it just copies the columns it finds through some of the publications even if one of the publications is defined as For All Tables. The row filter patch ignores the row filters if one of the publications is defined as For ALL Tables and For All Tables In Schema. For row filter patch, if the publication contains a partitioned table, the publication parameter publish_via_partition_root determines if it uses the partition row filter (if the parameter is false, the default) or the root partitioned table row filter and this is taken care of even during the initial tablesync. For column filter patch, if the publication contains a partitioned table, it seems that it finds all columns that the tables in its partition tree specified in the publications, whether publish_via_partition_root is true or false. We have done some testing w.r.t above cases with both patches and my colleague will share the results. -- With Regards, Amit Kapila.
В списке pgsql-hackers по дате отправления: