Re: Column Filtering in Logical Replication

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Column Filtering in Logical Replication
Дата
Msg-id CAA4eK1JG9sg8FEx=gKVdtKNA6UpGzqkW0DEftcByP_DoSAHN4Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Column Filtering in Logical Replication  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Ответы Re: Column Filtering in Logical Replication  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Список pgsql-hackers
On Mon, Dec 27, 2021 at 10:36 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> Determining that an array has a NULL element seems convoluted.  I ended
> up with this query, where comparing the result of array_positions() with
> an empty array does that.  If anybody knows of a simpler way, or any
> situations in which this fails, I'm all ears.
>
> with published_cols as (
>         select case when
>                 pg_catalog.array_positions(pg_catalog.array_agg(unnest), null) <> '{}' then null else
>                 pg_catalog.array_agg(distinct unnest order by unnest) end AS attrs
>         from pg_catalog.pg_publication p join
>                 pg_catalog.pg_publication_rel pr on (p.oid = pr.prpubid) left join
>                 unnest(prattrs) on (true)
>         where prrelid = 38168 and p.pubname in ('pub1', 'pub2')
> )
> SELECT a.attname,
>        a.atttypid,
>        a.attnum = ANY(i.indkey)
>   FROM pg_catalog.pg_attribute a
>   LEFT JOIN pg_catalog.pg_index i
>        ON (i.indexrelid = pg_get_replica_identity_index(38168)),
>      published_cols
>  WHERE a.attnum > 0::pg_catalog.int2
>    AND NOT a.attisdropped and a.attgenerated = ''
>    AND a.attrelid = 38168
>    AND (published_cols.attrs IS NULL OR attnum = ANY(published_cols.attrs))
>  ORDER BY a.attnum;
>
> This returns all columns if at least one publication has a NULL prattrs,
> or only the union of columns listed in all publications, if all
> publications have a list of columns.
>

Considering this, don't we need to deal with "For All Tables" and "For
All Tables In Schema .." Publications in this query? The row filter
patch deal with such cases. The row filter patch handles the NULL case
via C code which makes the query relatively simpler. I am not sure if
the same logic can be used here but having a simple query here have
merit that if we want to use a single query to fetch both column and
row filters then we should be able to enhance it without making it
further complicated.

> (I was worried about obtaining the list of publications, but it turns
> out that it's already as a convenient list of OIDs in the MySubscription
> struct.)
>
> With this, we can remove the second query added by Rahila's original patch to
> filter out nonpublished columns.
>
> I still need to add pg_partition_tree() in order to search for
> publications containing a partition ancestor.  I'm not yet sure what
> happens (and what *should* happen) if an ancestor is part of a
> publication and the partition is also part of a publication, and the
> column lists differ.
>

Shouldn't we try to have a behavior similar to the row filter patch
for this case? The row filter patch behavior is as follows: "If your
publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row
filter (if the parameter is false, the default) or the root
partitioned table row filter. During initial tablesync, it doesn't do
any special handling for partitions.

-- 
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Julien Rouhaud
Дата:
Сообщение: Re: pl/pgsql feature request: shorthand for argument and local variable references
Следующее
От: Simon Riggs
Дата:
Сообщение: Re: Add 64-bit XIDs into PostgreSQL 15