Обсуждение: pg_publication_tables: return NULL attnames when no column list is specified

Поиск

Список

Период

Сортировка

pg_publication_tables: return NULL attnames when no column list is specified

От

Roberto Mello

Дата:

25 марта, 22:50:45

Hi,

When a table is added to a publication without a column list:

ALTER PUBLICATION my_pub ADD TABLE my_table;

Any new column added to my_table will automatically be replicated. The
underlying catalog (pg_publication_rel.prattrs) stores NULL for this case.

When a table is added with an explicit column list:

ALTER PUBLICATION my_pub ADD TABLE my_table (id, name, status);

New columns will NOT be replicated until the publication is explicitly altered.
The catalog stores an int2vector of the specified attribute numbers.

The problem is that pg_get_publication_tables() in pg_publication.c (the set-returning
function backing the pg_publication_tables view) erases this distinction.
When prattrs is NULL, it opens the table, iterates all eligible attributes,
and builds a synthetic int2vector of all current columns. The view then shows
the same attnames output for both cases.

| Scenario | prattrs in catalog | attrs from SRF | attnames in view |
|--------------------------|--------------------|---------------------|------------------|
| column list (a, b) | {1, 2 } | {1,2} | {a, b} |
| No column list | NULL | {1,2,3,...} (synth) | {a, b, c, ...} |
| FOR ALL TABLES/IN SCHEMA | no catalog row | {1,2,3,...} (synth) | {a, b, c, ...} |

This is a problem for operations: schema migrations that add columns to published
tables may or may not replicate depending on how the publication was originally
defined, and the only way to check is querying pg_publication_rel directly.

Additionally, tablesync.c has a workaround heuristic that tries to reverse
this synthesis by comparing array_length(gpt.attrs, 1) against c.relnatts.
This heuristic is buggy: relnatts counts all user attributes including dropped
columns, but the synthesized list excludes dropped columns and conditionally
excludes generated columns. So for any table with a dropped column, the
heuristic incorrectly treats "no column list" as if an explicit list were
specified.

The problematic code is the result of commits that incrementally
built on each other, with later ones introducing workarounds for side effects
of earlier ones:

1. fd0b9dcebda (06/2022) Amit Kapila (author: Hou Zhijie)
Prohibit combining publications with different column lists

Added the subscriber-side check that prevents combining publications with
different column lists for the same table. The subscriber runs
SELECT DISTINCT gpt.attrs and expects at most one row. To handle the fact
that pg_get_publication_tables() returned NULL for no-column-list
publications (which would always be distinct from any explicit list), this
commit introduced the relnatts heuristic in tablesync.c:

CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)
THEN NULL ELSE gpt.attrs END

The idea was: if the SRF returns a synthesized list whose length equals
relnatts, it must be the "all columns" case, so collapse it back to NULL.
But at this point the SRF didn't synthesize anything yet, so the heuristic was anticipating the next commit.

2. b7ae0395369 (01/2023) Amit Kapila (author: Shi yu)
Ignore dropped and generated columns from the column list

This is the commit that introduced the synthesis loop. It "solved" a problem
where a table in two publications (one with a column list naming all columns,
one without a column list) was erroring with "cannot use different column lists"
because one returned NULL and the other returned an int2vector. The fix was
to synthesize a column list when none was specified, filtering out dropped
and generated columns, so that both publications would produce identical
int2vectors and DISTINCT would collapse them to one row.

But this synthesis made it impossible to distinguish "all columns" from
"explicit list of all columns" in the view. And the relnatts heuristic from
fd0b9dcebda - which was supposed to reverse the synthesis - was broken from
the start because relnatts includes dropped columns while the synthesized list
excludes them.

The synthesis in b7ae0395369 tried to make "no column list" and "explicit list
of all columns" look identical. But they have genuinely different
semantics:

- No column list (NULL): all current and future columns are replicated.
ALTER TABLE ADD COLUMN automatically replicates the new column.
- Explicit full list: only the named columns are replicated. New columns
are NOT replicated until the publication is explicitly altered.

By making them indistinguishable, the synthesis hid a real conflict from users
who had a table in two publications with different column semantics on the
same subscription. I am proposing a fix that restores the distinction and correctly
(IMO) surfaces this conflict.

The fix is to stop synthesizing the full column list in pg_get_publication_tables().
When prattrs is NULL in the catalog, let attrs remain NULL in the SRF output.
Remove the buggy CASE WHEN heuristic in tablesync.c since it is no longer
needed.

There is one scenario where there is a change for users: one pub no list + one pub explicit all columns. Anyone with that specific configuration will see a new error on the

next tablesync or subscription refresh after upgrading. The fix for those users is to

either remove the explicit column list from the second publication (making both "all

columns"), or keep the difference and use separate subscriptions.

AFAICT this is safe except for the change in behavior I describe above. psql and pg_dump query pg_publication_rel directly and not the affected view. New subscriber

talking to old publisher still works (old pub synthesizes a list, new sub handles it).

Old subscriber talking to new publisher also works (the old CASE WHEN evaluates

array_length(NULL, 1) which returns NULL, falling through to the ELSE branch

returning NULL).

Attached is a patch with the fix, including updates to documentation, 7 updated

regression tests where existing expected outputs where attnames changes from

{a} to NULL, and added new test cases that verify both cases are distinguishable in

the same query.

Roberto Mello

Snowflake

Вложения

v1-0001-Fix-pg_publication_tables-to-return-NULL-attnames.patch

Re: pg_publication_tables: return NULL attnames when no column list is specified

От

Roberto Mello

Дата:

26 марта, 03:48:34

On Wed, Mar 25, 2026 at 5:32 PM Peter Smith <smithpb2250@gmail.com> wrote:

FYI - the patch failed to apply for me because of publication.out issues.

There appear to be some missing trailing spaces in the result table headers

Apologies.

Generated a v2 patch. Applied cleanly and tests passed.

Roberto Mello

Snowflake

Вложения

v2-0001-Fix-pg_publication_tables-to-return-NULL-attnames.patch

Re: pg_publication_tables: return NULL attnames when no column list is specified

От

Álvaro Herrera

Дата:

26 марта, 15:37:19

On 2026-Mar-25, Roberto Mello wrote:

> The problem is that pg_get_publication_tables() in pg_publication.c
> (the set-returning function backing the pg_publication_tables view)
> erases this distinction.  When prattrs is NULL, it opens the table,
> iterates all eligible attributes, and builds a synthetic int2vector of
> all current columns. The view then shows the same attnames output for
> both cases.

I agree that this is wrong.  This distinction was explicitly discussed
when the column-list feature was developed.  I don't think we can
backpatch the fix though, out of fear that we would break something for
existing users; but we should definitely fix it for pg19.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"But static content is just dynamic content that isn't moving!"
                http://smylers.hates-software.com/2007/08/15/fe244d0c.html

Re: pg_publication_tables: return NULL attnames when no column list is specified

От

Ajin Cherian

Дата:

27 марта, 07:06:54

On Thu, Mar 26, 2026 at 11:49 AM Roberto Mello <roberto.mello@gmail.com> wrote:
>
> On Wed, Mar 25, 2026 at 5:32 PM Peter Smith <smithpb2250@gmail.com> wrote:
>>
>> FYI - the patch failed to apply for me because of publication.out issues.
>>
>> There appear to be some missing trailing spaces in the result table headers
>
>
> Apologies.
>
> Generated a v2 patch. Applied cleanly and tests passed.


Hello,

Tested the patch and the patch fixes the bug as described. My tests below:


postgres=# CREATE TABLE my_table (id int PRIMARY KEY, name text, status text);
CREATE TABLE
postgres=# ALTER TABLE my_table ADD COLUMN old_col text;
ALTER TABLE
postgres=# ALTER TABLE my_table DROP COLUMN old_col;
ALTER TABLE
postgres=# CREATE PUBLICATION pub_no_list FOR TABLE my_table;
CREATE PUBLICATION pub_explicit_all FOR TABLE my_table (id, name, status);
CREATE PUBLICATION
CREATE PUBLICATION

=========================
On head - without patch:
=========================

postgres=# SELECT pubname, attnames
FROM pg_publication_tables
WHERE tablename = 'my_table'
ORDER BY pubname;
     pubname      |     attnames
------------------+------------------
 pub_explicit_all | {id,name,status}
 pub_no_list      | {id,name,status}
(2 rows)

postgres=# SELECT p.pubname, pr.prattrs
FROM pg_publication p
LEFT JOIN pg_publication_rel pr
       ON pr.prpubid = p.oid
JOIN pg_class c ON c.oid = pr.prrelid
WHERE c.relname = 'my_table'
ORDER BY p.pubname;
     pubname      | prattrs
------------------+---------
 pub_explicit_all | 1 2 3
 pub_no_list      |
(2 rows)

postgres=# SELECT
    p.pubname,
    pr.prattrs                          AS raw_catalog_attrs,
    gpt.attnames                        AS view_attnames,
    c.relnatts,
    array_length(gpt.attnames, 1)       AS synth_len,
    -- Replicating the buggy CASE WHEN heuristic from tablesync.c
    CASE WHEN array_length(gpt.attnames, 1) = c.relnatts
         THEN NULL
         ELSE gpt.attnames
    END AS heuristic_result
FROM pg_publication_tables gpt
JOIN pg_class c      ON c.relname = gpt.tablename
JOIN pg_publication p ON p.pubname = gpt.pubname
LEFT JOIN pg_publication_rel pr
       ON pr.prrelid = c.oid
      AND pr.prpubid = p.oid
WHERE gpt.tablename = 'my_table'
ORDER BY p.pubname;
     pubname      | raw_catalog_attrs |  view_attnames   | relnatts |
synth_len | heuristic_result
------------------+-------------------+------------------+----------+-----------+------------------
 pub_explicit_all | 1 2 3             | {id,name,status} |        4 |
       3 | {id,name,status}
 pub_no_list      |                   | {id,name,status} |        4 |
       3 | {id,name,status}
(2 rows)

---------------------------------------------------------------------

============
After patch:
============

postgres=# SELECT pubname, attnames
FROM pg_publication_tables
WHERE tablename = 'my_table'
ORDER BY pubname;
     pubname      |     attnames
------------------+------------------
 pub_explicit_all | {id,name,status}
 pub_no_list      |
(2 rows)

postgres=# SELECT p.pubname, pr.prattrs
FROM pg_publication p
LEFT JOIN pg_publication_rel pr ON pr.prpubid = p.oid
JOIN pg_class c ON c.oid = pr.prrelid
WHERE c.relname = 'my_table'
ORDER BY p.pubname;
     pubname      | prattrs
------------------+---------
 pub_explicit_all | 1 2 3
 pub_no_list      |
(2 rows)

postgres=# SELECT
    p.pubname,
    pr.prattrs                          AS raw_catalog_attrs,
    gpt.attnames                        AS view_attnames,
    c.relnatts,
    array_length(gpt.attnames, 1)       AS synth_len,
    -- Replicating the buggy CASE WHEN heuristic from tablesync.c
    CASE WHEN array_length(gpt.attnames, 1) = c.relnatts
         THEN NULL
         ELSE gpt.attnames
    END AS heuristic_result
FROM pg_publication_tables gpt
JOIN pg_class c      ON c.relname = gpt.tablename
JOIN pg_publication p ON p.pubname = gpt.pubname
LEFT JOIN pg_publication_rel pr
       ON pr.prrelid = c.oid
      AND pr.prpubid = p.oid
WHERE gpt.tablename = 'my_table'
ORDER BY p.pubname;
     pubname      | raw_catalog_attrs |  view_attnames   | relnatts |
synth_len | heuristic_result
------------------+-------------------+------------------+----------+-----------+------------------
 pub_explicit_all | 1 2 3             | {id,name,status} |        4 |
       3 | {id,name,status}
 pub_no_list      |                   |                  |        4 |
         |
(2 rows)


One observation from reviewing the patch: the test suite covers the
partial column list and dropped column cases well, but is missing a
test for the two-publication conflict scenario (one pub with no list +
one pub with an explicit list of all columns on the same table). That
is the breaking change called out in the original report and probably
deserves its own regression test with a comment explaining the
expected behavior change for users in that configuration.
Patch LGTM otherwise.

regards,
Ajin Cherian
Fujitsu Australia

Re: pg_publication_tables: return NULL attnames when no column list is specified

От

Roberto Mello

Дата:

27 марта, 20:27:16

On Thu, Mar 26, 2026 at 10:07 PM Ajin Cherian <itsajin@gmail.com> wrote:

Tested the patch and the patch fixes the bug as described. My tests below:

Thank you for the review.

<snip>

One observation from reviewing the patch: the test suite covers the
partial column list and dropped column cases well, but is missing a
test for the two-publication conflict scenario (one pub with no list +
one pub with an explicit list of all columns on the same table). That
is the breaking change called out in the original report and probably
deserves its own regression test with a comment explaining the
expected behavior change for users in that configuration.
Patch LGTM otherwise.

Good point. I know I worked on that, so I must have accidentally dropped it from the patch.

Will prepare a revision and submit.

Thanks again.

Roberto Mello

Snowflake

P.S.: I forgot to acknowledge and thank Greg Mullane for the initial review on the patch, so doing that now.

Re: pg_publication_tables: return NULL attnames when no column list is specified

От

Roberto Mello

Дата:

27 марта, 22:53:45

On Fri, Mar 27, 2026 at 11:27 AM Roberto Mello <roberto.mello@gmail.com> wrote:

On Thu, Mar 26, 2026 at 10:07 PM Ajin Cherian <itsajin@gmail.com> wrote:

One observation from reviewing the patch: the test suite covers the
partial column list and dropped column cases well, but is missing a
test for the two-publication conflict scenario (one pub with no list +
one pub with an explicit list of all columns on the same table). That
is the breaking change called out in the original report and probably
deserves its own regression test with a comment explaining the
expected behavior change for users in that configuration.
Patch LGTM otherwise.

Good point. I know I worked on that, so I must have accidentally dropped it from the patch.

Will prepare a revision and submit.

Here's v3 of the patch with the missing test.

Roberto Mello

Snowflake

Вложения

v3-0001-Fix-pg_publication_tables-to-return-NULL-attnames.patch

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: pg_publication_tables: return NULL attnames when no column list is specified

pg_publication_tables: return NULL attnames when no column list is specified

Вложения

Re: pg_publication_tables: return NULL attnames when no column list is specified

Вложения

Re: pg_publication_tables: return NULL attnames when no column list is specified

Re: pg_publication_tables: return NULL attnames when no column list is specified

Re: pg_publication_tables: return NULL attnames when no column list is specified

Re: pg_publication_tables: return NULL attnames when no column list is specified

Вложения