Обсуждение: bug, ALTER TABLE call ATPostAlterTypeCleanup twice for the same relation
hi.
bug demo:
drop table if exists gtest25;
CREATE TABLE gtest25 (a0 int, a int, b int GENERATED ALWAYS AS (a * 2
+ a0) STORED);
alter table gtest25 add constraint cc check (b > 0);
alter table gtest25 alter column b set data type int8, ALTER COLUMN b
SET EXPRESSION AS (a * 3 + a0);
ERROR: cache lookup failed for constraint 18432
ALTER COLUMN SET EXPRESSION only in 17, so it's a 17 and up related bug.
The reason is ATRewriteCatalogs->ATPostAlterTypeCleanup is called
twice for the same relation.
the second time you call it, the constraint cc is already dropped,
then the "cache lookup failed" error will happen.
While at it, maybe we can also polish the comment below in ATRewriteCatalogs.
/*
* After the ALTER TYPE or SET EXPRESSION pass, do cleanup work
* (this is not done in ATExecAlterColumnType since it should be
* done only once if multiple columns of a table are altered).
*/
but I didn't do it...
Вложения
On Sat, Sep 27, 2025 at 5:54 PM jian he <jian.universality@gmail.com> wrote: > > While at it, maybe we can also polish the comment below in ATRewriteCatalogs. > /* > * After the ALTER TYPE or SET EXPRESSION pass, do cleanup work > * (this is not done in ATExecAlterColumnType since it should be > * done only once if multiple columns of a table are altered). > */ > but I didn't do it... I add one sentence: Note: ATPostAlterTypeCleanup must be called only once per relation! I also polished the regress tests.
Вложения
On Sat, 27 Sept 2025 at 21:54, jian he <jian.universality@gmail.com> wrote:
> if (pass == AT_PASS_ALTER_TYPE || pass == AT_PASS_SET_EXPRESSION)
> - ATPostAlterTypeCleanup(wqueue, tab, lockmode);
> + {
> + if (!list_member_oid(relids, tab->relid))
> + {
> + ATPostAlterTypeCleanup(wqueue, tab, lockmode);
> + relids = lappend_oid(relids, tab->relid);
> + }
> + }
I've not studied this long enough to know what the correct fix should
be, but I have studied it long enough to know what you're proposing
isn't it.
You can't just forego the subsequent call to ATPostAlterTypeCleanup()
because you've done it during the AT_PASS_ALTER_TYPE pass as that
doesn't account for the fact that there might be more work to do (i.e
more constraints added since AT_PASS_ALTER_TYPE) in the
AT_PASS_SET_EXPRESSION pass.
With your patch applied, if I adapt your example case to alter the
type of an unrelated column so that the constraint related to that
column is adjusted first and the constraint related to the expression
column is only added to the changedConstraintOids list after the first
call to ATPostAlterTypeCleanup(), you'll see that "cc" isn't recreated
because your code thinks nothing further needs to be done:
drop table if exists gtest25;
CREATE TABLE gtest25 (a0 int, z int, a int, b int GENERATED ALWAYS AS
(a * 2 + a0) STORED);
alter table gtest25 add constraint check_z check (z > 0);
alter table gtest25 add constraint cc check (b > 0);
select oid, conname,conbin from pg_constraint where
conrelid='gtest25'::regclass;
alter table gtest25 alter column z set data type numeric, ALTER
COLUMN b SET EXPRESSION AS (z + 0);
select oid, conname,conbin from pg_constraint where
conrelid='gtest25'::regclass;
and that's certainly wrong as if I separate the ALTER TABLE into two
separate commands, then "cc" does get recreated.
I do wonder what the exact reason was that AT_PASS_ALTER_TYPE and
AT_PASS_SET_EXPRESSION are separate passes. I'd have expected that's
maybe so that it's possible to alter the type of a column used within
a generated column, but looking at the following error message makes
me think I might not be correct in that thinking:
create table test1 (a int, b int generated always as (a + 1) stored);
alter table test1 alter column a set data type bigint;
ERROR: cannot alter type of a column used by a generated column
DETAIL: Column "a" is used by generated column "b".
There's probably some good explanation for the separate pass, but I'm
not sure of it for now. I'm including in the author and committer of
the patch which added this code to see if we can figure this out.
David
On Sep 29, 2025, at 05:36, David Rowley <dgrowleyml@gmail.com> wrote:On Sat, 27 Sept 2025 at 21:54, jian he <jian.universality@gmail.com> wrote:if (pass == AT_PASS_ALTER_TYPE || pass == AT_PASS_SET_EXPRESSION)
- ATPostAlterTypeCleanup(wqueue, tab, lockmode);
+ {
+ if (!list_member_oid(relids, tab->relid))
+ {
+ ATPostAlterTypeCleanup(wqueue, tab, lockmode);
+ relids = lappend_oid(relids, tab->relid);
+ }
+ }
drop table if exists gtest25;
CREATE TABLE gtest25 (a0 int, z int, a int, b int GENERATED ALWAYS AS
(a * 2 + a0) STORED);
alter table gtest25 add constraint check_z check (z > 0);
alter table gtest25 add constraint cc check (b > 0);
select oid, conname,conbin from pg_constraint where
conrelid='gtest25'::regclass;
alter table gtest25 alter column z set data type numeric, ALTER
COLUMN b SET EXPRESSION AS (z + 0);
select oid, conname,conbin from pg_constraint where
conrelid='gtest25'::regclass;
and that's certainly wrong as if I separate the ALTER TABLE into two
separate commands, then "cc" does get recreated.
I do wonder what the exact reason was that AT_PASS_ALTER_TYPE and
AT_PASS_SET_EXPRESSION are separate passes. I'd have expected that's
maybe so that it's possible to alter the type of a column used within
a generated column, but looking at the following error message makes
me think I might not be correct in that thinking:
create table test1 (a int, b int generated always as (a + 1) stored);
alter table test1 alter column a set data type bigint;
ERROR: cannot alter type of a column used by a generated column
DETAIL: Column "a" is used by generated column "b".
There's probably some good explanation for the separate pass, but I'm
not sure of it for now. I'm including in the author and committer of
the patch which added this code to see if we can figure this out.
Based on the comment of ATPostAlterTypeCleanup(), it says “cleanup after we’ve finished all the ALTER TYPE or SET EXPRESSION operations for a particular relation”, I tried this dirty fix:
```
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index fc89352b661..5c9c6d2a7db 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -5296,6 +5296,8 @@ ATRewriteCatalogs(List **wqueue, LOCKMODE lockmode,
AlterTableUtilityContext *context)
{
ListCell *ltab;
+ bool needAlterTypeCleanup = false;
+ AlteredTableInfo *tabToCleanup = NULL;
/*
* We process all the tables "in parallel", one pass at a time. This is
@@ -5313,6 +5315,13 @@ ATRewriteCatalogs(List **wqueue, LOCKMODE lockmode,
List *subcmds = tab->subcmds[pass];
ListCell *lcmd;
+ if (pass == AT_PASS_OLD_INDEX && needAlterTypeCleanup)
+ {
+ ATPostAlterTypeCleanup(wqueue, tabToCleanup, lockmode);
+ needAlterTypeCleanup = false;
+ tabToCleanup = NULL;
+ }
+
if (subcmds == NIL)
continue;
@@ -5334,7 +5343,10 @@ ATRewriteCatalogs(List **wqueue, LOCKMODE lockmode,
* done only once if multiple columns of a table are altered).
*/
if (pass == AT_PASS_ALTER_TYPE || pass == AT_PASS_SET_EXPRESSION)
- ATPostAlterTypeCleanup(wqueue, tab, lockmode);
+ {
+ needAlterTypeCleanup = true;
+ tabToCleanup = tab;
+ }
if (tab->rel)
{
```
It postpones the cleanup to after all “alter type” and “set expression”. With this fix, David’s test case also passes:
```
evantest=# select oid, conname,conbin from pg_constraint where
evantest-# conrelid='gtest25'::regclass;
oid | conname | conbin
-------+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
32778 | check_z | {OPEXPR :opno 521 :opfuncid 147 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 2 :vartype 23 :vartypmod -1 :varcollid 0 :varnullingrels (b) :varlevelsup 0 :varreturningtype 0 :varnosyn 1 :varattnosyn 2 :location -1} {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4 :constbyval true :constisnull false :location -1 :constvalue 4 [ 0 0 0 0 0 0 0 0 ]}) :location -1}
32779 | cc | {OPEXPR :opno 521 :opfuncid 147 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 4 :vartype 23 :vartypmod -1 :varcollid 0 :varnullingrels (b) :varlevelsup 0 :varreturningtype 0 :varnosyn 1 :varattnosyn 4 :location -1} {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4 :constbyval true :constisnull false :location -1 :constvalue 4 [ 0 0 0 0 0 0 0 0 ]}) :location -1}
(2 rows)
evantest=# alter table gtest25 alter column z set data type numeric, ALTER
evantest-# COLUMN b SET EXPRESSION AS (z + 0);
ALTER TABLE
evantest=# select oid, conname,conbin from pg_constraint where
evantest-# conrelid='gtest25'::regclass;
oid | conname | conbin
-------+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
32781 | check_z | {OPEXPR :opno 1756 :opfuncid 1720 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 2 :vartype 1700 :vartypmod -1 :varcollid 0 :varnullingrels (b) :varlevelsup 0 :varreturningtype 0 :varnosyn 1 :varattnosyn 2 :location -1} {FUNCEXPR :funcid 1740 :funcresulttype 1700 :funcretset false :funcvariadic false :funcformat 2 :funccollid 0 :inputcollid 0 :args ({CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4 :constbyval true :constisnull false :location -1 :constvalue 4 [ 0 0 0 0 0 0 0 0 ]}) :location -1}) :location -1}
32782 | cc | {OPEXPR :opno 521 :opfuncid 147 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 0 :args ({VAR :varno 1 :varattno 4 :vartype 23 :vartypmod -1 :varcollid 0 :varnullingrels (b) :varlevelsup 0 :varreturningtype 0 :varnosyn 1 :varattnosyn 4 :location -1} {CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4 :constbyval true :constisnull false :location -1 :constvalue 4 [ 0 0 0 0 0 0 0 0 ]}) :location -1}
(2 rows)
evantest=#
```
We can see both check_z and check_cc got rebuilt.
This fix uses the last “tab” with the cleanup function. In that way, TBH, I haven’t dig deeper enough to see if anything is missed in cleanup.
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
HighGo Software Co., Ltd.
https://www.highgo.com/
On Mon, Sep 29, 2025 at 5:36 AM David Rowley <dgrowleyml@gmail.com> wrote: > > I do wonder what the exact reason was that AT_PASS_ALTER_TYPE and > AT_PASS_SET_EXPRESSION are separate passes. I'd have expected that's > maybe so that it's possible to alter the type of a column used within > a generated column, but looking at the following error message makes > me think I might not be correct in that thinking: > > create table test1 (a int, b int generated always as (a + 1) stored); > alter table test1 alter column a set data type bigint; > ERROR: cannot alter type of a column used by a generated column > DETAIL: Column "a" is used by generated column "b". > > There's probably some good explanation for the separate pass, but I'm > not sure of it for now. I'm including in the author and committer of > the patch which added this code to see if we can figure this out. > I found the explanation link: https://www.postgresql.org/message-id/CAAJ_b96TYsRrqm%2BveXCBb6CJuHJh0opmAvnhw8iMvhCMFKO-Tg%40mail.gmail.com AT_PASS_SET_EXPRESSION should come after AT_PASS_ADD_COL, otherwise cases like below would fail. create table t1 (a int, b int generated always as (a + 1) stored); alter table t1 add column c int, alter column b set expression as (a + c); >AT_PASS_ALTER_TYPE and AT_PASS_ADD_COL cannot be together, the ALTER TYPE > cannot see that column. I guess this is the reason why we have two seperate PASS. please also check the attached patch. The idea is that if both generation expression and type are being changed, only call ATPostAlterTypeCleanup while the current pass is AT_PASS_SET_EXPRESSION.
Вложения
Hi Jian,
On Mon, Sep 29, 2025 at 3:43 PM jian he <jian.universality@gmail.com> wrote:
please also check the attached patch.
The idea is that if both generation expression and type are being changed,
only call ATPostAlterTypeCleanup while the current pass is
AT_PASS_SET_EXPRESSION.
I think your implementation is similar to my previous dirty fix, the main idea is to postpone the cleanup to after AT_PASS_SET_EXPRESSION.
But I feel we don't need an extra loop to find tabs that have both ALTER TYPE and SET EXPRESSION, and do the if-else check, the logic is a bit difficult to understand.
I am attaching my version and please see if you like it. My version just records tabs to cleanup in an array, then runs the cleanup after the AT_PASS_SET_EXPRESSION pass.
A nit comment is:
```
@@ -15578,6 +15601,8 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
free_object_addresses(objects);
+
+
free_object_addresses(objects);
+
+
```
Why add two newlines here? Seems a typo.
Вложения
On Mon, Sep 29, 2025 at 5:21 PM Chao Li <li.evan.chao@gmail.com> wrote:
>
> Hi Jian,
>
> On Mon, Sep 29, 2025 at 3:43 PM jian he <jian.universality@gmail.com> wrote:
>>
>>
>> please also check the attached patch.
>> The idea is that if both generation expression and type are being changed,
>> only call ATPostAlterTypeCleanup while the current pass is
>> AT_PASS_SET_EXPRESSION.
>
>
> I think your implementation is similar to my previous dirty fix, the main idea is to postpone the cleanup to after
AT_PASS_SET_EXPRESSION.
>
> But I feel we don't need an extra loop to find tabs that have both ALTER TYPE and SET EXPRESSION, and do the if-else
check,the logic is a bit difficult to understand.
>
> I am attaching my version and please see if you like it. My version just records tabs to cleanup in an array, then
runsthe cleanup after the AT_PASS_SET_EXPRESSION pass.
>
hi.
we can simply change from
if (pass == AT_PASS_ALTER_TYPE || pass == AT_PASS_SET_EXPRESSION)
ATPostAlterTypeCleanup(wqueue, tab, lockmode);
to
if (pass == AT_PASS_SET_EXPRESSION)
ATPostAlterTypeCleanup(wqueue, tab, lockmode);
else if (pass == AT_PASS_ALTER_TYPE &&
tab->subcmds[AT_PASS_SET_EXPRESSION] == NIL)
ATPostAlterTypeCleanup(wqueue, tab, lockmode);
On Wed, Oct 1, 2025 at 11:04 AM jian he <jian.universality@gmail.com> wrote:
>
> we can simply change from
>
> if (pass == AT_PASS_ALTER_TYPE || pass == AT_PASS_SET_EXPRESSION)
> ATPostAlterTypeCleanup(wqueue, tab, lockmode);
>
> to
> if (pass == AT_PASS_SET_EXPRESSION)
> ATPostAlterTypeCleanup(wqueue, tab, lockmode);
> else if (pass == AT_PASS_ALTER_TYPE &&
> tab->subcmds[AT_PASS_SET_EXPRESSION] == NIL)
> ATPostAlterTypeCleanup(wqueue, tab, lockmode);
That will not work if AT_PASS_ALTER_TYPE triggers the creation of a new
AT_PASS_SET_EXPRESSION entry.
For example, consider:
CREATE TABLE gtest25 (a int, b int GENERATED ALWAYS AS (a * 2) STORED);
If we alter the type of column "a", the column b’s generation
expression would need
to be rebuilt (which is currently not supported). In that case, the current
logic would not be able to handle it.
I think the correct fix is within ATPostAlterTypeCleanup.
at the end of ATPostAlterTypeCleanup, we already delete these changed
objects via
``performMultipleDeletions(objects, DROP_RESTRICT, PERFORM_DELETION_INTERNAL);``
we can just list_free these (the below) objects too:
tab->changedConstraintOids
tab->changedConstraintDefs
tab->changedIndexOids
tab->changedIndexDefs
tab->changedStatisticsOids
tab->changedStatisticsDefs
since this information would become stale if those objects have
already been dropped.
Вложения
> On Oct 13, 2025, at 16:08, jian he <jian.universality@gmail.com> wrote: > > On Wed, Oct 1, 2025 at 11:04 AM jian he <jian.universality@gmail.com> wrote: >> >> we can simply change from >> >> if (pass == AT_PASS_ALTER_TYPE || pass == AT_PASS_SET_EXPRESSION) >> ATPostAlterTypeCleanup(wqueue, tab, lockmode); >> >> to >> if (pass == AT_PASS_SET_EXPRESSION) >> ATPostAlterTypeCleanup(wqueue, tab, lockmode); >> else if (pass == AT_PASS_ALTER_TYPE && >> tab->subcmds[AT_PASS_SET_EXPRESSION] == NIL) >> ATPostAlterTypeCleanup(wqueue, tab, lockmode); > > That will not work if AT_PASS_ALTER_TYPE triggers the creation of a new > AT_PASS_SET_EXPRESSION entry. > For example, consider: > CREATE TABLE gtest25 (a int, b int GENERATED ALWAYS AS (a * 2) STORED); > > If we alter the type of column "a", the column b’s generation > expression would need > to be rebuilt (which is currently not supported). In that case, the current > logic would not be able to handle it. > > I think the correct fix is within ATPostAlterTypeCleanup. > at the end of ATPostAlterTypeCleanup, we already delete these changed > objects via > ``performMultipleDeletions(objects, DROP_RESTRICT, PERFORM_DELETION_INTERNAL);`` > > we can just list_free these (the below) objects too: > tab->changedConstraintOids > tab->changedConstraintDefs > tab->changedIndexOids > tab->changedIndexDefs > tab->changedStatisticsOids > tab->changedStatisticsDefs > > since this information would become stale if those objects have > already been dropped. > <v5-0001-avoid-call-ATPostAlterTypeCleanup-twice.patch> I think the correct solution should be my proposed v4 that has a fundamental difference from your current change is that: * The current implementation as well as your current change, they do ATPostAlterTypeCleanup() in either ALTER_TYPE pass orSET_EXPRESSION pass. Say a table has columns a and b, and a constraint (a+b>5), then I alter both columns types in a singlecommand, then it will alter a’s type, then rebuild the constant, and alter b’s type, then rebuild the constraint again.For the first rebuild, because c’s type has not been updated, the rebuild may potentially fail. * My v4 defers ATPostAlterTypeCleanup() to a later pass, which allows all "set data type” and “set expression” to finish,then start to rebuild constraints. My v4 does the cleanup in next pass of SET_EXPRESSION, which might not be the bestsolution, instead, it’s just a quick PoC to demo my idea. Perhaps, we may add a new pass. But to handle the complicated case that I described above, v4 is still not enough. All “changedXXXXX” stuffs should be movedto an upper level of tab. Still using the same example, the constrain depends on both columns an and b, so we shouldremember the constraint only once. WDYT? Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
> On Oct 13, 2025, at 16:08, jian he <jian.universality@gmail.com> wrote: > > On Wed, Oct 1, 2025 at 11:04 AM jian he <jian.universality@gmail.com> wrote: >> >> we can simply change from >> >> if (pass == AT_PASS_ALTER_TYPE || pass == AT_PASS_SET_EXPRESSION) >> ATPostAlterTypeCleanup(wqueue, tab, lockmode); >> >> to >> if (pass == AT_PASS_SET_EXPRESSION) >> ATPostAlterTypeCleanup(wqueue, tab, lockmode); >> else if (pass == AT_PASS_ALTER_TYPE && >> tab->subcmds[AT_PASS_SET_EXPRESSION] == NIL) >> ATPostAlterTypeCleanup(wqueue, tab, lockmode); > > That will not work if AT_PASS_ALTER_TYPE triggers the creation of a new > AT_PASS_SET_EXPRESSION entry. > For example, consider: > CREATE TABLE gtest25 (a int, b int GENERATED ALWAYS AS (a * 2) STORED); > > If we alter the type of column "a", the column b’s generation > expression would need > to be rebuilt (which is currently not supported). In that case, the current > logic would not be able to handle it. > > I think the correct fix is within ATPostAlterTypeCleanup. > at the end of ATPostAlterTypeCleanup, we already delete these changed > objects via > ``performMultipleDeletions(objects, DROP_RESTRICT, PERFORM_DELETION_INTERNAL);`` > > we can just list_free these (the below) objects too: > tab->changedConstraintOids > tab->changedConstraintDefs > tab->changedIndexOids > tab->changedIndexDefs > tab->changedStatisticsOids > tab->changedStatisticsDefs > > since this information would become stale if those objects have > already been dropped. > <v5-0001-avoid-call-ATPostAlterTypeCleanup-twice.patch> Now I fully understand what your concern is, so that I understand your solution also. So, I take back my previous reply andre-reply. Your concern comes from an assumed solution to support altering a column’s type when it has a dependent generated column.Your assumed solution is like: 1. SQL: ALTER TABLE gtest25 ALTER COLUMN a TYPE bigint; # generated column depends on a 2. Before alter column a’s type, b’s expression is remembered 3. After alter column a’s type, ATPostAlterTypeClean() is called as there is no SET EXPRESSION. ATPostAlterTypeClean() willadd a SET EXPRESSION subcmd to rebuild b 4. In SET_EXPRESSION PASS, b is recreated 5. ATPostAlterTypeClean() is called again, if we don’t cleanup those tab->changedXXX lists, then remembered objects willbe rebuilt again, which may lead to errors. What’s why your solution of cleaning up table->changedXXX works. In my opinion, in step 2, we don’t have follow the same mechanism to remember generated expressions as constraints. We candirectly check if b has SET EXPRESSION cmd, if yes, we don’t need to do anything; otherwise, we get the b’s expressionand inject a SET EXPRESSION subcmd for b. In this case, your worried problem will not exist. With this approach,ATPostAlterTypeClean() will always be called only once, so that execution path is simpler. But anyway, we haven’t decided to support that yet. So for the current bug, your solution of: >> if (pass == AT_PASS_SET_EXPRESSION) >> ATPostAlterTypeCleanup(wqueue, tab, lockmode); >> else if (pass == AT_PASS_ALTER_TYPE && >> tab->subcmds[AT_PASS_SET_EXPRESSION] == NIL) >> ATPostAlterTypeCleanup(wqueue, tab, lockmode); should work. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/