Обсуждение: [Bug]Vacuum full silently NULL out fast default columns
Hi Hackers,
VACUUM FULL silently turns columns added via ALTER TABLE ... ADD COLUMN ... DEFAULT <const> into NULL
on all pre-existing rows. The issue exists for other operations like CLUSTER, REPACK.
Repro:
CREATE TABLE t (id int PRIMARY KEY);
INSERT INTO t SELECT generate_series(1,3);
ALTER TABLE t ADD COLUMN x int DEFAULT 42;
SELECT * FROM t; -- (1,42),(2,42),(3,42)
VACUUM FULL t;
SELECT * FROM t; -- (1,NULL),(2,NULL),(3,NULL)
If the column is NOT NULL, the value becomes the type's zero value
instead of NULL, silently bypassing both NOT NULL and any CHECK
constraint declared on it.
Root Cause: fast path in reform_tuple() in heapam_handler.c returns a copy
of the source tuple when no dropped columns need fixing up. The check
doesn't account for short tuples (HeapTupleHeaderGetNatts(t) <
relnatts) that rely on attmissingval to materialize the default. After
the rewrite, finish_heap_swap() calls RelationClearMissing(), clearing
the only source of those values, and the short tuples then read as
NULL.
Fix: force reform when the source tuple is shorter than the new tuple
descriptor.
Patch attached. Added a regression test in fast_default.sql covering
VACUUM FULL, CLUSTER, and REPACK on a table with fast-default columns
including a NOT NULL CHECK column.
Thanks,
Satya
on all pre-existing rows. The issue exists for other operations like CLUSTER, REPACK.
Repro:
CREATE TABLE t (id int PRIMARY KEY);
INSERT INTO t SELECT generate_series(1,3);
ALTER TABLE t ADD COLUMN x int DEFAULT 42;
SELECT * FROM t; -- (1,42),(2,42),(3,42)
VACUUM FULL t;
SELECT * FROM t; -- (1,NULL),(2,NULL),(3,NULL)
If the column is NOT NULL, the value becomes the type's zero value
instead of NULL, silently bypassing both NOT NULL and any CHECK
constraint declared on it.
Root Cause: fast path in reform_tuple() in heapam_handler.c returns a copy
of the source tuple when no dropped columns need fixing up. The check
doesn't account for short tuples (HeapTupleHeaderGetNatts(t) <
relnatts) that rely on attmissingval to materialize the default. After
the rewrite, finish_heap_swap() calls RelationClearMissing(), clearing
the only source of those values, and the short tuples then read as
NULL.
Fix: force reform when the source tuple is shorter than the new tuple
descriptor.
Patch attached. Added a regression test in fast_default.sql covering
VACUUM FULL, CLUSTER, and REPACK on a table with fast-default columns
including a NOT NULL CHECK column.
Thanks,
Satya
Вложения
SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> writes:
> VACUUM FULL silently turns columns added via ALTER TABLE ... ADD COLUMN ...
> DEFAULT <const> into NULL
> on all pre-existing rows. The issue exists for other operations like
> CLUSTER, REPACK.
That is a seriously awful bug. Fortunately it is not in any shipping
release. A quick bisect run agrees that it broke here:
28d534e2ae0ac888b5460f977a10cd9bb017ef98 is the first bad commit
commit 28d534e2ae0ac888b5460f977a10cd9bb017ef98 (HEAD)
Author: Álvaro Herrera <alvherre@kurilemu.de>
Date: Mon Apr 6 21:55:08 2026 +0200
Add CONCURRENTLY option to REPACK
> Patch attached. Added a regression test in fast_default.sql covering
> VACUUM FULL, CLUSTER, and REPACK on a table with fast-default columns
> including a NOT NULL CHECK column.
I don't know if this is the best code fix (I don't like putting extra
checks into a loop condition like this). But I agree we need some
more tests covering this area.
regards, tom lane
Hi,
On Mon, May 4, 2026 at 6:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> writes:
> VACUUM FULL silently turns columns added via ALTER TABLE ... ADD COLUMN ...
> DEFAULT <const> into NULL
> on all pre-existing rows. The issue exists for other operations like
> CLUSTER, REPACK.
That is a seriously awful bug. Fortunately it is not in any shipping
release. A quick bisect run agrees that it broke here:
28d534e2ae0ac888b5460f977a10cd9bb017ef98 is the first bad commit
commit 28d534e2ae0ac888b5460f977a10cd9bb017ef98 (HEAD)
Author: Álvaro Herrera <alvherre@kurilemu.de>
Date: Mon Apr 6 21:55:08 2026 +0200
Add CONCURRENTLY option to REPACK
> Patch attached. Added a regression test in fast_default.sql covering
> VACUUM FULL, CLUSTER, and REPACK on a table with fast-default columns
> including a NOT NULL CHECK column.
I don't know if this is the best code fix (I don't like putting extra
checks into a loop condition like this). But I agree we need some
more tests covering this area.
Thanks for reviewing! Attached v2 patch. Agreed, tried to optimize LOC in V1. Before the change
loop was not breaking early, I fixed that as well in V2.
Thanks,
Satya
Вложения
On 2026-May-04, SATYANARAYANA NARLAPURAM wrote: > On Mon, May 4, 2026 at 6:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > A quick bisect run agrees that it broke here: > > > > 28d534e2ae0ac888b5460f977a10cd9bb017ef98 is the first bad commit > > commit 28d534e2ae0ac888b5460f977a10cd9bb017ef98 (HEAD) > > Author: Álvaro Herrera <alvherre@kurilemu.de> > > Date: Mon Apr 6 21:55:08 2026 +0200 > > > > Add CONCURRENTLY option to REPACK Right. > Thanks for reviewing! Attached v2 patch. Agreed, tried to optimize LOC in > V1. Before the change loop was not breaking early, I fixed that as > well in V2. Yeah, this seems a good approach to me. I propose some more comment updates though, and I also thought it'd be a good idea to add a test for REPACK CONCURRENTLY while at it. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
Вложения
Hi,
On Mon, May 4, 2026 at 10:41 AM Álvaro Herrera <alvherre@kurilemu.de> wrote:
On 2026-May-04, SATYANARAYANA NARLAPURAM wrote:
> On Mon, May 4, 2026 at 6:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > A quick bisect run agrees that it broke here:
> >
> > 28d534e2ae0ac888b5460f977a10cd9bb017ef98 is the first bad commit
> > commit 28d534e2ae0ac888b5460f977a10cd9bb017ef98 (HEAD)
> > Author: Álvaro Herrera <alvherre@kurilemu.de>
> > Date: Mon Apr 6 21:55:08 2026 +0200
> >
> > Add CONCURRENTLY option to REPACK
Right.
> Thanks for reviewing! Attached v2 patch. Agreed, tried to optimize LOC in
> V1. Before the change loop was not breaking early, I fixed that as
> well in V2.
Yeah, this seems a good approach to me. I propose some more comment
updates though, and I also thought it'd be a good idea to add a test for
REPACK CONCURRENTLY while at it.
This patch LGTM. It applied cleanly and all the tests passed.
Thanks,
Satya
On 2026-May-04, SATYANARAYANA NARLAPURAM wrote: > This patch LGTM. It applied cleanly and all the tests passed. OK, thanks, pushed. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "The problem with the future is that it keeps turning into the present" (Hobbes)