Обсуждение: further clarification: alter table alter column set not null - table scan is skipped
further clarification: alter table alter column set not null - table scan is skipped
От
PG Doc comments form
Дата:
The following documentation comment has been logged on the website: Page: https://www.postgresql.org/docs/17/docguide.html Description: Hello there, I like PostgreSQL a lot so this is my way of giving back. The "table scan is skipped" optimization can use some clarification https://www.postgresql.org/docs/current/sql-altertable.html#SQL-ALTERTABLE-DESC-SET-DROP-NOT-NULL My proposal is "then the table scan is skipped if the alter statement doesn't drop the constraint." The reason behind the proposal is documented here https://dev.to/andrewpsy/the-set-not-null-downtime-trap-in-postgresql-1o71 Thank you for working on such an awesome project. Cheers, Andrew
Re: further clarification: alter table alter column set not null - table scan is skipped
От
"David G. Johnston"
Дата:
On Wed, Jul 30, 2025, 13:55 PG Doc comments form <noreply@postgresql.org> wrote:
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/17/docguide.html
Description:
Hello there,
I like PostgreSQL a lot so this is my way of giving back.
The "table scan is skipped" optimization can use some clarification
https://www.postgresql.org/docs/current/sql-altertable.html#SQL-ALTERTABLE-DESC-SET-DROP-NOT-NULL
My proposal is "then the table scan is skipped if the alter statement
doesn't drop the constraint."
The reason behind the proposal is documented here
https://dev.to/andrewpsy/the-set-not-null-downtime-trap-in-postgresql-1o71
Thank you for working on such an awesome project.
Cheers,
Andrew
I'm kinda hoping this is actually just a fixable bug...
Otherwise I'd probably go for a parenthetical:
(however, the constraint used must not be dropped in the same alter table command).
David J.
Re: further clarification: alter table alter column set not null - table scan is skipped
От
Álvaro Herrera
Дата:
On 2025-Jul-30, David G. Johnston wrote: > On Wed, Jul 30, 2025, 13:55 PG Doc comments form <noreply@postgresql.org> > wrote: > > The "table scan is skipped" optimization can use some clarification > > > > https://www.postgresql.org/docs/current/sql-altertable.html#SQL-ALTERTABLE-DESC-SET-DROP-NOT-NULL > > My proposal is "then the table scan is skipped if the alter statement > > doesn't drop the constraint." > I'm kinda hoping this is actually just a fixable bug... I don't think so -- it's just the way ALTER TABLE is designed to work. We don't promise that the subcommands are going to be executed in the order that they are given, and thus this sort of thing can happen. I suspect a mechanism that would throw an error at trying to drop the constraint would be too complicated / brittle / laborious to write. It's possible that there are other combinations that are similarly affected, but I suspect the majority of them would just give an error rather than silently wasting a lot of time; so I agree that this subcommand specifically could use a small note. While writing it I realized we failed to note that the addition of NOT VALID changes behavior. So, how about like this: <para> <literal>SET NOT NULL</literal> may only be applied to a column provided none of the records in the table contain a <literal>NULL</literal> value for the column. Ordinarily this is checked during the <literal>ALTER TABLE</literal> by scanning the - entire table; however, if a valid <literal>CHECK</literal> constraint is - found which proves no <literal>NULL</literal> can exist, then the - table scan is skipped. + entire table, unless <literal>NOT VALID</literal> is specified; + however, if a valid <literal>CHECK</literal> constraint is + found which proves no <literal>NULL</literal> can exist (and is not + dropped in the same command), then the table scan is skipped. If a column has an invalid not-null constraint, <literal>SET NOT NULL</literal> validates it. </para> (This is correct for 18; for 17 and earlier, the mention of NOT VALID needs to be removed.) Of course, in 18 you'd rely on ADD NOT NULL NOT VALID instead of using a separate CHECK constraint. Not sure if this reads better: if a valid <literal>CHECK</literal> constraint is found (and is not dropped in the same command) which proves no <literal>NULL</literal> can exist, then -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Most hackers will be perfectly comfortable conceptualizing users as entropy sources, so let's move on." (Nathaniel Smith) https://mail.gnu.org/archive/html/monotone-devel/2007-01/msg00080.html
further clarification: alter table alter column set not null - table scan is skipped
От
"David G. Johnston"
Дата:
On Thursday, July 31, 2025, Álvaro Herrera <alvherre@kurilemu.de> wrote:
On 2025-Jul-30, David G. Johnston wrote:
> On Wed, Jul 30, 2025, 13:55 PG Doc comments form <noreply@postgresql.org>
> wrote:
> > The "table scan is skipped" optimization can use some clarification
> >
> > https://www.postgresql.org/docs/current/sql-altertable.html# SQL-ALTERTABLE-DESC-SET-DROP- NOT-NULL
> > My proposal is "then the table scan is skipped if the alter statement
> > doesn't drop the constraint."
> I'm kinda hoping this is actually just a fixable bug...
I don't think so -- it's just the way ALTER TABLE is designed to work.
We don't promise that the subcommands are going to be executed in the
order that they are given, and thus this sort of thing can happen.
I suspect a mechanism that would throw an error at trying to drop the
constraint would be too complicated / brittle / laborious to write.
I wouldn’t want an error. At the start of the command the constraint existed and its presence then would be enough. It is immaterial that it went away during the command. But it’s definitely not something that seems worth spending a non-trivial amount of effort on.
(This is correct for 18; for 17 and earlier, the mention of NOT VALID
needs to be removed.) Of course, in 18 you'd rely on ADD NOT NULL NOT
VALID instead of using a separate CHECK constraint.
Yeah, the main question here is whether we want to document for v17 and earlier what the article points out regarding locks.
Not sure if this reads better:
if a valid <literal>CHECK</literal> constraint is
found (and is not dropped in the same command) which
proves no <literal>NULL</literal> can exist, then
If a valid check constraint exists (and is not dropped in the same command) which proves the absence of NULLs, then
I do agree the parenthetical should appear closer to the word constraint.
David J.
I like your versions that emphasize: don’t drop the constraint in the same alter table set no null command.
Similar to David’s point, I spent some time trying to figure out a simple refactoring to carry the optimization all the way to the end but it might require executing “set not null” sooner which has a big impact. Another option is only implement a special treatment for this specific use case but it is a code smell to me. I believe a small clarification for the doc entry is the most efficient thing.
On Thursday, July 31, 2025, 09:01, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Thursday, July 31, 2025, Álvaro Herrera <alvherre@kurilemu.de> wrote:On 2025-Jul-30, David G. Johnston wrote:
> On Wed, Jul 30, 2025, 13:55 PG Doc comments form <noreply@postgresql.org>
> wrote:
> > The "table scan is skipped" optimization can use some clarification
> >
> > https://www.postgresql.org/doc s/current/sql-altertable.html# SQL-ALTERTABLE-DESC-SET-DROP- NOT-NULL
> > My proposal is "then the table scan is skipped if the alter statement
> > doesn't drop the constraint."
> I'm kinda hoping this is actually just a fixable bug...
I don't think so -- it's just the way ALTER TABLE is designed to work.
We don't promise that the subcommands are going to be executed in the
order that they are given, and thus this sort of thing can happen.
I suspect a mechanism that would throw an error at trying to drop the
constraint would be too complicated / brittle / laborious to write.I wouldn’t want an error. At the start of the command the constraint existed and its presence then would be enough. It is immaterial that it went away during the command. But it’s definitely not something that seems worth spending a non-trivial amount of effort on.
(This is correct for 18; for 17 and earlier, the mention of NOT VALID
needs to be removed.) Of course, in 18 you'd rely on ADD NOT NULL NOT
VALID instead of using a separate CHECK constraint.Yeah, the main question here is whether we want to document for v17 and earlier what the article points out regarding locks.
Not sure if this reads better:
if a valid <literal>CHECK</literal> constraint is
found (and is not dropped in the same command) which
proves no <literal>NULL</literal> can exist, thenIf a valid check constraint exists (and is not dropped in the same command) which proves the absence of NULLs, thenI do agree the parenthetical should appear closer to the word constraint.David J.
Re: further clarification: alter table alter column set not null - table scan is skipped
От
Álvaro Herrera
Дата:
On 2025-Jul-31, Shuyu Pan wrote: > I like your versions that emphasize: don’t drop the constraint in the > same alter table set no null command. Similar to David’s point, I > spent some time trying to figure out a simple refactoring to carry the > optimization all the way to the end but it might require executing > “set not null” sooner which has a big impact. Another option is only > implement a special treatment for this specific use case but it is a > code smell to me. Oh yeah, delaying the drop is much more likely to break other things. I was more thinking along the lines of maintaining a list of columns that are known non-null at the start of the command (a bitmapset actually). This could be computed in ALTER TABLE phase 1, and used later to determine that no scans are needed. But this is a lot of mechanism which is useless 99% of the time, and moreso now that you can directly add the NOT NULL constraints as NOT VALID to start with, which saves having to mess with a separate CHECK constraint. > I believe a small clarification for the doc entry is the most efficient thing. Okay, I've pushed the change to all branches using David Johnston's suggested wording. Thank you all! -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
Thanks a lot Álvaro for preparing the clarification so quickly. Looking forward to the release.
If we mark a column to skip table scan during drop (phase 0) and later skip the table scan when setting attributes (phase 7), we will not risk data corruption if the Access Exclusive lock is never released between phase 0 and 7.
On Monday, August 4, 2025, 04:32, Álvaro Herrera <alvherre@kurilemu.de> wrote:
On 2025-Jul-31, Shuyu Pan wrote:
> I like your versions that emphasize: don’t drop the constraint in the
> same alter table set no null command. Similar to David’s point, I
> spent some time trying to figure out a simple refactoring to carry the
> optimization all the way to the end but it might require executing
> “set not null” sooner which has a big impact. Another option is only
> implement a special treatment for this specific use case but it is a
> code smell to me.
Oh yeah, delaying the drop is much more likely to break other things. I
was more thinking along the lines of maintaining a list of columns that
are known non-null at the start of the command (a bitmapset actually).
This could be computed in ALTER TABLE phase 1, and used later to
determine that no scans are needed. But this is a lot of mechanism
which is useless 99% of the time, and moreso now that you can directly
add the NOT NULL constraints as NOT VALID to start with, which saves
having to mess with a separate CHECK constraint.
> I believe a small clarification for the doc entry is the most efficient thing.
Okay, I've pushed the change to all branches using David Johnston's
suggested wording.
Thank you all!
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/