Обсуждение: Re: using index to speedup add not null constraints to a table
Can you please rebase this? It stopped applying a week ago. Thanks! -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "Sallah, I said NO camels! That's FIVE camels; can't you count?" (Indiana Jones)
On Fri, Oct 17, 2025 at 3:57 AM Álvaro Herrera <alvherre@kurilemu.de> wrote: > Can you please rebase this? It stopped applying a week ago. > hi. rebase and minor polishment.
Вложения
Hello I rebased this patch to review it. Overall, I think this is a great idea. Here are some comments on the patch, which I hope are not anything major. I'm not really sure that I agree with the way ATRewriteTable works now. It seems that we scan the list of columns, and verify each one for nulls, but abort midway if a column is generated. Surely we should check for generated columns first, to avoid wasting time verifying earlier columns in case a later column is generated? That said, we do check for notnull_virtual_attrs to be NIL. It seems to me that this implies that the checked columns are not generated. In other words, the test for whether columns are generated is unnecessary and confusing, and in which case you don't have to change anything, just remove the "if (attr->attgenerated == ATTRIBUTE_GENERATED_VIRTUAL)" inner block. However, if we do this, then I think computing notnull_attrs is pointless. So we should only change the order: do this new check first, and if we find that any new not-null column is on a generated column, then we compute both notnull_virtual_attrs and notnull_attrs. No? The separation of labor between index_check_notnull() and index_check_notnull_internal() seems a bit pointless. Why not have a single routine that does both things? Though, to be honest, it does look like the former should live in tablecmds.c instead of execIndexing.c. (But if you do split things that way, then you need to make index_check_notnull_internal extern). The tests look a bit excessive. Why do we need an isolation test for this? And it looks to me like the other test could be in src/test/regress rather than be a TAP test. After all, that's what you have the ereport(DEBUG1) there, isn't it? "veritify" doesn't exist. The word is "verify". -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ Syntax error: function hell() needs an argument. Please choose what hell you want to involve.
Вложения
On 2026-Jan-13, Álvaro Herrera wrote: > However, if we do this, then I think computing notnull_attrs is > pointless. So we should only change the order: do this new check > first, and if we find that any new not-null column is on a generated > column, then we compute both notnull_virtual_attrs and notnull_attrs. > No? Oh, another thing we should do is have a first pass that verifies whether all columns have an appropriate index, without scanning any of them; only if we verify that they all have one (and no generated column is involved) then we start scanning the indexes. Otherwise we waste time scanning one index and verify that it contains no null values, only to realize that the next column does not have an appropriate index to use, and thus we must scan the table. Then the first index scan is wasted work. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "The saddest aspect of life right now is that science gathers knowledge faster than society gathers wisdom." (Isaac Asimov)
hi.
---------------------
create unlogged table t2_copy(col1 int, col2 int, col3 int, col4 int,
col5 int) with (autovacuum_enabled = off, vacuum_index_cleanup=off);
insert into t2_copy select g, g, g, g,g from generate_series(1, 10_000_000) g;
set enable_seqscan to off;
set enable_bitmapscan to off;
set max_parallel_workers=0;
set max_parallel_workers_per_gather=0;
create index t2_copy_idx on t2_copy(col1, col2, col3, col4, col5);
explain (costs off, analyze) select from t2_copy where col1 is NULL or
col2 is NULL or col3 is NULL or col4 is null or col5 is null \watch
i=0.1 c=10
drop index t2_copy_idx;
create index t2_copy_idx1 on t2_copy(col1);
create index t2_copy_idx2 on t2_copy(col2);
create index t2_copy_idx3 on t2_copy(col3);
create index t2_copy_idx4 on t2_copy(col4);
create index t2_copy_idx5 on t2_copy(col5);
explain (costs off, analyze) select from t2_copy where col1 is NULL or
col2 is NULL or col3 is NULL or col4 is null or col5 is null \watch
i=0.1 c=10
------------------------------------------
By comparing the above two EXPLAIN, I found out that there will be regression
for using one multiple column indexes fast verify mutiple not-null constraints.
create unlogged table t3_copy(col1 int, col2 int, col3 int, col4 int);
create index t3_copy_idx on t3_copy(col1, col2, col3, col4);
alter table t3_copy add not null col1, add not null col2, add not null
col3, add not null col4;
In this case, scanning the t3_copy_idx index to check these four NOT NULL
constraints is actually slower than a full table scan.
Therefore, I further restricted the use of the index-scan mechanism for fast NOT
NULL verification to scenarios where the attribute being checked is the leading
column of the index.
So, for the above example, to let
alter table t3_copy add not null col1, add not null col2, add not null
col3, add not null col4;
utilize indexscan mechanism, we require four indexes, each index leading column
is corresponding to the not-null constraint attribute.
I have not yet addressed all of your other comments, as this
represents a more of a
major change. Therefore, I am submitting the attached patch first.
index_check_notnull function, in v8 it is:
+ /* collect index attnums while loop */
+ for (int i = 0; i < index->indnkeyatts; i++)
+ {
+ attr = TupleDescAttr(tupdesc, (index->indkey.values[i] - 1));
+
+
+ }
in v9, it's:
+ /* collect index attnums while loop */
+ attr = TupleDescAttr(tupdesc, (index->indkey.values[0] - 1));
+
I am not sure that regression tests would be helpful, since the test hinges on
whether ereport(DEBUG1, ...) is reached.
I am not sure about the concurrency implications; therefore, isolation tests are
attached. maybe it's not necessary.
--
jian
https://www.enterprisedb.com