Обсуждение: split tablecmds.c
This file has over 22,000 lines and is too large to be included in GitHub's code search results [0]. It appears to have been given its current form in 2002 by commit 71dc300. Previously, it was named command.c, which dates back to the 80s. Is it time to split it into a few different files, similar to what was done to copy.c in 2020 by commit c532d15? After briefly skimming through it, some areas that seem like they could potentially be moved out are partitions, constraints, permission checks, inheritance, foreign keys, column expressions, table rewriting, attribute merging, TRUNCATE, and CREATE TABLE. This is far from a concrete proposal, but I first wanted to gauge interest in $SUBJECT. [0] https://docs.github.com/en/search-github/github-code-search/about-github-code-search#limitations -- nathan
On 2025-Dec-01, Nathan Bossart wrote: > This file has over 22,000 lines and is too large to be included in GitHub's > code search results [0]. It appears to have been given its current form in > 2002 by commit 71dc300. Previously, it was named command.c, which dates > back to the 80s. Is it time to split it into a few different files, > similar to what was done to copy.c in 2020 by commit c532d15? > > After briefly skimming through it, some areas that seem like they could > potentially be moved out are partitions, constraints, permission checks, > inheritance, foreign keys, column expressions, table rewriting, attribute > merging, TRUNCATE, and CREATE TABLE. This is far from a concrete proposal, > but I first wanted to gauge interest in $SUBJECT. I think it makes sense. It's our largest source file at 690kB the second being pg_dump.c (at 625kB) and also a candidate for splitting. The third one, ruleutils.c, is slightly above half size, 381kB! My first thought would be to move code that deals with catalog changes to files in catalog/. Also a couple of functions related to tablespaces could be perhaps be moved to commands/tablespace.c. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "El Maquinismo fue proscrito so pena de cosquilleo hasta la muerte" (Ijon Tichy en Viajes, Stanislaw Lem)
=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@kurilemu.de> writes:
> On 2025-Dec-01, Nathan Bossart wrote:
>> This file has over 22,000 lines and is too large to be included in GitHub's
>> code search results [0]. It appears to have been given its current form in
>> 2002 by commit 71dc300. Previously, it was named command.c, which dates
>> back to the 80s. Is it time to split it into a few different files,
>> similar to what was done to copy.c in 2020 by commit c532d15?
> I think it makes sense. It's our largest source file at 690kB
> the second being pg_dump.c (at 625kB) and also a candidate for
> splitting. The third one, ruleutils.c, is slightly above half size,
> 381kB!
A lot of the bloat seems to be of recent vintage, too:
$ ls -l REL*/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 125011 Jan 14 2014 REL7_4/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 183924 Jan 14 2014 REL8_0/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 195463 Jan 14 2014 REL8_1/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 211853 Jan 14 2014 REL8_2/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 207113 Aug 31 2013 REL8_3/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 240714 May 6 2014 REL8_4/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 250682 Aug 9 2014 REL9_0/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 294495 Dec 22 2015 REL9_1/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 325781 Aug 9 2017 REL9_2/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 329854 Oct 1 2018 REL9_3/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 355629 Feb 3 2020 REL9_4/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 379490 Jul 14 2020 REL9_5/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 380527 Jul 14 2020 REL9_6/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 437331 Jan 6 2022 REL_10/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 512821 Oct 16 2023 REL_11/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 559329 Nov 8 2024 REL_12/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 579305 Nov 4 10:59 REL_13/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 605939 Nov 10 12:43 REL_14/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 628963 Nov 10 12:43 REL_15/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 628950 Nov 10 12:43 REL_16/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 645181 Nov 10 12:43 REL_17/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 708183 Nov 10 12:43 REL_18/src/backend/commands/tablecmds.c
-rw-rw-r--. 1 postgres postgres 706254 Nov 10 12:43 HEAD/src/backend/commands/tablecmds.c
I didn't do any math about it, but that's got to be a far faster rate
of expansion than the overall PG code base. Maybe partitioning is
largely to blame? Perhaps analyzing what functionality got added
here in the past dozen or so years would yield some ideas for how to
split it.
+1 for a split, if we can figure out a good plan.
regards, tom lane
On Mon, Dec 1, 2025, at 3:18 PM, Álvaro Herrera wrote: > My first thought would be to move code that deals with catalog changes > to files in catalog/. Also a couple of functions related to tablespaces > could be perhaps be moved to commands/tablespace.c. > As Tom said partitioning has a big chunk of lines of code. I bet that's the biggest portion. It seems a good candidate to be moved to a new file (partitioning and inheritance). Besides your suggestion, I would add things that set properties (RLS, reloptions, AM, replica identity, generated columns) to another file (tableutils.c?). -- Euler Taveira EDB https://www.enterprisedb.com/
On 2025-Dec-01, Tom Lane wrote:
> I didn't do any math about it, but that's got to be a far faster rate
> of expansion than the overall PG code base. Maybe partitioning is
> largely to blame? Perhaps analyzing what functionality got added
> here in the past dozen or so years would yield some ideas for how to
> split it.
Since 2015, with a cutoff of at least 50 lines added or removed:
199 lines added by: bdc3d7fa237 (Return ObjectAddress in many ALTER TABLE sub-routines, 2015-03-25)
61 lines added by: e42375fc812 (Retain comments on indexes and constraints at ALTER TABLE ... TYPE ..., 2015-07-14)
1313 lines added by: f0e44751d71 (Implement table partitioning., 2016-12-07)
58 lines added by: 3957b58b888 (Fix ALTER TABLE / SET TYPE for irregular inheritance, 2017-01-09)
74 lines removed by 2f5c9d9c9ce (Tweak catalog indexing abstraction for upcoming WARM, 2017-01-31)
283 lines added by: 32173270536 (Identity columns, 2017-04-06)
55 lines removed by 3ec76ff1f2c (Don't explicitly mark range partitioning columns NOT NULL., 2017-05-18)
58 lines removed by b08df9cab77 (Teach predtest.c about CHECK clauses to fix partitioning bugs., 2017-06-14)
131 lines added by: 6f6b99d1335 (Allow a partitioned table to have a default partition., 2017-09-08)
66 lines added by: af20e2d728e (Fix ALTER TABLE code to update domain constraints when needed., 2017-11-01)
59 lines removed by ef6087ee5fa (Minor preparatory refactoring for UPDATE row movement., 2018-01-04)
531 lines added by: 8b08f7d4820 (Local partitioned indexes, 2018-01-19)
67 lines added by: eb7ed3f3063 (Allow UNIQUE indexes on partitioned tables, 2018-02-19)
136 lines added by: 86f575948c7 (Allow FOR EACH ROW triggers on partitioned tables, 2018-03-23)
121 lines added by: 3de241dba86 (Foreign keys on partitioned tables, 2018-04-04)
86 lines added by: 5dfd1e5a669 (Logical decoding of TRUNCATE, 2018-04-07)
54 lines added by: dfa60814198 (Fix tablespace handling for partitioned indexes, 2018-11-03)
237 lines removed by 578b229718e (Remove WITH OIDS support, change oid catalog column visibility., 2018-11-20)
77 lines added by: 3b174b1a355 (Fix missing values when doing ALTER TABLE ALTER COLUMN TYPE, 2019-01-10)
306 lines added by: 03afae201f0 (Move CloneForeignKeyConstraints to tablecmds.c, 2019-01-18)
67 lines added by: 0464fdf07f6 (Create action triggers when partitions are detached, 2019-01-21)
69 lines added by: bbb96c3704c (Allow ALTER TABLE .. SET NOT NULL to skip provably unnecessary scans., 2019-03-13)
84 lines removed by d25f519107b (tableam: relation creation, VACUUM FULL/CLUSTER, SET TABLESPACE., 2019-03-28)
131 lines added by: fc22b6623b6 (Generated columns, 2019-03-30)
588 lines added by: f56f8f8da6a (Support foreign keys that reference partitioned tables, 2019-04-03)
82 lines added by: f4a3fdfbdcd (Avoid order-of-execution problems with ALTER TABLE ADD PRIMARY KEY., 2019-04-23)
61 lines added by: 1fa846f1c9a (Fix cloning of row triggers to sub-partitions, 2020-01-02)
159 lines added by: f595117e24a (ALTER TABLE ... ALTER COLUMN ... DROP EXPRESSION, 2020-01-14)
280 lines added by: 1281a5c907b (Restructure ALTER TABLE execution to fix assorted bugs., 2020-01-15)
64 lines added by: afccd76f1cc (Fix detaching partitions with cloned row triggers, 2020-04-21)
51 lines added by: 086ffddf365 (Fix several DDL issues of generated columns versus inheritance, 2020-05-06)
55 lines removed by f1fcf2d3b2e (Fix timing issue with ALTER TABLE's validate constraint, 2020-07-14)
84 lines added by: 50289819230 (Fix handling of CREATE TABLE LIKE with inheritance., 2020-08-21)
229 lines added by: bbe0a81db69 (Allow configurable LZ4 TOAST compression., 2021-03-19)
352 lines added by: 71f4c8c6f74 (ALTER TABLE ... DETACH PARTITION ... CONCURRENTLY, 2021-03-25)
92 lines added by: a4d75c86bf1 (Extended statistics on expressions, 2021-03-26)
125 lines added by: 8ff1c94649f (Allow TRUNCATE command to truncate foreign tables., 2021-04-08)
60 lines added by: a970edbed30 (Fix ALTER TABLE / INHERIT with generated columns, 2021-05-04)
120 lines added by: 6f70d7ca1d1 (Have ALTER CONSTRAINT recurse on partitioned tables, 2021-05-05)
105 lines added by: 2ed532ee8c4 (Improve error messages about mismatching relkind, 2021-07-08)
60 lines added by: b0483263dda (Add support for SET ACCESS METHOD in ALTER TABLE, 2021-07-28)
85 lines added by: d6f96ed94e7 (Allow specifying column list for foreign key ON DELETE SET actions, 2021-12-08)
283 lines added by: f4566345cf4 (Create foreign key triggers in partitioned tables too, 2022-01-05)
668 lines added by: e056c557aef (Catalog NOT NULL constraints, 2023-04-07)
668 lines removed by 9ce04b50e12 (Revert "Catalog NOT NULL constraints" and fallout, 2023-04-12)
753 lines added by: b0e96f31198 (Catalog not-null constraints, 2023-08-25)
102 lines added by: 04e485273b5 (Move BuildDescForRelation() from tupdesc.c to tablecmds.c, 2023-10-05)
137 lines added by: 5d06e99a3cf (ALTER TABLE command to change generation expression, 2024-01-04)
159 lines added by: 69958631570 (Support identity columns in partitioned tables, 2024-01-16)
56 lines added by: f7cf9494bad (Split some code out from MergeAttributes(), 2024-01-26)
146 lines added by: 34768ee3616 (Add temporal FOREIGN KEY contraints, 2024-03-24)
105 lines added by: 374c7a22904 (Allow specifying an access method for partitioned tables, 2024-03-25)
320 lines added by: 1adf16b8fba (Implement ALTER TABLE ... MERGE PARTITIONS ... command, 2024-04-07)
411 lines added by: 87c21bb9412 (Implement ALTER TABLE ... SPLIT PARTITION ... command, 2024-04-07)
51 lines added by: d9f686a72ee (Fix restore of not-null constraints with inheritance, 2024-04-18)
828 lines removed by 6f8bb7c1e96 (Revert structural changes to not-null constraints, 2024-05-13)
150 lines removed by 8aee330af55 (Revert temporal primary keys and foreign keys, 2024-05-16)
785 lines removed by 3890d90c150 (Revert support for ALTER TABLE ... MERGE/SPLIT PARTITION(S) commands, 2024-08-24)
146 lines added by: 89f908a6d0a (Add temporal FOREIGN KEY contraints, 2024-09-17)
51 lines added by: d69a3f4d70b (Introduce ATT_PARTITIONED_TABLE in tablecmds.c, 2024-09-19)
167 lines added by: 53af9491a04 (Restructure foreign key handling code for ATTACH/DETACH, 2024-10-22)
390 lines added by: 14e87ffa5c5 (Add pg_constraint rows for not-null constraints, 2024-11-08)
52 lines added by: 9321d2fdf80 (Fix collation handling for foreign keys, 2024-11-15)
59 lines added by: 86374c9a0e3 (Split ATExecValidateConstraint into reusable pieces, 2025-01-16)
58 lines added by: b663b9436e7 (Allow NOT VALID foreign key constraints on partitioned tables, 2025-01-23)
120 lines added by: 83ea6c54025 (Virtual generated columns, 2025-02-07)
85 lines added by: f4e53e10b6c (Add ALTER TABLE ... ALTER CONSTRAINT ... SET [NO] INHERIT, 2025-03-05)
89 lines added by: 1d26c2d2c4b (refactor: Split tryAttachPartitionForeignKey(), 2025-03-11)
57 lines added by: 639238b978f (refactor: Split ATExecAlterConstraintInternal(), 2025-03-25)
309 lines added by: eec0040c4bc (Add support for NOT ENFORCED in foreign key constraints, 2025-04-02)
183 lines added by: a379061a22a (Allow NOT NULL constraints to be added as NOT VALID, 2025-04-07)
50 lines removed by 3231fd04552 (Stop creating constraints during DETACH CONCURRENTLY, 2025-10-11)
git log --since '01/01/2015' --format=reference --reverse src/backend/commands/tablecmds.c | while read line; do
sha1=$(echo$line | cut -d" " -f1); sz=$(git show $sha1:src/backend/commands/tablecmds.c | wc); prevsz=$currsz;
currsz=$(echo$sz | awk '{print $1}'); if [ ! -z "$prevsz" ]; then diff=$(($currsz - $prevsz)); if [ $diff -ge 50 ];
thenecho "$diff lines added by: $line" ; elif [ $diff -le -50 ]; then echo "$((-$diff)) lines removed by $line"; fi;
fi;done
(BTW this output makes it clear to me that commit titles do not end with
a period.)
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"You don't solve a bad join with SELECT DISTINCT" #CupsOfFail
https://twitter.com/connor_mc_d/status/1431240081726115845
Hi, On 2025-12-01 11:25:13 -0600, Nathan Bossart wrote: > This file has over 22,000 lines and is too large to be included in GitHub's > code search results [0]. It appears to have been given its current form in > 2002 by commit 71dc300. Previously, it was named command.c, which dates > back to the 80s. Is it time to split it into a few different files, > similar to what was done to copy.c in 2020 by commit c532d15? > > After briefly skimming through it, some areas that seem like they could > potentially be moved out are partitions, constraints, permission checks, > inheritance, foreign keys, column expressions, table rewriting, attribute > merging, TRUNCATE, and CREATE TABLE. This is far from a concrete proposal, > but I first wanted to gauge interest in $SUBJECT. Seems reasonable, however I think that splitting all or most of the pieces you listed into their own files would end up being pointlessly granular.... Greetings, Andres Freund
On Mon, Dec 01, 2025 at 01:59:01PM -0500, Tom Lane wrote: > I didn't do any math about it, but that's got to be a far faster rate > of expansion than the overall PG code base. Maybe partitioning is > largely to blame? Perhaps analyzing what functionality got added > here in the past dozen or so years would yield some ideas for how to > split it. > > +1 for a split, if we can figure out a good plan. I tried to move the partitioning-related code to a new file, and it wasn't too bad. Note that there are a couple of internal-to-tablecmds.c things that need to be exported. Besides that, the attached patch is still pretty rough, and I'm not sure I correctly placed the line in the sand when determining what stays and what goes, but this at least shows the general shape of what's needed. (BTW git was generating an atrocious diff for tablecmds.c. You might need to set the diff algorithm to "minimal" if you are similarly affected.) src/backend/commands/Makefile | 1 + src/backend/commands/meson.build | 1 + src/backend/commands/partcmds.c | 3377 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ src/backend/commands/tablecmds.c | 3456 +-------------------------------------------------------------------------------------------- src/backend/partitioning/partbounds.c | 1 + src/include/commands/partcmds.h | 53 ++ src/include/commands/tablecmds.h | 134 +++- 7 files changed, 3575 insertions(+), 3448 deletions(-) -- nathan
Вложения
> On Dec 2, 2025, at 06:43, Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Mon, Dec 01, 2025 at 01:59:01PM -0500, Tom Lane wrote: >> I didn't do any math about it, but that's got to be a far faster rate >> of expansion than the overall PG code base. Maybe partitioning is >> largely to blame? Perhaps analyzing what functionality got added >> here in the past dozen or so years would yield some ideas for how to >> split it. >> >> +1 for a split, if we can figure out a good plan. > > I tried to move the partitioning-related code to a new file, and it wasn't > too bad. Note that there are a couple of internal-to-tablecmds.c things > that need to be exported. Besides that, the attached patch is still pretty > rough, and I'm not sure I correctly placed the line in the sand when > determining what stays and what goes, but this at least shows the general > shape of what's needed. (BTW git was generating an atrocious diff for > tablecmds.c. You might need to set the diff algorithm to "minimal" if you > are similarly affected.) > > src/backend/commands/Makefile | 1 + > src/backend/commands/meson.build | 1 + > src/backend/commands/partcmds.c | 3377 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > src/backend/commands/tablecmds.c | 3456 +-------------------------------------------------------------------------------------------- > src/backend/partitioning/partbounds.c | 1 + > src/include/commands/partcmds.h | 53 ++ > src/include/commands/tablecmds.h | 134 +++- > 7 files changed, 3575 insertions(+), 3448 deletions(-) > > -- > nathan > <v1-0001-move-partition-code-in-tablecmds.c-to-new-file.patch> I am just thinking if tablecmds deserves a subfolder given the large code size. If we just split the file into multiple andstill place them under src/backend/commands, then it would be not easy to identify alter-table related code. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
On Mon, Dec 01, 2025 at 04:43:37PM -0600, Nathan Bossart wrote: > I tried to move the partitioning-related code to a new file, and it wasn't > too bad. Note that there are a couple of internal-to-tablecmds.c things > that need to be exported. Besides that, the attached patch is still pretty > rough, and I'm not sure I correctly placed the line in the sand when > determining what stays and what goes, but this at least shows the general > shape of what's needed. (BTW git was generating an atrocious diff for > tablecmds.c. You might need to set the diff algorithm to "minimal" if you > are similarly affected.) Moving all the partition-specific code into a different file makes sense here. Is partcmds.c as name the best fit though? Perhaps a tablecmds_partition.c, with other files named tablecmds_popo.c to indicate the sub-systems formerly in tablecmds.c? > src/backend/commands/Makefile | 1 + > src/backend/commands/meson.build | 1 + > src/backend/commands/partcmds.c | 3377 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > src/backend/commands/tablecmds.c | 3456 +-------------------------------------------------------------------------------------------- > src/backend/partitioning/partbounds.c | 1 + > src/include/commands/partcmds.h | 53 ++ > src/include/commands/tablecmds.h | 134 +++- > 7 files changed, 3575 insertions(+), 3448 deletions(-) The new contents of tablecmds.h don't have any strong dependency with tablecmds.h, so perhaps having the "internal" structures like the ones you are moving here into a new tablecmds_internal.h would be cleaner? Another sub-area of tablecmds.c that could be split is I think the rewrite logic. It has a lot of its own perks that become harder to figure out the more tablecmds.c gets bloated. -- Michael
Вложения
Hi, On Tue, Dec 2, 2025 at 9:04 AM Michael Paquier <michael@paquier.xyz> wrote: > On Mon, Dec 01, 2025 at 04:43:37PM -0600, Nathan Bossart wrote: > > I tried to move the partitioning-related code to a new file, and it wasn't > > too bad. Note that there are a couple of internal-to-tablecmds.c things > > that need to be exported. Besides that, the attached patch is still pretty > > rough, and I'm not sure I correctly placed the line in the sand when > > determining what stays and what goes, but this at least shows the general > > shape of what's needed. (BTW git was generating an atrocious diff for > > tablecmds.c. You might need to set the diff algorithm to "minimal" if you > > are similarly affected.) +1 to splitting tablecmds.c at long last. (I suppose I or someone could’ve proposed that back in Dec 2016 :-). We did create src/backend/partitioning in v11 to move code from then big catalog/partition.c, but this one has stayed untouched since then. Better late than never.) > Moving all the partition-specific code into a different file makes > sense here. Is partcmds.c as name the best fit though? Perhaps a > tablecmds_partition.c, with other files named tablecmds_popo.c to > indicate the sub-systems formerly in tablecmds.c? As Andres mentioned, it’s good to avoid slicing too granularly, but I also thought of the name tablecmds_partition.c as soon as I saw “split tablecmds.c” and “partitioning code.” That seems a reasonable first cut. > > src/backend/commands/Makefile | 1 + > > src/backend/commands/meson.build | 1 + > > src/backend/commands/partcmds.c | 3377 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > src/backend/commands/tablecmds.c | 3456 +-------------------------------------------------------------------------------------------- > > src/backend/partitioning/partbounds.c | 1 + > > src/include/commands/partcmds.h | 53 ++ > > src/include/commands/tablecmds.h | 134 +++- > > 7 files changed, 3575 insertions(+), 3448 deletions(-) > > The new contents of tablecmds.h don't have any strong dependency with > tablecmds.h, so perhaps having the "internal" structures like the ones > you are moving here into a new tablecmds_internal.h would be cleaner? +1 to introducing tablecmds_internal.h as well. > Another sub-area of tablecmds.c that could be split is I think the > rewrite logic. It has a lot of its own perks that become harder to > figure out the more tablecmds.c gets bloated. +1 on splitting out the rewrite logic separately too. -- Thanks, Amit Langote