Thread: 10% drop in code line count in PG 17


10% drop in code line count in PG 17

From
Bruce Momjian
Date:
While working on a talk, I studied the number of code line changes in
each major release, and found PG 17 surprisingly reduced code line count
by 10%. To get the code line count, I used /pgtop/src/tools/codelines,
which runs:

    find . -name '*.[chyl]' | xargs cat| wc -l

Any ideas on the cause of this decrease?  I skimmed the major release
notes but didn't see anything obvious.  I see removal of support for
OpenSSL 1.0.1 and AIX.

---------------------------------------------------------------------------

 version  |  reldate   | months | relnotes |  lines  | change  | % change
----------+------------+--------+----------+---------+---------+----------
 4.2      | 1994-03-17 |        |          |  250872 |         |         
 1.0      | 1995-09-05 |     18 |          |  172470 |  -78402 |      -31
 1.01     | 1996-02-23 |      6 |          |  179463 |    6993 |        4
 1.09     | 1996-11-04 |      8 |          |  178976 |    -487 |        0
 6.0      | 1997-01-29 |      3 |          |  189399 |   10423 |        5
 6.1      | 1997-06-08 |      4 |          |  200709 |   11310 |        5
 6.2      | 1997-10-02 |      4 |          |  225848 |   25139 |       12
 6.3      | 1998-03-01 |      5 |          |  260809 |   34961 |       15
 6.4      | 1998-10-30 |      8 |          |  297918 |   37109 |       14
 6.5      | 1999-06-09 |      7 |          |  331278 |   33360 |       11
 7.0      | 2000-05-08 |     11 |          |  383270 |   51992 |       15
 7.1      | 2001-04-13 |     11 |          |  410500 |   27230 |        7
 7.2      | 2002-02-04 |     10 |      250 |  394274 |  -16226 |       -3
 7.3      | 2002-11-27 |     10 |      305 |  453282 |   59008 |       14
 7.4      | 2003-11-17 |     12 |      263 |  508523 |   55241 |       12
 8.0      | 2005-01-19 |     14 |      230 |  654437 |  145914 |       28
 8.1      | 2005-11-08 |     10 |      174 |  630422 |  -24015 |       -3
 8.2      | 2006-12-05 |     13 |      215 |  684646 |   54224 |        8
 8.3      | 2008-02-04 |     14 |      214 |  762697 |   78051 |       11
 8.4      | 2009-07-01 |     17 |      314 |  939098 |  176401 |       23
 9.0      | 2010-09-20 |     15 |      237 |  999862 |   60764 |        6
 9.1      | 2011-09-12 |     12 |      203 | 1069547 |   69685 |        6
 9.2      | 2012-09-10 |     12 |      238 | 1148192 |   78645 |        7
 9.3      | 2013-09-09 |     12 |      177 | 1195627 |   47435 |        4
 9.4      | 2014-12-18 |     15 |      211 | 1261024 |   65397 |        5
 9.5      | 2016-01-07 |     13 |      193 | 1340005 |   78981 |        6
 9.6      | 2016-09-29 |      9 |      214 | 1380458 |   40453 |        3
 10       | 2017-10-05 |     12 |      189 | 1495196 |  114738 |        8
 11       | 2018-10-18 |     12 |      170 | 1562537 |   67341 |        4
 12       | 2019-10-03 |     11 |      180 | 1616912 |   54375 |        3
 13       | 2020-09-24 |     12 |      178 | 1656030 |   39118 |        2
 14       | 2021-09-30 |     12 |      220 | 1779777 |  123747 |        7
 15       | 2022-10-13 |     12 |      184 | 1815646 |   35869 |        2
 16       | 2023-09-14 |     11 |      206 | 1869401 |   53755 |        2
 17       | 2024-09-26 |     12 |      182 | 1673116 | -196285 |      -10
 18       | 2025-09-25 |     12 |      210 | 1750814 |   77698 |        4
 Averages |            |     11 |      215 |         |         |     5.89

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.



Re: 10% drop in code line count in PG 17

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> While working on a talk, I studied the number of code line changes in
> each major release, and found PG 17 surprisingly reduced code line count
> by 10%. To get the code line count, I used /pgtop/src/tools/codelines,
> which runs:

>     find . -name '*.[chyl]' | xargs cat| wc -l

> Any ideas on the cause of this decrease?

My first thought was that it had to do with the conversion of
src/backend/nodes/ to be largely auto-generated code.  If you
are using codelines against just what is in git, that would look
like a decrease.  However, I see that came in during v16 not v17,
so that's not the explanation.  I'm betting it's some similar
effect though: code getting moved out of the set of files that
will match '*.[chyl]'.

Also ... are you in fact counting only what is in git?  Because
I get different answers:

$ git clean -dfxq
$ git checkout REL_17_0
HEAD is now at d7ec59a63d7 Stamp 17.0.
$ src/tools/codelines
 1664472
$ git checkout REL_16_0
HEAD is now at c372fbbd8e9 Doc: fix release date in release-16.sgml.
$ src/tools/codelines
 1595197

            regards, tom lane



Re: 10% drop in code line count in PG 17

From
Bruce Momjian
Date:
On Wed, Nov 19, 2025 at 03:21:33PM -0500, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > While working on a talk, I studied the number of code line changes in
> > each major release, and found PG 17 surprisingly reduced code line count
> > by 10%. To get the code line count, I used /pgtop/src/tools/codelines,
> > which runs:
> 
> >     find . -name '*.[chyl]' | xargs cat| wc -l
> 
> > Any ideas on the cause of this decrease?
> 
> My first thought was that it had to do with the conversion of
> src/backend/nodes/ to be largely auto-generated code.  If you
> are using codelines against just what is in git, that would look
> like a decrease.  However, I see that came in during v16 not v17,
> so that's not the explanation.  I'm betting it's some similar
> effect though: code getting moved out of the set of files that
> will match '*.[chyl]'.

Huh.

> Also ... are you in fact counting only what is in git?  Because
> I get different answers:
> 
> $ git clean -dfxq
> $ git checkout REL_17_0
> HEAD is now at d7ec59a63d7 Stamp 17.0.
> $ src/tools/codelines
>  1664472
> $ git checkout REL_16_0
> HEAD is now at c372fbbd8e9 Doc: fix release date in release-16.sgml.
> $ src/tools/codelines
>  1595197

No, I just followed the shell comment I wrote above the 'find' command
shown above:

    # This script is used to compute the total number of "C" lines in the release
    # This should be run from the top of the Git tree after a 'make distclean'

And that tree has been built many times.  Should I change my procedure?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.



Re: 10% drop in code line count in PG 17

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Wed, Nov 19, 2025 at 03:21:33PM -0500, Tom Lane wrote:
>> Also ... are you in fact counting only what is in git?  Because
>> I get different answers:

> No, I just followed the shell comment I wrote above the 'find' command
> shown above:

>     # This script is used to compute the total number of "C" lines in the
>     # release This should be run from the top of the Git tree after a 'make
>     # distclean'

> And that tree has been built many times.  Should I change my procedure?

Does "git status --ignored" show any leftover junk files?

I've found that "make distclean" isn't 100% reliable if you aren't
religious about doing it before every git pull or other change of
git HEAD.  The pull might bring in new makefiles with a different
idea of what needs to be cleaned.  For .c files I'd kind of expect
leftovers to be obvious because they won't get hidden by .gitignore
rules, but maybe you hit some case where they're still hidden.

I've largely migrated to using "git clean -dfxq", which has about
the same results in modern branches, but is faster and never (IME)
misses anything.
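
A minimal sketch of the check described above, narrowed to the files the old `find` pattern would pick up. It lists files matching '*.[chyl]' that exist in the working tree but are not tracked by git, whether or not .gitignore hides them. The helper names `leftover_sources` and `leftover_lines` are hypothetical and not part of the PostgreSQL tree.

```shell
#!/bin/sh
# List working-tree files matching '*.[chyl]' that git does not track.
# These are the files a plain `find` would count but `git ls-files` would not.
leftover_sources() {
    {
        # untracked and not ignored (also visible in plain `git status`)
        git ls-files --others --exclude-standard -- '*.[chyl]'
        # untracked and ignored (only visible with `git status --ignored`)
        git ls-files --others --ignored --exclude-standard -- '*.[chyl]'
    } | sort -u
}

# Total number of lines contributed by those leftovers (0 for a clean tree).
leftover_lines() {
    leftover_sources |
        while IFS= read -r f; do cat -- "$f"; done |
        wc -l | tr -d ' '
}
```

If `leftover_lines` prints anything other than 0, a `find`-based count and a `git ls-files`-based count will disagree by that amount.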

            regards, tom lane



Re: 10% drop in code line count in PG 17

From
Bruce Momjian
Date:
On Wed, Nov 19, 2025 at 04:22:37PM -0500, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Wed, Nov 19, 2025 at 03:21:33PM -0500, Tom Lane wrote:
> >> Also ... are you in fact counting only what is in git?  Because
> >> I get different answers:
> 
> > No, I just followed the shell comment I wrote above the 'find' command
> > shown above:
> 
> >     # This script is used to compute the total number of "C" lines in the
> >     # release This should be run from the top of the Git tree after a 'make
> >     # distclean'
> 
> > And that tree has been built many times.  Should I change my procedure?
> 
> Does "git status --ignored" show any leftover junk files?
> 
> I've found that "make distclean" isn't 100% reliable if you aren't
> religious about doing it before every git pull or other change of
> git HEAD.  The pull might bring in new makefiles with a different
> idea of what needs to be cleaned.  For .c files I'd kind of expect
> leftovers to be obvious because they won't get hidden by .gitignore
> rules, but maybe you hit some case where they're still hidden.
> 
> I've largely migrated to using "git clean -dfxq", which has about
> the same results in modern branches, but is faster and never (IME)
> misses anything.

I think you are right.  Attached is the difference between the output
for 16 & 17.  Let me do some more research and run all the versions
again and report back, thanks.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.

Attachments

Re: 10% drop in code line count in PG 17

From
Álvaro Herrera
Date:
On 2025-Nov-19, Tom Lane wrote:

> > No, I just followed the shell comment I wrote above the 'find' command
> > shown above:
> 
> >     # This script is used to compute the total number of "C" lines in the
> >     # release This should be run from the top of the Git tree after a 'make
> >     # distclean'
> 
> > And that tree has been built many times.  Should I change my procedure?
> 
> Does "git status --ignored" show any leftover junk files?

Maybe it'd be better to use `git ls-files` to create the list of files.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"But static content is just dynamic content that isn't moving!"
                http://smylers.hates-software.com/2007/08/15/fe244d0c.html



Re: 10% drop in code line count in PG 17

From
David Rowley
Date:
On Thu, 20 Nov 2025 at 10:58, Bruce Momjian <bruce@momjian.us> wrote:
> I think you are right.  Attached is the difference between the output
> for 16 & 17.  Let me do some more research and run all the versions
> again and report back, thanks.

Maybe you'd be better with git ls-files if you only want just what's
in the repo. Something like:

for b in "REL8_0_0" "REL8_1_0" "REL8_2_0" "REL8_3_0" "REL8_4_0" \
         "REL9_0_0" "REL9_1_0" "REL9_2_0" "REL9_3_0" "REL9_4_0" "REL9_5_0" \
         "REL9_6_0" "REL_10_0" "REL_11_0" "REL_12_0" "REL_13_0" "REL_14_0" \
         "REL_15_0" "REL_16_0" "REL_17_0" "REL_18_0" "master"; do
    git checkout -f "$b" > /dev/null 2>&1 && echo -n "$b " &&
        git ls-files -- '*.[chyl]' | xargs cat | wc -l
done

Careful with the git checkout "-f" though.
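
One way to sidestep the forced checkout is to count lines straight from a tag's tree, leaving the working directory alone. The sketch below uses `git grep -c ''`, which prints `<rev>:<path>:<count>` for each matching file, and sums the counts with awk. The name `lines_at` is hypothetical. The totals may differ slightly from `wc -l`, because grep counts a final line that lacks a trailing newline and `wc -l` does not.

```shell
#!/bin/sh
# lines_at REV: total lines in files matching '*.[chyl]' as of REV,
# read from the object database without touching the working tree.
# The empty pattern matches every line; -I skips binary files.
lines_at() {
    git grep -I -c '' "$1" -- '*.[chyl]' |
        awk -F: '{ total += $NF } END { print total + 0 }'
}

# Example use (tag names as in the loop above):
#   for t in REL_16_0 REL_17_0 REL_18_0; do
#       printf '%s %s\n' "$t" "$(lines_at "$t")"
#   done
```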

David



Re: 10% drop in code line count in PG 17

From
Daniel Gustafsson
Date:
> On 19 Nov 2025, at 20:59, Bruce Momjian <bruce@momjian.us> wrote:

> While working on a talk, I studied the number of code line changes in
> each major release,

This script will only pick up C, but will pick up C in src/test but not any
Perl code using the C modules in src/test etc.  These days we also have C++ and
some Python in the tree.  Maybe it's time to revise it for today's codebase,
which is quite different from when it was written 20 years ago?

--
Daniel Gustafsson




Re: 10% drop in code line count in PG 17

From
Álvaro Herrera
Date:
On 2025-Nov-20, David Rowley wrote:

> Maybe you'd be better with git ls-files if you only want just what's
> in the repo. Something like:
> 
> for b in "REL8_0_0" "REL8_1_0" "REL8_2_0" "REL8_3_0" "REL8_4_0"
> "REL9_0_0" "REL9_1_0" "REL9_2_0" "REL9_3_0" "REL9_4_0" "REL9_5_0"
> "REL9_6_0" "REL_10_0" "REL_11_0" "REL_12_0" "REL_13_0" "REL_14_0"
> "REL_15_0" "REL_16_0" "REL_17_0" "REL_18_0" "master"; do git checkout
> -f $b > /dev/null 2>&1 && echo -n "$b " && git ls-files -- '*.[chyl]'
> | xargs cat | wc -l; done

Maybe this should also consider .pl and .pm files ... we now have almost
90k lines of Perl code in branch master:

$ git ls-files -- '*.pl' | xargs cat | wc -l
77234
$ git ls-files -- '*.pm' | xargs cat | wc -l
10386

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"After a quick R of TFM, all I can say is HOLY CR** THAT IS COOL! PostgreSQL was
amazing when I first started using it at 7.2, and I'm continually astounded by
learning new features and techniques made available by the continuing work of
the development team."
Berend Tober, http://archives.postgresql.org/pgsql-hackers/2007-08/msg01009.php



Re: 10% drop in code line count in PG 17

From
Aleksander Alekseev
Date:
Hi Bruce,

> While working on a talk, I studied the number of code line changes in
> each major release, and found PG 17 surprisingly reduced code line count
> by 10%. To get the code line count, I used /pgtop/src/tools/codelines,
> which runs:
>
>         find . -name '*.[chyl]' | xargs cat| wc -l

FWIW I get different results with `cloc`:

$ git checkout REL_18_STABLE
$ git clean -dfx # be careful! this will drop your local .clangd settings etc
$ cloc ./

github.com/AlDanial/cloc v 1.98  T=3.38 s (1448.6 files/s, 915951.4 lines/s)
---------------------------------------------------------------------------------------
Language                             files          blank        comment           code
---------------------------------------------------------------------------------------
C                                     1555         189668         393758         940984
PO File                                466         180914         221367         543216
SQL                                    791          30420          23631         124104
C/C++ Header                           973          18935          64368         114176
Perl                                   335          13402          12264          60254
XML                                      3              4             15          30922
... skipped ...
---------------------------------------------------------------------------------------
SUM:                                  4895         446873         728634        1919686
---------------------------------------------------------------------------------------

$ git checkout REL_17_STABLE
$ cloc ./

github.com/AlDanial/cloc v 1.98  T=2.68 s (1764.0 files/s, 1104266.3 lines/s)
---------------------------------------------------------------------------------------
Language                             files          blank        comment           code
---------------------------------------------------------------------------------------
C                                     1507         181725         376154         905987
PO File                                466         174902         211970         529317
SQL                                    754          28606          21625         115742
C/C++ Header                           943          18255          61771         100741
Perl                                   309          11882          10905          52974
XML                                      3              4             15          30922
... skipped ...
---------------------------------------------------------------------------------------
SUM:                                  4733         428393         695356        1839064
---------------------------------------------------------------------------------------

Overall, there is a 4% increase according to this tool. What is
convenient about `cloc` is that you can count only what you want, e.g.
code without comments.

-- 
Best regards,
Aleksander Alekseev



Re: 10% drop in code line count in PG 17

From
Aleksander Alekseev
Date:
Hi,

> > While working on a talk, I studied the number of code line changes in
> > each major release, and found PG 17 surprisingly reduced code line count
> > by 10%. To get the code line count, I used /pgtop/src/tools/codelines,
> > which runs:
>
> [..]
> Overall, there is a 4% increase according to this tool. What is
> convenient about `cloc` - you can count only what you want, e.g. code
> without comments, etc.

Oops, I didn't notice that you were comparing PG16 and PG17. Still,
the result with `cloc` is similar, +4% approximately. Apologies for
the noise.

-- 
Best regards,
Aleksander Alekseev



Re: 10% drop in code line count in PG 17

From
Bruce Momjian
Date:
On Thu, Nov 20, 2025 at 12:23:25PM +1300, David Rowley wrote:
> On Thu, 20 Nov 2025 at 10:58, Bruce Momjian <bruce@momjian.us> wrote:
> > I think you are right.  Attached is the difference between the output
> > for 16 & 17.  Let me do some more research and run all the versions
> > again and report back, thanks.
> 
> Maybe you'd be better with git ls-files if you only want just what's
> in the repo. Something like:
> 
> for b in "REL8_0_0" "REL8_1_0" "REL8_2_0" "REL8_3_0" "REL8_4_0"
> "REL9_0_0" "REL9_1_0" "REL9_2_0" "REL9_3_0" "REL9_4_0" "REL9_5_0"
> "REL9_6_0" "REL_10_0" "REL_11_0" "REL_12_0" "REL_13_0" "REL_14_0"
> "REL_15_0" "REL_16_0" "REL_17_0" "REL_18_0" "master"; do git checkout
> -f $b > /dev/null 2>&1 && echo -n "$b " && git ls-files -- '*.[chyl]'
> | xargs cat | wc -l; done

Yes, I like "git ls-files" since it gives the same count as Tom's
version but doesn't modify the git tree.  The old script pre-dates git
and I didn't consider "git" could give us a better solution.  Attached
is the applied patch.

And here are the updated line counts.  I went all the way back to 7.1,
which is the last stable git branch.

---------------------------------------------------------------------------

 version  |  reldate   | months | changes | C lines | C changes | % C change
----------+------------+--------+---------+---------+-----------+------------
 4.2      | 1994-03-17 |        |         |  250872 |           |
 1.0      | 1995-09-05 |     18 |         |  172470 |    -78402 |        -31
 1.01     | 1996-02-23 |      6 |         |  179463 |      6993 |          4
 1.09     | 1996-11-04 |      8 |         |  178976 |      -487 |          0
 6.0      | 1997-01-29 |      3 |         |  189399 |     10423 |          5
 6.1      | 1997-06-08 |      4 |         |  200709 |     11310 |          5
 6.2      | 1997-10-02 |      4 |         |  225848 |     25139 |         12
 6.3      | 1998-03-01 |      5 |         |  260809 |     34961 |         15
 6.4      | 1998-10-30 |      8 |         |  297918 |     37109 |         14
 6.5      | 1999-06-09 |      7 |         |  331278 |     33360 |         11
 7.0      | 2000-05-08 |     11 |         |  383270 |     51992 |         15
 7.1      | 2001-04-13 |     11 |         |  380642 |     -2628 |          0
 7.2      | 2002-02-04 |     10 |     250 |  425898 |     45256 |         11
 7.3      | 2002-11-27 |     10 |     305 |  439816 |     13918 |          3
 7.4      | 2003-11-17 |     12 |     263 |  522371 |     82555 |         18
 8.0      | 2005-01-19 |     14 |     230 |  586127 |     63756 |         12
 8.1      | 2005-11-08 |     10 |     174 |  625253 |     39126 |          6
 8.2      | 2006-12-05 |     13 |     215 |  684726 |     59473 |          9
 8.3      | 2008-02-04 |     14 |     214 |  765100 |     80374 |         11
 8.4      | 2009-07-01 |     17 |     314 |  817849 |     52749 |          6
 9.0      | 2010-09-20 |     15 |     237 |  870790 |     52941 |          6
 9.1      | 2011-09-12 |     12 |     203 |  932936 |     62146 |          7
 9.2      | 2012-09-10 |     12 |     238 |  987460 |     54524 |          5
 9.3      | 2013-09-09 |     12 |     177 | 1040813 |     53353 |          5
 9.4      | 2014-12-18 |     15 |     211 | 1096707 |     55894 |          5
 9.5      | 2016-01-07 |     13 |     193 | 1167110 |     70403 |          6
 9.6      | 2016-09-29 |      9 |     214 | 1219720 |     52610 |          4
 10       | 2017-10-05 |     12 |     189 | 1316447 |     96727 |          7
 11       | 2018-10-18 |     12 |     170 | 1369590 |     53143 |          4
 12       | 2019-10-03 |     11 |     180 | 1423215 |     53625 |          3
 13       | 2020-09-24 |     12 |     178 | 1473738 |     50523 |          3
 14       | 2021-09-30 |     12 |     220 | 1558178 |     84440 |          5
 15       | 2022-10-13 |     12 |     184 | 1587763 |     29585 |          1
 16       | 2023-09-14 |     11 |     206 | 1608031 |     20268 |          1
 17       | 2024-09-26 |     12 |     182 | 1673116 |     65085 |          4
 18       | 2025-09-25 |     12 |     210 | 1750814 |     77698 |          4
 Averages |            |     11 |     215 |         |           |       5.60
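
The "% C change" column can be rechecked with a small helper. This is a sketch that assumes the percentages are truncated toward zero rather than rounded, which is what rows such as 7.2 (about 11.9%, shown as 11) suggest. The name `pct_change` is hypothetical.

```shell
#!/bin/sh
# pct_change OLD NEW: percentage change from OLD to NEW, truncated toward zero.
pct_change() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%d\n", int((new - old) * 100 / old) }'
}

pct_change 1608031 1673116   # 16 -> 17 with the git ls-files counts
pct_change 1869401 1673116   # 16 -> 17 with the original find-based counts
```

With the corrected counts the 16 to 17 step comes out as a gain of about 4%; the original find-based counts give the 10% drop that started the thread.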


-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.

Attachments

Re: 10% drop in code line count in PG 17

From
Bruce Momjian
Date:
On Thu, Nov 20, 2025 at 11:42:39AM +0100, Álvaro Herrera wrote:
> On 2025-Nov-20, David Rowley wrote:
> 
> > Maybe you'd be better with git ls-files if you only want just what's
> > in the repo. Something like:
> > 
> > for b in "REL8_0_0" "REL8_1_0" "REL8_2_0" "REL8_3_0" "REL8_4_0"
> > "REL9_0_0" "REL9_1_0" "REL9_2_0" "REL9_3_0" "REL9_4_0" "REL9_5_0"
> > "REL9_6_0" "REL_10_0" "REL_11_0" "REL_12_0" "REL_13_0" "REL_14_0"
> > "REL_15_0" "REL_16_0" "REL_17_0" "REL_18_0" "master"; do git checkout
> > -f $b > /dev/null 2>&1 && echo -n "$b " && git ls-files -- '*.[chyl]'
> > | xargs cat | wc -l; done
> 
> Maybe this should also consider .pl and .pm files ... we now have almost
> 90k lines of Perl code in branch master:
> 
> I perhan: master 0 0$ git ls-files -- '*.pl' | xargs cat | wc -l
> 77234
> C perhan: master 0 0 0$ git ls-files -- '*.pm' | xargs cat | wc -l
> 10386

Well, I am trying to count only the code that is part of a cluster
install, or optionally an install for extensions.  Aren't most of the
Perl files for testing?  Not sure we want to count that.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.



Re: 10% drop in code line count in PG 17

From
Bruce Momjian
Date:
On Thu, Nov 20, 2025 at 10:38:49AM +0100, Daniel Gustafsson wrote:
> > On 19 Nov 2025, at 20:59, Bruce Momjian <bruce@momjian.us> wrote:
> 
> > While working on a talk, I studied the number of code line changes in
> > each major release,
> 
> This script will only pick up C, but will pick up C in src/test but not any
> Perl code using the C modules in src/test etc.  These days we also have C++ and
> some Python in the tree.  Maybe it's time to revise it for today's codebase
> which is quite different from when it was written 20 years ago?

Yeah, that's part of a larger discussion.   In an email I just sent I
suggested we are trying to count files that are part of a cluster
install, rather than testing files, but again, needs discussion.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.



Re: 10% drop in code line count in PG 17

From
Bruce Momjian
Date:
On Thu, Nov 20, 2025 at 04:49:51PM +0300, Aleksander Alekseev wrote:
> Hi,
> 
> > > While working on a talk, I studied the number of code line changes in
> > > each major release, and found PG 17 surprisingly reduced code line count
> > > by 10%. To get the code line count, I used /pgtop/src/tools/codelines,
> > > which runs:
> >
> > [..]
> > Overall, there is a 4% increase according to this tool. What is
> > convenient about `cloc` - you can count only what you want, e.g. code
> > without comments, etc.
> 
> Oops, I didn't notice that you were comparing PG16 and PG17. Still,
> the result with `cloc` is similar, +4% approximately. Apologies for
> the noise.

Yes, that is another discussion we can have --- whether line count alone
is what we want.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.



Re: 10% drop in code line count in PG 17

From
Daniel Gustafsson
Date:
> On 20 Nov 2025, at 21:30, Bruce Momjian <bruce@momjian.us> wrote:

> Yeah, that's part of a larger discussion.   In an email I just sent I
> suggested we are trying to count files that are part of a cluster
> install, rather than testing files, but again, needs discussion.

Right, but that was sort of my point: you are counting lines which aren't part
of the cluster install, since src/test has lots of C code which is just tests.

 $ find src/test/ -name '*.[chyl]' | xargs cat|wc -l
   23587

And the cluster install does contain C++, which isn't accounted for.

$ find . -name '*.cpp' | xargs cat|wc -l
    1485

Counting just lines in a cluster install is a valid use case but the script
might need some adaptations to match the current tree.
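
As a rough sketch of such an adaptation, git pathspecs can add the C++ files and drop src/test in one listing. The choice of src/test as the only excluded directory and '*.cpp' as the only C++ extension are assumptions for illustration, and `install_lines` is a hypothetical name. Like the original script, it should be run from the top of the Git tree.

```shell
#!/bin/sh
# Count tracked C, yacc, lex and C++ sources, skipping everything under
# src/test. ':(exclude)' is git pathspec magic that removes matching paths.
# -z / -0 keep file names with unusual characters intact.
install_lines() {
    git ls-files -z -- '*.[chyl]' '*.cpp' ':(exclude)src/test' |
        xargs -0 cat | wc -l | tr -d ' '
}
```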

--
Daniel Gustafsson




Re: 10% drop in code line count in PG 17

From
David Rowley
Date:
On Fri, 21 Nov 2025 at 09:27, Bruce Momjian <bruce@momjian.us> wrote:
> # This script is used to compute the total number of "C" lines in the release
> -# This should be run from the top of the Git tree after a 'make distclean'
> -find . -name '*.[chyl]' | xargs cat| wc -l
> +# This should be run from the top of the Git tree.
> +git ls-files -- '*.[chyl]' | xargs cat | wc -l

I think you need to keep the "top of the Git tree" comment, as git
ls-files is context-based: it only lists files under the current directory.
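
If the script should give the same answer from any subdirectory, one option is to switch to the top of the work tree first. This is only a sketch; `codelines_from_top` is a hypothetical name and not what the patch does.

```shell
#!/bin/sh
# Run the count from the top of the work tree regardless of the caller's
# current directory. The function body is a subshell, so the cd does not
# leak into the calling shell.
codelines_from_top() (
    cd "$(git rev-parse --show-toplevel)" || exit 1
    git ls-files -z -- '*.[chyl]' | xargs -0 cat | wc -l | tr -d ' '
)
```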

David



Re: 10% drop in code line count in PG 17

From
Bruce Momjian
Date:
On Fri, Nov 21, 2025 at 10:16:56AM +1300, David Rowley wrote:
> On Fri, 21 Nov 2025 at 09:27, Bruce Momjian <bruce@momjian.us> wrote:
> > # This script is used to compute the total number of "C" lines in the release
> > -# This should be run from the top of the Git tree after a 'make distclean'
> > -find . -name '*.[chyl]' | xargs cat| wc -l
> > +# This should be run from the top of the Git tree.
    ---------------------------------------------------

> > +git ls-files -- '*.[chyl]' | xargs cat | wc -l
> 
> I think you need to keep the "top of the Git tree" comment as git
> ls-files is context-based.

Uh, the current file has this comment:

# This script is used to compute the total number of "C" lines in the release
# This should be run from the top of the Git tree.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.



Re: 10% drop in code line count in PG 17

From
David Rowley
Date:
On Fri, 21 Nov 2025 at 10:26, Bruce Momjian <bruce@momjian.us> wrote:
>
> On Fri, Nov 21, 2025 at 10:16:56AM +1300, David Rowley wrote:
> > I think you need to keep the "top of the Git tree" comment as git
> > ls-files is context-based.
>
> Uh, the current file has this comment:

Oh. I misread the patch. Mistakenly thought you'd removed that entire
line. (I normally use a difftool, but didn't in this instance).

David



Re: 10% drop in code line count in PG 17

From
Bruce Momjian
Date:
On Thu, Nov 20, 2025 at 03:30:15PM -0500, Bruce Momjian wrote:
> On Thu, Nov 20, 2025 at 10:38:49AM +0100, Daniel Gustafsson wrote:
> > > On 19 Nov 2025, at 20:59, Bruce Momjian <bruce@momjian.us> wrote:
> > 
> > > While working on a talk, I studied the number of code line changes in
> > > each major release,
> > 
> > This script will only pick up C, but will pick up C in src/test but not any
> > Perl code using the C modules in src/test etc.  These days we also have C++ and
> > some Python in the tree.  Maybe it's time to revise it for today's codebase
> > which is quite different from when it was written 20 years ago?
> 
> Yeah, that's part of a larger discussion.   In an email I just sent I
> suggested we are trying to count files that are part of a cluster
> install, rather than testing files, but again, needs discussion.

Actually, another discussion would be why we have src/tools/codelines in
the git tree at all.  I added it in 2005 to use in counting code lines,
and I thought we could consider it our standard method, but I am not
sure anyone aside from me even uses it, and it is clear there are
multiple methods people consider valid.  Should we just remove it?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.



Re: 10% drop in code line count in PG 17

From
Aleksander Alekseev
Date:
Hi Bruce,

> Actually, another discussion would be why we have src/tools/codelines in
> the git tree at all.  I added it in 2005 to use in counting code lines,
> and I thought we could consider it our standard method, but I am not
> sure anyone aside from me even uses it, and it is clear there are
> multiple methods people consider valid.  Should we just remove it?

I think we should.

-- 
Best regards,
Aleksander Alekseev



Re: 10% drop in code line count in PG 17

From
Peter Eisentraut
Date:
On 21.11.25 01:49, Bruce Momjian wrote:
> Actually, another discussion would be why we have src/tools/codelines in
> the git tree at all.  I added it in 2005 to use in counting code lines,
> and I thought we could consider it our standard method, but I am not
> sure anyone aside from me even uses it, and it is clear there are
> multiple methods people consider valid.  Should we just remove it?

I think so.



Re: 10% drop in code line count in PG 17

From
Bruce Momjian
Date:
On Fri, Nov 21, 2025 at 01:13:50PM +0100, Peter Eisentraut wrote:
> On 21.11.25 01:49, Bruce Momjian wrote:
> > Actually, another discussion would be why we have src/tools/codelines in
> > the git tree at all.  I added it in 2005 to use in counting code lines,
> > and I thought we could consider it our standard method, but I am not
> > sure anyone aside from me even uses it, and it is clear there are
> > multiple methods people consider valid.  Should we just remove it?
> 
> I think so.

Removed.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.