Thread: Re: split func.sgml to separated individual sgml files

Re: split func.sgml to separated individual sgml files

From
Corey Huinker
Date:
The following is step-by-step logic.


The end result (one file per section) seems good to me.

I suspect that reviewer burden may be the biggest barrier to going forward. Perhaps break up the changes so that each new sect1 file gets its own commit, allowing the reviewer to more easily (if not programmatically) verify that the text that moved out of func.sgml moved into func-sect-foo.sgml.

Granted, the committer will likely squash all of those commits down into one big one, but the hard work of reviewing is done by then.






Re: split func.sgml to separated individual sgml files

From
"David G. Johnston"
Date:
On Wed, Nov 13, 2024 at 1:11 PM Corey Huinker <corey.huinker@gmail.com> wrote:
> The following is step-by-step logic.
>
> The end result (one file per section) seems good to me.
>
> I suspect that reviewer burden may be the biggest barrier to going forward. Perhaps break up the changes so that each new sect1 file gets its own commit, allowing the reviewer to more easily (if not programmatically) verify that the text that moved out of func.sgml moved into func-sect-foo.sgml.
>
> Granted, the committer will likely squash all of those commits down into one big one, but the hard work of reviewing is done by then.


Validation is pretty trivial.  I just built the before and after HTML files and confirmed they are exactly the same size.

I suppose we might have lost some comments or something that wouldn't end up visible in the HTML (seems unlikely) but this is basically one-and-done so long as you don't let other commits happen (that touch this area) while you extract and build HEAD and then compare it to the patched build results.  The git diff will let us know the script didn't affect any source files it wasn't supposed to.
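
The "git diff" sanity check described above could be automated along the following lines. This is only an illustrative sketch, not part of the patch: the allow-list of expected paths is an assumption based on the files named in this thread, and `unexpected_paths` is a hypothetical helper name. It parses `git status --porcelain` output and reports anything outside the expected set.

```python
from typing import List

# Assumption: the split is expected to touch only these locations.
EXPECTED_PREFIXES = ("doc/src/sgml/func",)
EXPECTED_FILES = ("doc/src/sgml/filelist.sgml",)


def unexpected_paths(porcelain_output: str) -> List[str]:
    """Return paths in `git status --porcelain` output outside the expected set.

    Only the simple case is handled: paths that git quotes (because they
    contain special characters) are not unquoted here.
    """
    bad = []
    for line in porcelain_output.splitlines():
        if not line.strip():
            continue
        # Porcelain v1 lines look like "XY <path>"; renames look like
        # "XY <old> -> <new>", so both sides are checked.
        for path in line[3:].split(" -> "):
            if path in EXPECTED_FILES or path.startswith(EXPECTED_PREFIXES):
                continue
            bad.append(path)
    return bad
```

Feeding it the output of `git status --porcelain -uall` from the patched tree should give an empty list if the script touched only the documentation files it was meant to.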

In short, ready to commit (see last paragraph below however), but the committer will need to run the python script at the time of commit on the then-current tree.

In my recent patch touching filelist.sgml I would be placing this new %allfiles_func; line pairing at the top just beneath %allfiles; which is the first child element.  But the choice made here makes sense should this go in first.

There is little downside, though, to renaming the existing %allfiles; to %allfiles_ref; as it's a local-only name.

David J.

Re: split func.sgml to separated individual sgml files

From
jian he
Date:
Hi.

After running the v2 Python script and
``git apply v2-0001-update-filelist.sgml-allfiles.sgml.no-cfbot``,
``git status -u`` shows:

Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   doc/src/sgml/filelist.sgml
        deleted:    doc/src/sgml/func.sgml

That means that to verify the changes, we only need to verify the HTML files
related to "functions".

I used GNU diff to compare the HTML output generated from doc/src/sgml/func.sgml
on the master branch against the HTML files produced with the patch applied.
In the commands below, $DOC9 is the HTML directory of the patched tree (split
func.sgml) and $DOC5 is the HTML directory of the master branch. diff produced
no output, which means the patch (with the script) produces the same output as
the master branch.

diff $DOC5/functions.html $DOC9/functions.html
diff $DOC5/functions-logical.html $DOC9/functions-logical.html
diff $DOC5/functions-comparison.html $DOC9/functions-comparison.html
diff $DOC5/functions-math.html $DOC9/functions-math.html
diff $DOC5/functions-string.html $DOC9/functions-string.html
diff $DOC5/functions-binarystring.html $DOC9/functions-binarystring.html
diff $DOC5/functions-matching.html $DOC9/functions-matching.html
diff $DOC5/functions-formatting.html $DOC9/functions-formatting.html
diff $DOC5/functions-datetime.html $DOC9/functions-datetime.html
diff $DOC5/functions-enum.html $DOC9/functions-enum.html
diff $DOC5/functions-geometry.html $DOC9/functions-geometry.html
diff $DOC5/functions-net.html $DOC9/functions-net.html
diff $DOC5/functions-textsearch.html $DOC9/functions-textsearch.html
diff $DOC5/functions-uuid.html $DOC9/functions-uuid.html
diff $DOC5/functions-xml.html $DOC9/functions-xml.html
diff $DOC5/functions-json.html $DOC9/functions-json.html
diff $DOC5/functions-sequence.html $DOC9/functions-sequence.html
diff $DOC5/functions-conditional.html $DOC9/functions-conditional.html
diff $DOC5/functions-array.html $DOC9/functions-array.html
diff $DOC5/functions-range.html $DOC9/functions-range.html
diff $DOC5/functions-aggregate.html $DOC9/functions-aggregate.html
diff $DOC5/functions-window.html $DOC9/functions-window.html
diff $DOC5/functions-merge-support.html $DOC9/functions-merge-support.html
diff $DOC5/functions-subquery.html $DOC9/functions-subquery.html
diff $DOC5/functions-comparisons.html $DOC9/functions-comparisons.html
diff $DOC5/functions-srf.html $DOC9/functions-srf.html
diff $DOC5/functions-info.html $DOC9/functions-info.html
diff $DOC5/functions-admin.html $DOC9/functions-admin.html
diff $DOC5/functions-trigger.html $DOC9/functions-trigger.html
diff $DOC5/functions-event-triggers.html $DOC9/functions-event-triggers.html
diff $DOC5/functions-statistics.html $DOC9/functions-statistics.html
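
The per-file list above can also be expressed as a loop. The following is a minimal sketch in Python rather than the commands actually run in this thread; `differing_pages` is a hypothetical helper, and the `functions*.html` pattern is an assumption inferred from the file names listed above.

```python
import filecmp
from pathlib import Path
from typing import List


def differing_pages(master_dir: str, patched_dir: str,
                    pattern: str = "functions*.html") -> List[str]:
    """Return names of pages matching `pattern` that differ between the two
    HTML output directories, or that exist in only one of them."""
    master = {p.name for p in Path(master_dir).glob(pattern)}
    patched = {p.name for p in Path(patched_dir).glob(pattern)}
    different = list(master ^ patched)  # present on one side only
    for name in master & patched:
        # shallow=False forces a byte-by-byte comparison instead of
        # trusting the os.stat() signature of each file.
        if not filecmp.cmp(Path(master_dir) / name,
                           Path(patched_dir) / name, shallow=False):
            different.append(name)
    return sorted(different)
```

An empty result corresponds to the "no message produced" outcome reported above. Unlike a fixed list of diff commands, the set comparison also flags a page that is missing on one side.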



Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:


On 2025-07-29 Tu 2:15 AM, jian he wrote:
> hi.
>
> after run the v2 python script and ``git apply
> v2-0001-update-filelist.sgml-allfiles.sgml.no-cfbot``
> git status -u
> shows:
>
> Changes not staged for commit:
>   (use "git add/rm <file>..." to update what will be committed)
>   (use "git restore <file>..." to discard changes in working directory)
>         modified:   doc/src/sgml/filelist.sgml
>         deleted:    doc/src/sgml/func.sgml
>
> That means to verify the changes, we only need to verify html files
> related to "functions".
>
> I use GNU diff to compare the HTML output of doc/src/sgml/func.sgml generated
> from the master branch against the HTML file produced by the patch.
> For example, $DOC9 is the PATCH (split func.sgml) html file directory, $DOC5 is
> the master branch html file directory.  and no message produced while running
> diff, which means the patch (with the script) produced output is the
> same as the master branch.

[snip]


OK. I'm inclined to do this after the CF finishes, to avoid collisions with other patches. I assume it's going to make the CFbot fairly unhappy.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: split func.sgml to separated individual sgml files

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> OK. I'm inclined to do this after the CF finishes, to avoid collisions 
> with other patches. I assume it's going to make the CFbot fairly unhappy.

+1 for proceeding that way.  (I did not look at whether the proposed
changes are sane, but I agree that this'll inevitably break a lot of
pending patches.)

            regards, tom lane



Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:
On 2025-07-29 Tu 11:40 AM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> OK. I'm inclined to do this after the CF finishes, to avoid collisions
>> with other patches. I assume it's going to make the CFbot fairly unhappy.
> +1 for proceeding that way.  (I did not look at whether the proposed
> changes are sane, but I agree that this'll inevitably break a lot of
> pending patches.)
>
>             


Done.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: split func.sgml to separated individual sgml files

From
Florents Tselai
Date:


On 4 Aug 2025, at 4:09 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> On 2025-07-29 Tu 11:40 AM, Tom Lane wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>> OK. I'm inclined to do this after the CF finishes, to avoid collisions
>>> with other patches. I assume it's going to make the CFbot fairly unhappy.
>> +1 for proceeding that way.  (I did not look at whether the proposed
>> changes are sane, but I agree that this'll inevitably break a lot of
>> pending patches.)
>
> Done.


While working on https://commitfest.postgresql.org/patch/6020/, I discovered that when func/func-aggregate.sgml is changed, the HTML isn't marked for update.

IIUC the doc/Makefile should be updated as attached, right?

Attachments

Re: split func.sgml to separated individual sgml files

From
"Euler Taveira"
Date:
On Mon, Sep 1, 2025, at 7:35 AM, Florents Tselai wrote:
> While working on this https://commitfest.postgresql.org/patch/6020/
> I discovered that when changing for func/func-aggregate.sgml, the HTML
> wasn’t marked for update.
>
> IIUC the doc/Makefile should be updated as attached, right ?
>

Good catch.

However, your patch doesn't fix all issues. The check target (check-tabs and
check-nbsp) is broken; these targets should also include the func files.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
Attachments

Re: split func.sgml to separated individual sgml files

From
Florents Tselai
Date:


On 1 Sep 2025, at 4:35 PM, Euler Taveira <euler@eulerto.com> wrote:
> On Mon, Sep 1, 2025, at 7:35 AM, Florents Tselai wrote:
>> While working on this https://commitfest.postgresql.org/patch/6020/
>> I discovered that when changing for func/func-aggregate.sgml, the HTML
>> wasn’t marked for update.
>>
>> IIUC the doc/Makefile should be updated as attached, right ?
>
> Good catch.
>
> However, your patch doesn't fix all issues. The check target (check-tabs and
> check-nbsp) is broken; these targets should also include the func files.


Ah, you’re right, but then again, I’d expect ALL_SGML to be used consistently, but it isn't and I didn't check.
v3 does that.
Note that the GENERATED_SGML files weren't included in these two targets, but I think there's no harm in checking them too.



Attachments

Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:


On 2025-09-01 Mo 11:44 AM, Florents Tselai wrote:
> On 1 Sep 2025, at 4:35 PM, Euler Taveira <euler@eulerto.com> wrote:
>> On Mon, Sep 1, 2025, at 7:35 AM, Florents Tselai wrote:
>>> While working on this https://commitfest.postgresql.org/patch/6020/
>>> I discovered that when changing for func/func-aggregate.sgml, the HTML
>>> wasn’t marked for update.
>>>
>>> IIUC the doc/Makefile should be updated as attached, right ?
>>
>> Good catch.
>>
>> However, your patch doesn't fix all issues. The check target (check-tabs and
>> check-nbsp) is broken; these targets should also include the func files.
>
> Ah, you’re right, but then again,  I’d expect ALL_SGML to be used consistently, but it isn't and I didn't check.
> v3 does that.
> Note that GENERATED_SGML weren't included in these two targets but I think there's no harm in checking them too.




Do we actually care about those? I don't want to add needless cycles anywhere. I note that the meson.build doesn't appear to have a check target at all, or anything that looks for hard tabs or nbsps. Those checks were added to the Makefile back in October in commit 5b7da5c261d, but that got missed even though Daniel had mentioned it in the discussion thread.[1]


cheers


andrew


[1] https://www.postgresql.org/message-id/F7102912-0BDA-42A3-BDCF-8A4CBD1CC688%40yesql.se

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: split func.sgml to separated individual sgml files

From
Florents Tselai
Date:



On Tue, Sep 2, 2025 at 5:54 PM Andrew Dunstan <andrew@dunslane.net> wrote:
> On 2025-09-01 Mo 11:44 AM, Florents Tselai wrote:
>> On 1 Sep 2025, at 4:35 PM, Euler Taveira <euler@eulerto.com> wrote:
>>> On Mon, Sep 1, 2025, at 7:35 AM, Florents Tselai wrote:
>>>> While working on this https://commitfest.postgresql.org/patch/6020/
>>>> I discovered that when changing for func/func-aggregate.sgml, the HTML
>>>> wasn’t marked for update.
>>>>
>>>> IIUC the doc/Makefile should be updated as attached, right ?
>>>
>>> Good catch.
>>>
>>> However, your patch doesn't fix all issues. The check target (check-tabs and
>>> check-nbsp) is broken; these targets should also include the func files.
>>
>> Ah, you’re right, but then again,  I’d expect ALL_SGML to be used consistently, but it isn't and I didn't check.
>> v3 does that.
>> Note that GENERATED_SGML weren't included in these two targets but I think there's no harm in checking them too.
>
> Do we actually care about those? I don't want to add needless cycles anywhere. I note that the meson.build doesn't appear to have a check target at all, or anything that looks for hard tabs or nbsps. Those checks were added to the Makefile back in October in commit 5b7da5c261d, but that got missed even though Daniel had mentioned it in the discussion thread.[1]


From the commit message and discussion of 5b7da5c261d, it looks like we do,
and I've seen some messages here and there showing that people do indeed have trouble applying patches due to spurious whitespace
and special characters.
So I assume the better solution would be to have such checks in meson too.

Re: split func.sgml to separated individual sgml files

From
Nazir Bilal Yavuz
Date:
Hi,

On Tue, 2 Sept 2025 at 17:54, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> Ah, you’re right, but then again,  I’d expect ALL_SGML to be used consistently, but it isn't and I didn't check.
> v3 does that.
> Note that GENERATED_SGML weren't included in these two targets but I think there's no harm in checking them too.
>
> Do we actually care about those? I don't want to add needless cycles anywhere. I note that the meson.build doesn't
> appear to have a check target at all, or anything that looks for hard tabs or nbsps. Those checks were added to the
> Makefile back in October in commit 5b7da5c261d, but that got missed even though Daniel had mentioned it in the
> discussion thread.[1]

I have been working on running these checks under the Meson build
system. To do this, I converted the checks into a Perl script
(sgml_syntax_check) and ran it under both the Makefile and Meson.
The test is named 'sgml_syntax_check' in Meson. One difference I
noticed: I could not find a way in Meson to create a test that does
not run by default. As a result, this syntax test runs every time you
run 'meson test'. This behaviour differs from Autoconf, but I
think it is acceptable.

Additionally, some of the CI OSes were missing docbook-xml, but it has
now been installed.

I did not create a new thread for that; I can create one if you think
that it would be better.

CI run with the attached patch applied:
https://cirrus-ci.com/build/6610354173640704

--
Regards,
Nazir Bilal Yavuz
Microsoft

Attachments

Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:
On 2025-09-12 Fr 10:12 AM, Nazir Bilal Yavuz wrote:
> Hi,
>
> On Tue, 2 Sept 2025 at 17:54, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Ah, you’re right, but then again,  I’d expect ALL_SGML to be used consistently, but it isn't and I didn't check.
>> v3 does that.
>> Note that GENERATED_SGML weren't included in these two targets but I think there's no harm in checking them too.
>>
>> Do we actually care about those? I don't want to add needless cycles anywhere. I note that the meson.build doesn't
>> appear to have a check target at all, or anything that looks for hard tabs or nbsps. Those checks were added to the
>> Makefile back in October in commit 5b7da5c261d, but that got missed even though Daniel had mentioned it in the
>> discussion thread.[1]
 
> I have been working on running these checks under the Meson build
> system.


Thanks for this!


> To do this, I converted the checks into a Perl script
> (sgml_syntax_check) and ran it against both the Makefile and Meson.
> Test's name is 'sgml_syntax_check' in the Meson. One difference I
> noticed: I could not find a way in Meson to create a test that does
> not run by default. As a result, this syntax test runs every time you
> run the 'meson test'. This behaviour differs from Autoconf, but I
> think it is acceptable.


Yes, I think so too.


>
> Additionally, some of the CI OSes were missing docbook-xml; but it has
> now been installed.
>
> I did not create a new thread for that, I can create one if you think
> that it would be better.
>
> CI run with the attached patch applied:
> https://cirrus-ci.com/build/6610354173640704
>

I am away this coming week, will check it out in detail when I return.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:
On 2025-09-12 Fr 10:12 AM, Nazir Bilal Yavuz wrote:
> Hi,
>
> On Tue, 2 Sept 2025 at 17:54, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Ah, you’re right, but then again,  I’d expect ALL_SGML to be used consistently, but it isn't and I didn't check.
>> v3 does that.
>> Note that GENERATED_SGML weren't included in these two targets but I think there's no harm in checking them too.
>>
>> Do we actually care about those? I don't want to add needless cycles anywhere. I note that the meson.build doesn't
>> appear to have a check target at all, or anything that looks for hard tabs or nbsps. Those checks were added to the
>> Makefile back in October in commit 5b7da5c261d, but that got missed even though Daniel had mentioned it in the
>> discussion thread.[1]
 
> I have been working on running these checks under the Meson build
> system. To do this, I converted the checks into a Perl script
> (sgml_syntax_check) and ran it against both the Makefile and Meson.
> Test's name is 'sgml_syntax_check' in the Meson. One difference I
> noticed: I could not find a way in Meson to create a test that does
> not run by default. As a result, this syntax test runs every time you
> run the 'meson test'. This behaviour differs from Autoconf, but I
> think it is acceptable.
>
> Additionally, some of the CI OSes were missing docbook-xml; but it has
> now been installed.
>
> I did not create a new thread for that, I can create one if you think
> that it would be better.
>
> CI run with the attached patch applied:
> https://cirrus-ci.com/build/6610354173640704


Hi Bilal,

This got preempted slightly by Tom's commit 170a8a3f460, but I think 
it's worth doing. I tried to simplify it some. See attached. There 
doesn't seem to me to be any point in using a different set of files for 
the tab tests and the NBSP tests. If we use the same set of files we can 
improve the efficiency easily by opening them only once. Here we just 
look for all the sgml files and all the xsl files and process them all.

WDYT?
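
The single-pass idea described above can be illustrated as follows. This is a rough sketch in Python, not the attached Perl script: the function names are hypothetical, and it assumes the sources are UTF-8 and that "nbsp" means U+00A0. Each file is opened once and every line is tested for both problems.

```python
from pathlib import Path
from typing import Iterable, List, Tuple

NBSP = "\u00a0"  # non-breaking space, U+00A0


def doc_sources(root: Path) -> List[Path]:
    """Collect every .sgml and .xsl file below `root`."""
    return sorted(p for p in root.rglob("*")
                  if p.is_file() and p.suffix in (".sgml", ".xsl"))


def find_whitespace_problems(paths: Iterable[Path]) -> List[Tuple[str, int, str]]:
    """Return (file, line number, kind) for each tab or NBSP found.

    Both checks share one pass, so each file is opened only once.
    """
    problems = []
    for path in paths:
        with open(path, encoding="utf-8") as fh:
            for line_no, line in enumerate(fh, start=1):
                if "\t" in line:
                    problems.append((str(path), line_no, "tab"))
                if NBSP in line:
                    problems.append((str(path), line_no, "nbsp"))
    return problems
```

Reporting the line number instead of echoing the line itself avoids output that looks empty when the offending line holds nothing but whitespace.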



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachments

Re: split func.sgml to separated individual sgml files

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 2025-09-12 Fr 10:12 AM, Nazir Bilal Yavuz wrote:
>> Test's name is 'sgml_syntax_check' in the Meson. One difference I
>> noticed: I could not find a way in Meson to create a test that does
>> not run by default. As a result, this syntax test runs every time you
>> run the 'meson test'. This behaviour differs from Autoconf, but I
>> think it is acceptable.

Actually, I've been meaning to complain about the fact that these
checks aren't run by the default Makefile target.  I never remember
that there is a separate "check" target, and even if I did remember
it's mostly useless to me because I always want to look at the
rendered HTML.  So when I'm working on the docs I always just say
"make" in the doc/src/sgml directory.  It'd be helpful, at least to
me, if the default target ran the tabs and nbsp checks.  It already
does run xmllint, so that change could probably be integrated with
what you've done here without too much trouble.

> This got preempted slightly by Tom's commit 170a8a3f460, but I think 
> it's worth doing. I tried to simplify it some. See attached. There 
> doesn't seem to me to be any point in using a different set of files for 
> the tab tests and the NBSP tests. If we use the same set of files we can 
> improve the efficiency easily by opening them only once. Here we just 
> look for all the sgml files and all the xsl files and process them all.

+1 for merging those two checks into one pass, especially if we're
to run them by default.

            regards, tom lane



Re: split func.sgml to separated individual sgml files

From
Nazir Bilal Yavuz
Date:
Hi,

On Tue, 30 Sept 2025 at 22:48, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> Hi Bilal,
>
> This got preempted slightly by Tom's commit 170a8a3f460, but I think
> it's worth doing. I tried to simplify it some. See attached. There
> doesn't seem to me to be any point in using a different set of files for
> the tab tests and the NBSP tests. If we use the same set of files we can
> improve the efficiency easily by opening them only once. Here we just
> look for all the sgml files and all the xsl files and process them all.
>
> WDYT?

It looks good to me. I made 2 changes to your patch:

1- The declaration of $line_no was lost; I re-added it.
2- s/.cirrus.tasks,yml/.cirrus.tasks.yml/ in the commit message.

-- 
Regards,
Nazir Bilal Yavuz
Microsoft

Attachments

Re: split func.sgml to separated individual sgml files

From
Nazir Bilal Yavuz
Date:
Hi,

On Wed, 1 Oct 2025 at 15:09, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> On Tue, 30 Sept 2025 at 22:48, Andrew Dunstan <andrew@dunslane.net> wrote:
> >
> > Hi Bilal,
> >
> > This got preempted slightly by Tom's commit 170a8a3f460, but I think
> > it's worth doing. I tried to simplify it some. See attached. There
> > doesn't seem to me to be any point in using a different set of files for
> > the tab tests and the NBSP tests. If we use the same set of files we can
> > improve the efficiency easily by opening them only once. Here we just
> > look for all the sgml files and all the xsl files and process them all.
> >
> > WDYT?
>
> It looks good to me. I made 2 changes to your patch:
>
> 1- Declaration of $line_no is lost, I re-added it.
> 2- s/.cirrus.tasks,yml/.cirrus.tasks.yml/ in the commit message.

Two more minor changes that I missed in the v2:

1- I added $line_no and removed $_ from the tab check's warning
message. I think it is better this way; otherwise, if the line
contains only a tab character, $_ will print an empty-looking line.
2- s/Tabsand/Tabs and/

-- 
Regards,
Nazir Bilal Yavuz
Microsoft

Attachments

Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:
On 2025-10-01 We 8:27 AM, Nazir Bilal Yavuz wrote:
> Hi,
>
> On Wed, 1 Oct 2025 at 15:09, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
>> On Tue, 30 Sept 2025 at 22:48, Andrew Dunstan <andrew@dunslane.net> wrote:
>>> Hi Bilal,
>>>
>>> This got preempted slightly by Tom's commit 170a8a3f460, but I think
>>> it's worth doing. I tried to simplify it some. See attached. There
>>> doesn't seem to me to be any point in using a different set of files for
>>> the tab tests and the NBSP tests. If we use the same set of files we can
>>> improve the efficiency easily by opening them only once. Here we just
>>> look for all the sgml files and all the xsl files and process them all.
>>>
>>> WDYT?
>> It looks good to me. I made 2 changes to your patch:
>>
>> 1- Declaration of $line_no is lost, I re-added it.
>> 2- s/.cirrus.tasks,yml/.cirrus.tasks.yml/ in the commit message.
> Two more minor changes that I missed in the v2:
>
> 1- I added $line_no and removed $_ from the tab check's warning
> message. I think it is better this way, otherwise if the line only
> contains tab character; $_ will print an empty looking line.
> 2- s/Tabsand/Tabs and/
>

OK, thanks, looks good. How do we go about doing what Tom wants (i.e.
running the tests by default) under meson? I think in the Makefile we
could just add it to the html target.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: split func.sgml to separated individual sgml files

From
Nazir Bilal Yavuz
Date:
Hi,

On Wed, 1 Oct 2025 at 23:02, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
> On 2025-10-01 We 8:27 AM, Nazir Bilal Yavuz wrote:
> > Hi,
> >
> > On Wed, 1 Oct 2025 at 15:09, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> >> On Tue, 30 Sept 2025 at 22:48, Andrew Dunstan <andrew@dunslane.net> wrote:
> >>> Hi Bilal,
> >>>
> >>> This got preempted slightly by Tom's commit 170a8a3f460, but I think
> >>> it's worth doing. I tried to simplify it some. See attached. There
> >>> doesn't seem to me to be any point in using a different set of files for
> >>> the tab tests and the NBSP tests. If we use the same set of files we can
> >>> improve the efficiency easily by opening them only once. Here we just
> >>> look for all the sgml files and all the xsl files and process them all.
> >>>
> >>> WDYT?
> >> It looks good to me. I made 2 changes to your patch:
> >>
> >> 1- Declaration of $line_no is lost, I re-added it.
> >> 2- s/.cirrus.tasks,yml/.cirrus.tasks.yml/ in the commit message.
> > Two more minor changes that I missed in the v2:
> >
> > 1- I added $line_no and removed $_ from the tab check's warning
> > message. I think it is better this way, otherwise if the line only
> > contains tab character; $_ will print an empty looking line.
> > 2- s/Tabsand/Tabs and/
> >
>
> OK, thanks, looks good. How do we go about doing what Tom wants (i.e.
> running the tests by default) under meson. I think in the Makefile we
> could just add it to the html target.

I might be misunderstanding, but these syntax checks already run by
default under meson build with this patch. Would we just need to add
this test to the HTML target in the Makefile?

-- 
Regards,
Nazir Bilal Yavuz
Microsoft



Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:
On 2025-10-02 Th 2:58 AM, Nazir Bilal Yavuz wrote:
> Hi,
>
> On Wed, 1 Oct 2025 at 23:02, Andrew Dunstan <andrew@dunslane.net> wrote:
>>
>> On 2025-10-01 We 8:27 AM, Nazir Bilal Yavuz wrote:
>>> Hi,
>>>
>>> On Wed, 1 Oct 2025 at 15:09, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
>>>> On Tue, 30 Sept 2025 at 22:48, Andrew Dunstan <andrew@dunslane.net> wrote:
>>>>> Hi Bilal,
>>>>>
>>>>> This got preempted slightly by Tom's commit 170a8a3f460, but I think
>>>>> it's worth doing. I tried to simplify it some. See attached. There
>>>>> doesn't seem to me to be any point in using a different set of files for
>>>>> the tab tests and the NBSP tests. If we use the same set of files we can
>>>>> improve the efficiency easily by opening them only once. Here we just
>>>>> look for all the sgml files and all the xsl files and process them all.
>>>>>
>>>>> WDYT?
>>>> It looks good to me. I made 2 changes to your patch:
>>>>
>>>> 1- Declaration of $line_no is lost, I re-added it.
>>>> 2- s/.cirrus.tasks,yml/.cirrus.tasks.yml/ in the commit message.
>>> Two more minor changes that I missed in the v2:
>>>
>>> 1- I added $line_no and removed $_ from the tab check's warning
>>> message. I think it is better this way, otherwise if the line only
>>> contains tab character; $_ will print an empty looking line.
>>> 2- s/Tabsand/Tabs and/
>>>
>> OK, thanks, looks good. How do we go about doing what Tom wants (i.e.
>> running the tests by default) under meson. I think in the Makefile we
>> could just add it to the html target.
> I might be misunderstanding, but these syntax checks already run by
> default under meson build with this patch. Would we just need to add
> this test to the HTML target in the Makefile?
>

Oh, ok, I missed that about meson. I will adjust the Makefile.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: split func.sgml to separated individual sgml files

From
Nazir Bilal Yavuz
Date:
Hi,

On Thu, 2 Oct 2025 at 15:27, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> Oh, ok, I missed that about meson. I will adjust the Makefile.

I think there is one more problem that we need to think about. This
test runs when xmllint is enabled, but it also requires docbook
(docbook-xml on some OSes) to be installed; otherwise the test fails
with 'I/O error : Attempt to load network entity
http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd'. I think that
we need to skip this test if docbook cannot be found on the
system. Otherwise that would be a hassle for most people and
buildfarm members. What do you think about this?

-- 
Regards,
Nazir Bilal Yavuz
Microsoft



Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:


On 2025-10-02 Th 8:52 AM, Nazir Bilal Yavuz wrote:
> Hi,
>
> On Thu, 2 Oct 2025 at 15:27, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Oh, ok, I missed that about meson. I will adjust the Makefile.
> I think there is one more problem that we need to think about. This
> test runs when the xmllint is enabled but it also requires docbook
> (docbook-xml on some OSes) to be installed, otherwise the test fails
> with 'I/O error : Attempt to load network entity
> http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd'. I think that
> we need to skip this test if the docbook can not be found in the
> system. Otherwise that would be a hassle for most of the people and
> buildfarm members. What do you think about this?


Oops, missed seeing this earlier. Yes, I think we need to skip the test in the meson case. Probably nothing more needed for the Makefile.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: split func.sgml to separated individual sgml files

From
Peter Eisentraut
Date:
On 01.10.25 22:02, Andrew Dunstan wrote:
> 
(Maybe these discussions could have been in a new thread and not hidden 
under some unrelated thing.)
> OK, thanks, looks good. How do we go about doing what Tom wants (i.e. 
> running the tests by default) under meson. I think in the Makefile we 
> could just add it to the html target.

-html: html-stamp
+html: check html-stamp

This is not a good solution.  This means the html target is never up to 
date.  Compare PostgreSQL 18:

$ make html
make: Nothing to be done for 'html'.
$ make -q html; echo $?
0

And master:

$ make html
perl ...
$ make -q html; echo $?
1
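
The comparison above relies on GNU make's question mode (`-q`), whose exit status is 0 when the target is up to date, 1 when something would be rebuilt, and 2 on error. A small sketch of scripting that check follows; the helper names are hypothetical and nothing here comes from the PostgreSQL build system itself.

```python
import subprocess

# Exit statuses documented for GNU make when run with -q / --question.
QUESTION_STATUS = {0: "up to date", 1: "needs rebuild", 2: "error"}


def describe_question_status(returncode: int) -> str:
    """Translate a `make -q` exit status into a readable label."""
    return QUESTION_STATUS.get(returncode, f"unexpected status {returncode}")


def target_state(target: str, make: str = "make", cwd: str = ".") -> str:
    """Run `make -q <target>` in `cwd` and describe the result."""
    result = subprocess.run([make, "-q", target], cwd=cwd,
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return describe_question_status(result.returncode)
```

Calling `target_state("html", cwd="doc/src/sgml")` right after a successful build would be expected to report "up to date"; with the `html: check html-stamp` change shown above it would report "needs rebuild" every time, which is the regression being pointed out.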

Also, consider the postgres-full.xml target:

# Run validation only once, common to all subsequent targets.  While
# we're at it, also resolve all entities (that is, copy all included
# files into one big file).  This helps tools that don't understand
# vpath builds (such as dbtoepub).
postgres-full.xml: postgres.sgml $(ALL_SGML)
     $(XMLLINT) $(XMLINCLUDE) --output $@ --noent --valid $<

Note that this already does validation.  The way this is structured now
is that it runs the validation once when you create postgres-full.xml,
which is then later input into the HTML generation, and then you run the
validation again, on the already-processed input files, which doesn't
make any sense.

I suspect what you're really after here is the functionality of the 
check-tabs and check-nbsp targets.  So the new Perl script really just 
has to cover those two and doesn't have to bother with xmllint.  And 
then you just call that script as part of the postgres-full.xml target.




Re: split func.sgml to separated individual sgml files

From
Tom Lane
Date:
Peter Eisentraut <peter@eisentraut.org> writes:
> I suspect what you're really after here is the functionality of the 
> check-tabs and check-nbsp targets.  So the new Perl script really just 
> has to cover those two and doesn't have to bother with xmllint.  And 
> then you just call that script as part of the postgres-full.xml target.

Yeah, that's what I was imagining: replace the xmllint call in
postgres-full.xml with this new script that will also run the
tab/nbsp checks.

            regards, tom lane



Re: split func.sgml to separated individual sgml files

From
Nazir Bilal Yavuz
Date:
Hi,

On Thu, 2 Oct 2025 at 21:43, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> On 2025-10-02 Th 8:52 AM, Nazir Bilal Yavuz wrote:
>
> I think there is one more problem that we need to think about. This
> test runs when the xmllint is enabled but it also requires docbook
> (docbook-xml on some OSes) to be installed, otherwise the test fails
> with 'I/O error : Attempt to load network entity
> http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd'. I think that
> we need to skip this test if the docbook can not be found in the
> system. Otherwise that would be a hassle for most of the people and
> buildfarm members. What do you think about this?
>
>
> Oops, missed seeing this earlier. Yes, I think we need to skip the test in the meson case. Probably nothing more needed for the Makefile.
 

Here is the patch which does that. It adds a basic check for
docbook; if docbook cannot be found, meson skips the
test.

-- 
Regards,
Nazir Bilal Yavuz
Microsoft

Attachments

Re: split func.sgml to separated individual sgml files

From
Nazir Bilal Yavuz
Date:
Hi,

On Thu, 2 Oct 2025 at 23:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Peter Eisentraut <peter@eisentraut.org> writes:
> > I suspect what you're really after here is the functionality of the
> > check-tabs and check-nbsp targets.  So the new Perl script really just
> > has to cover those two and doesn't have to bother with xmllint.  And
> > then you just call that script as part of the postgres-full.xml target.
>
> Yeah, that's what I was imagining: replace the xmllint call in
> postgres-full.xml with this new script that will also run the
> tab/nbsp checks.

Doesn't this mean we cannot run the syntax check by itself in the
make builds? If I understand correctly, we would need to create
postgres-full.xml each time we want to run the syntax check, right?

I was under the impression that the sgml_syntax_check.pl test would be
a lightweight way to do a syntax check, so that we could easily use it
by itself or in the CI.

-- 
Regards,
Nazir Bilal Yavuz
Microsoft



Re: split func.sgml to separated individual sgml files

From
Peter Eisentraut
Date:
On 03.10.25 13:48, Nazir Bilal Yavuz wrote:
> On Thu, 2 Oct 2025 at 23:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Peter Eisentraut <peter@eisentraut.org> writes:
>>> I suspect what you're really after here is the functionality of the
>>> check-tabs and check-nbsp targets.  So the new Perl script really just
>>> has to cover those two and doesn't have to bother with xmllint.  And
>>> then you just call that script as part of the postgres-full.xml target.
>>
>> Yeah, that's what I was imagining: replace the xmllint call in
>> postgres-full.xml with this new script that will also run the
>> tab/nbsp checks.
> 
> Doesn't this mean we cannot run the syntax check by itself in the
> make builds? If I understand correctly, we would need to create
> postgres-full.xml each time we want to run the syntax check, right?

If you look at this more closely, creating postgres-full.xml and running 
the syntax check perform the same operations, except that the latter 
throws away the output.  So it seems redundant to build a whole new code 
path for this.  I think you can make the check target dependent on 
postgres-full.xml and be done, kind of like this (starting from 
pre-b2922562726):

diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile
index b53b2694a6b..574ae7b3984 100644
--- a/doc/src/sgml/Makefile
+++ b/doc/src/sgml/Makefile
@@ -69,8 +69,12 @@ ALL_IMAGES := $(wildcard $(srcdir)/images/*.svg)
  # files into one big file).  This helps tools that don't understand
  # vpath builds (such as dbtoepub).
  postgres-full.xml: postgres.sgml $(ALL_SGML)
+    $(MAKE) check-tabs check-nbsp
      $(XMLLINT) $(XMLINCLUDE) --output $@ --noent --valid $<

+# Quick syntax check without style processing
+check: postgres-full.xml
+

  ##
  ## Man pages
@@ -195,15 +199,6 @@ MAKEINFO = makeinfo
      $(MAKEINFO) --enable-encoding --no-split --no-validate $< -o $@


-##
-## Check
-##
-
-# Quick syntax check without style processing
-check: postgres.sgml $(ALL_SGML) check-tabs check-nbsp
-    $(XMLLINT) $(XMLINCLUDE) --noout --valid $<
-
-
  ##
  ## Install
  ##




Re: split func.sgml to separated individual sgml files

From
Tom Lane
Date:
Peter Eisentraut <peter@eisentraut.org> writes:
> If you look at this more closely, creating postgres-full.xml and running 
> the syntax check perform the same operations, except that the latter 
> throws away the output.  So it seems redundant to build a whole new code 
> path for this.  I think you can make the check target dependent on 
> postgres-full.xml and be done, kind of like this (starting from 
> pre-b2922562726):

Would it be unreasonable to discard the "check" target altogether?
It made sense back in the day when actually building the html docs
took many minutes.  But I haven't used it in years, so I wonder
if anyone else has either.

            regards, tom lane



Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:


On 2025-10-03 Fr 10:41 AM, Tom Lane wrote:
> Peter Eisentraut <peter@eisentraut.org> writes:
>> If you look at this more closely, creating postgres-full.xml and running
>> the syntax check perform the same operations, except that the latter
>> throws away the output.  So it seems redundant to build a whole new code
>> path for this.  I think you can make the check target dependent on
>> postgres-full.xml and be done, kind of like this (starting from
>> pre-b2922562726):
>
> Would it be unreasonable to discard the "check" target altogether?
> It made sense back in the day when actually building the html docs
> took many minutes.  But I haven't used it in years, so I wonder
> if anyone else has either.


I have no objection. We'll need to work out what we're doing on the meson side, which is kinda where we came in ...


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: split func.sgml to separated individual sgml files

From
Nazir Bilal Yavuz
Date:
Hi,

On Fri, 3 Oct 2025 at 18:47, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> On 2025-10-03 Fr 10:41 AM, Tom Lane wrote:
>
> Peter Eisentraut <peter@eisentraut.org> writes:
>
> If you look at this more closely, creating postgres-full.xml and running
> the syntax check perform the same operations, except that the latter
> throws away the output.  So it seems redundant to build a whole new code
> path for this.  I think you can make the check target dependent on
> postgres-full.xml and be done, kind of like this (starting from
> pre-b2922562726):
>
> Would it be unreasonable to discard the "check" target altogether?
> It made sense back in the day when actually building the html docs
> took many minutes.  But I haven't used it in years, so I wonder
> if anyone else has either.
>
> I have no objection. We'll need to work out what we're doing on the meson side, which is kinda where we came in ...

I can work on this but I want to clarify it first. Which one do you prefer:

1- We won't have any command to do syntax checks (including tab and
nbsp), these checks will automatically run when we generate docs.

2- We will have a 'check' target but it will only do tab and nbsp
checks; xmllint will run only when generating the docs.

-- 
Regards,
Nazir Bilal Yavuz
Microsoft



Re: split func.sgml to separated individual sgml files

From
Peter Eisentraut
Date:
On 06.10.25 10:29, Nazir Bilal Yavuz wrote:
> Hi,
> 
> On Fri, 3 Oct 2025 at 18:47, Andrew Dunstan <andrew@dunslane.net> wrote:
>>
>> On 2025-10-03 Fr 10:41 AM, Tom Lane wrote:
>>
>> Peter Eisentraut <peter@eisentraut.org> writes:
>>
>> If you look at this more closely, creating postgres-full.xml and running
>> the syntax check perform the same operations, except that the latter
>> throws away the output.  So it seems redundant to build a whole new code
>> path for this.  I think you can make the check target dependent on
>> postgres-full.xml and be done, kind of like this (starting from
>> pre-b2922562726):
>>
>> Would it be unreasonable to discard the "check" target altogether?
>> It made sense back in the day when actually building the html docs
>> took many minutes.  But I haven't used it in years, so I wonder
>> if anyone else has either.
>>
>> I have no objection. We'll need to work out what we're doing on the meson side, which is kinda where we came in ...
> 
> I can work on this but I want to clarify it first. Which one do you prefer:
> 
> 1- We won't have any command to do syntax checks (including tab and
> nbsp), these checks will automatically run when we generate docs.
> 
> 2- We will have a 'check' target but it will only do tab and nbsp
> checks; xmllint will run only when generating the docs.

I don't know, people have a lot of individual workflows, and they are 
not reading this thread.  I still don't know what we are actually trying 
to fix here, I just noticed that what was committed is flawed.

I would prefer that b2922562726 be reverted, and then someone start a 
new thread with a descriptive change proposal.




Re: split func.sgml to separated individual sgml files

From
Nazir Bilal Yavuz
Date:
Hi,

On Mon, 6 Oct 2025 at 11:54, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 06.10.25 10:29, Nazir Bilal Yavuz wrote:
> >
> > I can work on this but I want to clarify it first. Which one do you prefer:
> >
> > 1- We won't have any command to do syntax checks (including tab and
> > nbsp), these checks will automatically run when we generate docs.
> >
> > 2- We will have a 'check' target but it will only do tab and nbsp
> > checks; xmllint will run only when generating the docs.
>
> I don't know, people have a lot of individual workflows, and they are
> not reading this thread.  I still don't know what we are actually trying
> to fix here, I just noticed that what was committed is flawed.

The problem was that the meson build doesn't have the tab and nbsp
checks [1]. We were trying to enable them in the meson build by moving
the checks into a Perl script that both build systems can run.

> I would prefer that b2922562726 be reverted, and then someone start a
> new thread with a descriptive change proposal.

Sounds good to me. I can create a new thread if it gets reverted.

[1] https://www.postgresql.org/message-id/7020df24-1d5f-41e5-8948-2e8d5da57935%40dunslane.net

-- 
Regards,
Nazir Bilal Yavuz
Microsoft



Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:
On 2025-10-06 Mo 6:44 AM, Nazir Bilal Yavuz wrote:
> Hi,
>
> On Mon, 6 Oct 2025 at 11:54, Peter Eisentraut <peter@eisentraut.org> wrote:
>> On 06.10.25 10:29, Nazir Bilal Yavuz wrote:
>>> I can work on this but I want to clarify it first. Which one do you prefer:
>>>
>>> 1- We won't have any command to do syntax checks (including tab and
>>> nbsp), these checks will automatically run when we generate docs.
>>>
>>> 2- We will have a 'check' target but it will only do tab and nbsp
>>> checks; xmllint will run only when generating the docs.
>> I don't know, people have a lot of individual workflows, and they are
>> not reading this thread.  I still don't know what we are actually trying
>> to fix here, I just noticed that what was committed is flawed.
> The problem was that the meson build doesn't have the tab and nbsp
> checks [1]. We were trying to enable them in the meson build by moving
> the checks into a Perl script that both build systems can run.
>
>> I would prefer that b2922562726 be reverted, and then someone start a
>> new thread with a descriptive change proposal.
> Sounds good to me. I can create a new thread if it gets reverted.
>
> [1] https://www.postgresql.org/message-id/7020df24-1d5f-41e5-8948-2e8d5da57935%40dunslane.net



OK, reverted.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: split func.sgml to separated individual sgml files

From
Bruce Momjian
Date:
On Fri, Oct  3, 2025 at 10:41:56AM -0400, Tom Lane wrote:
> Peter Eisentraut <peter@eisentraut.org> writes:
> > If you look at this more closely, creating postgres-full.xml and running 
> > the syntax check perform the same operations, except that the latter 
> > throws away the output.  So it seems redundant to build a whole new code 
> > path for this.  I think you can make the check target dependent on 
> > postgres-full.xml and be done, kind of like this (starting from 
> > pre-b2922562726):
> 
> Would it be unreasonable to discard the "check" target altogether?
> It made sense back in the day when actually building the html docs
> took many minutes.  But I haven't used it in years, so I wonder
> if anyone else has either.

I run 'make check' on the SGML every time I build the C code.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.



Re: split func.sgml to separated individual sgml files

From
Bruce Momjian
Date:
On Mon, Oct  6, 2025 at 10:55:53AM -0400, Bruce Momjian wrote:
> On Fri, Oct  3, 2025 at 10:41:56AM -0400, Tom Lane wrote:
> > Peter Eisentraut <peter@eisentraut.org> writes:
> > > If you look at this more closely, creating postgres-full.xml and running 
> > > the syntax check perform the same operations, except that the latter 
> > > throws away the output.  So it seems redundant to build a whole new code 
> > > path for this.  I think you can make the check target dependent on 
> > > postgres-full.xml and be done, kind of like this (starting from 
> > > pre-b2922562726):
> > 
> > Would it be unreasonable to discard the "check" target altogether?
> > It made sense back in the day when actually building the html docs
> > took many minutes.  But I haven't used it in years, so I wonder
> > if anyone else has either.
> 
> I run 'make check' on the SGML every time I build the C code.

Uh, more accurately I run:

    make --silent postgres.sgml
    make --silent check
    make check-tabs

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.



Re: split func.sgml to separated individual sgml files

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Uh, more accurately I run:

>     make --silent postgres.sgml
>     make --silent check
>     make check-tabs

If we included the tabs/nbsp checks in the normal build, then the
first of those would cover everything.  Even as it is, I don't
think the "make check" step is adding anything.

            regards, tom lane



Re: split func.sgml to separated individual sgml files

From
Álvaro Herrera
Date:
On 2025-Oct-03, Tom Lane wrote:

> Would it be unreasonable to discard the "check" target altogether?
> It made sense back in the day when actually building the html docs
> took many minutes.  But I haven't used it in years, so I wonder
> if anyone else has either.

I wouldn't particularly appreciate that.  Doing "make check" takes 0.6
seconds for me, while the HTML build is 28 seconds.  It's quite a
difference.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"This is a foot just waiting to be shot"                (Andrew Dunstan)



Re: split func.sgml to separated individual sgml files

From
Bruce Momjian
Date:
On Mon, Oct  6, 2025 at 11:13:24AM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Uh, more accurately I run:
> 
> >     make --silent postgres.sgml
> >     make --silent check
> >     make check-tabs
> 
> If we included the tabs/nbsp checks in the normal build, then the
> first of those would cover everything.  Even as it is, I don't
> think the "make check" step is adding anything.

Looking at my test code, I do

    $ make postgres.sgml
    make: Nothing to be done for 'postgres.sgml'.

and my shell comment says it is there so that configure runs and I can
check that it works first, but it looks like it now does nothing.

I agree the "make --silent check-tabs" doesn't add anything because that
is already part of 'make check'.

I still would like to run checks without building the HTML.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.



Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:
On 2025-10-06 Mo 12:00 PM, Álvaro Herrera wrote:
> On 2025-Oct-03, Tom Lane wrote:
>
>> Would it be unreasonable to discard the "check" target altogether?
>> It made sense back in the day when actually building the html docs
>> took many minutes.  But I haven't used it in years, so I wonder
>> if anyone else has either.
> I wouldn't particularly appreciate that.  Doing "make check" takes 0.6
> seconds for me, while the HTML build is 28 seconds.  It's quite a
> difference.
>

OK, so I think that one's not going to fly. We could keep the check 
target and also run the checks as part of building postgres-full.xml.

It's less clear to me how to do that in meson, though, since you can 
only have a single command in a custom target.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: split func.sgml to separated individual sgml files

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> OK, so I think that one's not going to fly. We could keep the check 
> target and also run the checks as part of building postgres-full.xml.

Works for me.

            regards, tom lane



Re: split func.sgml to separated individual sgml files

From
Andres Freund
Date:
Hi,

On 2025-10-07 14:39:44 -0400, Andrew Dunstan wrote:
> It's less clear to me how to do that in meson, though, since you can only
> have a single command in a custom target.

Create a stamp file for the check success and make that a dependency of
the main build too.

Greetings,

Andres Freund
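
A minimal sketch of the stamp-file idea, assuming the check is wrapped in a small shell helper (`run_check_with_stamp` is a hypothetical name, not something in the tree). The helper would be the single command of a meson custom target whose declared output is the stamp file, and the docs target would then list that stamp among its dependencies.

```shell
#!/bin/sh
# Hypothetical helper: run a check command and record success in a stamp file.
# Usage: run_check_with_stamp STAMP CMD [ARGS...]
run_check_with_stamp() {
    stamp=$1
    shift
    # Remove any stale stamp first so a failed run never looks successful.
    rm -f "$stamp"
    # Run the check; leave no stamp behind if it fails.
    "$@" || return 1
    # Create (or truncate) the stamp only after the check passed.
    : > "$stamp"
}
```

Because the stamp only exists after a passing run, a build that depends on it cannot proceed past a failed check, while the check itself can still be invoked on its own.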



Re: split func.sgml to separated individual sgml files

From
Nazir Bilal Yavuz
Date:
Hi,

On Mon, 6 Oct 2025 at 13:44, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
>
> Sounds good to me. I can create a new thread if it gets reverted.

I created a new thread [1] and tried to apply the recent feedback from this thread.

[1] https://postgr.es/m/CAN55FZ1qzoDcaKqsR3DwE%3DX6FL%2Bwpm%2B%3DKLvH6ahrRXNhjU53DQ%40mail.gmail.com

-- 
Regards,
Nazir Bilal Yavuz
Microsoft