Discussion: Improve docs syntax checking and enable it in the meson build

Improve docs syntax checking and enable it in the meson build

From
Nazir Bilal Yavuz
Date:
Hi,

The Meson build did not include tab and non-breaking space checks for
the docs. The attached patch adds these checks and includes a few
related improvements.

This topic was previously discussed toward the end of another thread
[1], but it was decided that a separate thread would be better, so I
am continuing the discussion here.

These checks were previously done in the Makefile:

```
# tabs are harmless, but it is best to avoid them in SGML files
check-tabs:
	@( ! grep '	' $(wildcard $(srcdir)/*.sgml $(srcdir)/func/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl) ) || \
	(echo "Tabs appear in SGML/XML files" 1>&2;  exit 1)

# Non-breaking spaces are harmless, but it is best to avoid them in SGML files.
# Use perl command because non-GNU grep or sed could not have hex escape sequence.
check-nbsp:
	@ ( $(PERL) -ne '/\xC2\xA0/ and print("$$ARGV:$$_"),$$n++; END {exit($$n>0)}' \
	  $(wildcard $(srcdir)/*.sgml $(srcdir)/func/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl $(srcdir)/images/*.xsl) ) || \
	(echo "Non-breaking spaces appear in SGML/XML files" 1>&2;  exit 1)
```

I moved these checks to a new Perl script called sgml_syntax_check.pl.
This script can also perform xmllint validation (when possible).
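
For readers who want to see the logic of the two checks in one place, here
is a rough sketch of what a combined tab and nbsp scan might look like. It
is written in Python purely for illustration; the script in the patch is
Perl and is not reproduced in this thread. The names find_forbidden_bytes
and main are hypothetical, and the only things taken from the Makefile
rules above are the two byte patterns (a tab, and the UTF-8 bytes C2 A0).

```python
TAB = b"\t"
NBSP = b"\xc2\xa0"  # UTF-8 encoding of U+00A0, the bytes matched by the Perl one-liner


def find_forbidden_bytes(paths):
    """Return (path, line_number, kind) for each line containing a tab or nbsp.

    Hypothetical helper: files are read as raw bytes so the scan does not
    depend on the locale or on the files being valid UTF-8.
    """
    problems = []
    for path in paths:
        with open(path, "rb") as handle:
            for line_number, line in enumerate(handle, start=1):
                if TAB in line:
                    problems.append((str(path), line_number, "tab"))
                if NBSP in line:
                    problems.append((str(path), line_number, "nbsp"))
    return problems


def main(paths):
    """Print one line per problem and return a shell-style exit status."""
    problems = find_forbidden_bytes(paths)
    for path, line_number, kind in problems:
        print(f"{path}:{line_number}: {kind} found")
    return 1 if problems else 0
```

Returning a non-zero status when anything is found mirrors the "exit 1"
behavior of the Makefile rules, so a build system can treat the scan as a
pass/fail step.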

Here is a summary of the changes:

1 - A new sgml_syntax_check.pl script was added to handle tab, nbsp,
and xmllint validation checks.
1.1 - It is registered as the sgml_syntax_check test in the Meson build.
1.2 - These checks are run when executing 'make check' or 'meson test
sgml_syntax_check' commands.
1.3 - During the creation of postgres-full.xml, the script performs
tab and nbsp checks. The xmllint check is skipped there, since
validation is already handled by the --valid option. So, we do not run
the same check twice.

2 - The sgml_syntax_check test runs by default in the Meson build.
2.1 - Tab and nbsp checks always run.
2.2 - The xmllint validation and the test are skipped if DocBook
cannot be found. I was not able to achieve the same behavior in the
autoconf build, so the test is not run by default there. The Make
build continues to work as before; you can run the checks manually via
'make check' in doc/src/sgml.

[1]
https://www.postgresql.org/message-id/flat/CACJufxFgAh1--EMwOjMuANe%3DVTmjkNaZjH%2BAzSe04-8ZCGiESA%40mail.gmail.com

-- 
Regards,
Nazir Bilal Yavuz
Microsoft


Re: Improve docs syntax checking and enable it in the meson build

From
Nazir Bilal Yavuz
Date:
Hi,

On Tue, 7 Oct 2025 at 16:12, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
>
> The Meson build did not include tab and non-breaking space checks for
> the docs. The attached patch adds these checks and includes a few
> related improvements.

I have updated v6 to use a stamp file, as Andres suggested [1], to
ensure a dependency between the syntax check and the postgres-full.xml
file in the meson build.
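
As a general illustration of the stamp-file idea (not the actual meson
wiring from the patch, which is not shown in this thread): a check step
produces no real output, so it writes an empty marker file on success, and
later build steps depend on that marker. The sketch below is Python, and
run_check_with_stamp and its parameters are hypothetical names.

```python
from pathlib import Path


def run_check_with_stamp(sources, stamp_path, check):
    """Run check(sources) and touch stamp_path only if no problems are reported.

    'check' is any callable returning a list of problems (empty means OK).
    All names here are illustrative assumptions, not taken from the patch.
    """
    stamp = Path(stamp_path)
    problems = check(sources)
    if problems:
        # Drop a stale stamp so dependent steps are not treated as up to date.
        stamp.unlink(missing_ok=True)
        return False
    stamp.parent.mkdir(parents=True, exist_ok=True)
    stamp.touch()
    return True
```

A build system can then list the stamp file as an input of the step that
generates postgres-full.xml, so that step runs only after the check has
passed.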

[1] https://postgr.es/m/tcjetkmnm4vtuyxakqvkqokvow6csjokdwwtplc5nl4zbpyjoo%40jjfhsuqa6fno

-- 
Regards,
Nazir Bilal Yavuz
Microsoft


Re: Improve docs syntax checking and enable it in the meson build

From
Peter Eisentraut
Date:
On 07.10.25 15:12, Nazir Bilal Yavuz wrote:
> 1 - A new sgml_syntax_check.pl script was added to handle tab, nbsp,
> and xmllint validation checks.
> 1.1 - It is registered as the sgml_syntax_check test in the Meson build.
> 1.2 - These checks are run when executing 'make check' or 'meson test
> sgml_syntax_check' commands.
> 1.3 - During the creation of postgres-full.xml, the script performs
> tab and nbsp checks. The xmllint check is skipped there, since
> validation is already handled by the --valid option. So, we do not run
> the same check twice.

I think including the xmllint support in the new sgml_syntax_check is 
overkill, since the normal build already runs xmllint, or you could 
alternatively just write it into the build description file (makefile or 
meson.build).  The build commands should be visible in the build 
description file, not layered into some other script.

I suggest the following approach:

- Change sgml_syntax_check.pl into a smaller script that just checks for 
tabs and nbsp.  (Maybe a different name then.)

- Add a call of that script to the build of postgres-full.xml.

- Change the "check" target to just depend on postgres-full.xml, without 
its own commands.

And then replicate that logic in meson.