XML: Single root element

From: Jürgen Purtz
Subject: XML: Single root element
Msg-id: f8b59177-1251-8813-541f-73383aa744f5@purtz.de
List: pgsql-docs
Some time ago we converted our documentation from SGML to XML in one 
big step. Most of the resulting files are well-formed, but not all. The 
well-formedness criterion is violated by files that contain more than 
one root element. You can locate such files with the command:

xmllint --noout *.sgml ref/*.sgml 2> >(grep Extra)
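
Where xmllint is not at hand, roughly the same check can be sketched in 
Python. This is only an illustration and not part of the patch: the 
helper name has_extra_root is made up, and blanking out entity 
references such as &version; before parsing is a simplifying assumption, 
because those entities are defined in other files.

```python
import re
from xml.parsers import expat

# Entity references such as &version; are defined elsewhere, so they are
# blanked out before a file is parsed in isolation (an assumption of this
# sketch). The five predefined XML entities are left alone.
_UNDEFINED_ENTITY = re.compile(r"&(?!(?:amp|lt|gt|quot|apos);)[A-Za-z_][\w.-]*;")

# Expat's error code for content that follows the root element.
_JUNK_AFTER_ROOT = expat.errors.codes[expat.errors.XML_ERROR_JUNK_AFTER_DOC_ELEMENT]


def has_extra_root(text: str) -> bool:
    """Return True if markup follows the first root element."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(_UNDEFINED_ENTITY.sub("", text), True)
    except expat.ExpatError as err:
        if err.code == _JUNK_AFTER_ROOT:
            return True
        raise  # some other well-formedness problem
    return False


old_legal = "<date>2019</date>\n<copyright><year>1996-2019</year></copyright>\n"
new_legal = "<bookinfo><productnumber>&version;</productnumber></bookinfo>\n"
print(has_extra_root(old_legal), has_extra_root(new_legal))
```

Any other parse error is re-raised on purpose, so that unrelated 
well-formedness problems are not silently counted as "one root only".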

At present this is not a serious problem. But for further XML processing 
(parsing, the DocBook upgrade to version 5.x, use of an XML editor, 
XInclude, XPath, namespaces, ...) it is necessary, or at least very 
helpful, to turn every single file, in a manual step, into a 
*well-formed* XML file, in particular one with a single root element. 
The attached patch results from applying different strategies to achieve 
this aim.

Strategy 1: Move the element that encloses the 'calling' entity 
reference in the outer file into the included file, where it becomes 
the new top-level element. 
Example 'legal.sgml':

Actual situation
================
postgres.sgml:
<book id="postgres">
  <title>PostgreSQL &version; Documentation</title>

  <bookinfo>
   <corpauthor>The PostgreSQL Global Development Group</corpauthor>
   <productname>PostgreSQL</productname>
   <productnumber>&version;</productnumber>
   &legal;
  </bookinfo>
  ...


legal.sgml:
<date>2019</date>

<copyright>
  <year>1996-2019</year>
  <holder>The PostgreSQL Global Development Group</holder>
</copyright>

<legalnotice id="legalnotice">
...
</legalnotice>
-- End of File --

New situation
=============
postgres.sgml:
<book id="postgres">
  <title>PostgreSQL &version; Documentation</title>

  &legal;
  ...


legal.sgml:
<bookinfo>
  <corpauthor>The PostgreSQL Global Development Group</corpauthor>
  <productname>PostgreSQL</productname>
  <productnumber>&version;</productnumber>
  <date>2019</date>

  <copyright>
   <year>1996-2019</year>
   <holder>The PostgreSQL Global Development Group</holder>
  </copyright>

  <legalnotice id="legalnotice">
  ...
  </legalnotice>
</bookinfo>
-- End of File --

Some individual files change, but the intermediate file (or the 
in-memory representation) that results from resolving all entities 
remains unchanged. This intermediate result is the basis for all further 
steps such as validation or output generation.
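
The claim that the resolved document stays the same can be checked with 
a small Python sketch. It is a simplification under stated assumptions: 
internal entities stand in for the external file legal.sgml, the version 
value "X" is a placeholder, the documents are shortened, and the helper 
name resolved is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Internal entities stand in for the included file legal.sgml.
# The version value "X" is a placeholder, not a real release number.
OLD = """<!DOCTYPE book [
  <!ENTITY version "X">
  <!ENTITY legal "<date>2019</date><copyright><year>1996-2019</year></copyright>">
]>
<book id="postgres">
  <bookinfo>
    <productnumber>&version;</productnumber>
    &legal;
  </bookinfo>
</book>"""

NEW = """<!DOCTYPE book [
  <!ENTITY version "X">
  <!ENTITY legal "<bookinfo><productnumber>&version;</productnumber><date>2019</date><copyright><year>1996-2019</year></copyright></bookinfo>">
]>
<book id="postgres">
  &legal;
</book>"""


def resolved(doc: str) -> list:
    """Parse with entities expanded; list (tag, attributes, text) per element."""
    root = ET.fromstring(doc)
    return [(el.tag, dict(el.attrib), (el.text or "").strip()) for el in root.iter()]


# Both variants should yield the same element sequence after expansion.
print(resolved(OLD) == resolved(NEW))
```

Whitespace between elements is stripped before the comparison, since 
moving the bookinfo element changes indentation but not content.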

Strategy 2: The release notes files consist of many sect1 elements at 
the top level. To overcome this, one could change sect1 to sect2, sect2 
to sect3, ... and use a new sect1 element as a wrapper around the 
complete file. But the chain of sect<n> elements is limited to 5 levels, 
and in some cases we already use all of them. Therefore it is necessary 
to change the markup from sect<n> elements to section elements, which 
can be nested recursively without limit.
This strategy changes the visual representation of the TOC, because 
every title element shifts one level down. (In my opinion this is an 
improvement because a: after clicking on 'Release Notes' we currently 
get 372 items plus their sub-items. This will be reduced to one item 
per major release: 11, 10, 9.6, 9.5, ... and b: the acknowledgement 
element is shown, as intended, once per complete major release, not only 
with the very first version of a release.) Furthermore, the standard 
HTML output then has exactly one HTML file per major release.
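
The renaming described in strategy 2 could be sketched in Python as 
follows. This is not how the attached patch was produced. It is a 
minimal illustration that works on the raw text, so that entity 
references and formatting survive. The function name, the wrapper id and 
the wrapper title are hypothetical.

```python
import re

# Matches the start of opening and closing sect1..sect5 tags.
_SECT_TAG = re.compile(r"<(/?)sect[1-5]\b")


def wrap_release_notes(text: str, wrapper_id: str, title: str) -> str:
    """Rename sect1..sect5 tags to section and wrap the file in one section."""
    body = _SECT_TAG.sub(r"<\1section", text).rstrip("\n")
    return f'<section id="{wrapper_id}">\n<title>{title}</title>\n{body}\n</section>\n'


# Hypothetical, shortened release notes file with two top-level sect1 elements.
SAMPLE = (
    '<sect1 id="release-11-1"><title>Release 11.1</title>'
    "<sect2><title>Changes</title></sect2></sect1>\n"
    '<sect1 id="release-11"><title>Release 11</title></sect1>\n'
)

print(wrap_release_notes(SAMPLE, "release-11-notes", "Release 11"))
```

A plain text substitution like this ignores comments and CDATA sections 
that happen to contain sect<n> tags, so the result would still need a 
manual review.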

Strategy 3: Split huge files into smaller files (contrib, xfunc) and/or 
move some sections to the calling file. From the perspective of a git 
user, or of someone who translates the documentation into another 
language, this is no fun, but I hope that it will be accepted.


PS_1: When testing, don't forget the make target 'errcodes-table.sgml'.
PS_2: The remaining files version.sgml, filelist.sgml and 
ref/allfiles.sgml, which contain nothing but entity definitions, will 
possibly change or become superfluous with the migration to DocBook 5.x.

Kind regards
Jürgen Purtz

