Discussion: MacPorts xsltproc is very slow?
Hi,

Does anyone know why I'd see this difference in "make docs" performance?

1. On macOS using Apple's /usr/bin/xsltproc (--version says libxml 20902, libxslt 10128 and libexslt 817), it builds in a few minutes but produces warnings like this:

postgres.sgml:408: element biblioentry: validity error : ID ston89b already defined
postgres.sgml:426: element biblioentry: validity error : ID ston90a already defined
postgres.sgml:454: element biblioentry: validity error : ID ston90b already defined

2. On macOS using MacPorts' /opt/local/bin/xsltproc (--version says libxml 20907, libxslt 10132 and libexslt 820), the xsltproc step runs for *half an hour* on my laptop. I have no idea what it's doing. With dtruss I see a bunch of madvise(MADV_CAN_REUSE) half of which fail with EINVAL, and a bunch of stat64("/opt/local/lib/libxslt-plugins/nwalsh_com_xslt_ext_com_nwalsh_saxon_UnwrapLinks.so\0"). Applying Alexander Law's patch[1] makes most of the latter go away but doesn't fix the run time problem.

I noticed that the Apple version is using libxslt 1.1.28 (for context, that's the same as Debian Jessie used; Stretch/Buster/Sid are on 1.1.29 -- I'm guessing many of you are using that?), whereas MacPorts is shipping libxslt 1.1.32. I know next to nothing about these tools but I wonder if something we're doing gets horribly slow in future libxslt versions that will come down the pipeline on other distributions. Or if the MacPorts port is just borked somehow.

For now I've uninstalled it and am ignoring the warnings from the Apple version (my other car is a FreeBSD where I can't build the docs at all since commit d6376245 because it's stuck with DocBooks 1.76.1).

Any clues would be gratefully received.

[1] https://www.postgresql.org/message-id/bfce8c4e-e200-9617-791a-4e05a054e698%40gmail.com

--
Thomas Munro
http://www.enterprisedb.com
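The message above reads the integers printed by "xsltproc --version" as dotted releases (10128 as libxslt 1.1.28, 10132 as 1.1.32). A minimal Python sketch of that decoding follows, assuming the usual major*10000 + minor*100 + patch packing; the function names are hypothetical and only the first line of the --version output is parsed.

```python
import re


def decode_version(number):
    """Turn a packed version integer such as 10132 into '1.1.32'.

    Assumes the value is major*10000 + minor*100 + patch, which is
    consistent with the 10128 -> 1.1.28 reading in the thread.
    """
    major, rest = divmod(number, 10000)
    minor, patch = divmod(rest, 100)
    return f"{major}.{minor}.{patch}"


def parse_xsltproc_version(first_line):
    """Extract library versions from the first line of `xsltproc --version`."""
    found = re.findall(r"(libxml|libxslt|libexslt) (\d+)", first_line)
    return {name: decode_version(int(num)) for name, num in found}


# The MacPorts line quoted in the message above.
MACPORTS_LINE = "Using libxml 20907, libxslt 10132 and libexslt 820"
print(parse_xsltproc_version(MACPORTS_LINE))
```

Comparing the decoded versions of /usr/bin/xsltproc and /opt/local/bin/xsltproc this way makes it easier to see at a glance which build is the newer one.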
Hello Thomas,
25.11.2017 06:38, Thomas Munro wrote:
> Hi,
>
> Does anyone know why I'd see this difference in "make docs" performance?
Can you show the output of
make XSLTPROCFLAGS="--profile" -C doc/src/sgml/html
or
make XSLTPROCFLAGS="--profile" docs
?
------
Alexander Lakhin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Thomas Munro <thomas.munro@enterprisedb.com> writes:
> Does anyone know why I'd see this difference in "make docs" performance?
> 1. On macOS using Apple's /usr/bin/xsltproc (--version says libxml
> 20902, libxslt 10128 and libexslt 817), it builds in a few minutes but
> produces warnings like this:
> postgres.sgml:408: element biblioentry: validity error : ID ston89b
> already defined
> postgres.sgml:426: element biblioentry: validity error : ID ston90a
> already defined
> postgres.sgml:454: element biblioentry: validity error : ID ston90b
> already defined

FWIW, I see no such warnings using the system xsltproc on either Sierra or High Sierra. Both of them show

$ xsltproc --version
Using libxml 20904, libxslt 10129 and libexslt 817
xsltproc was compiled against libxml 20904, libxslt 10129 and libexslt 817
libxslt 10129 was compiled against libxml 20904
libexslt 817 was compiled against libxml 20904

and build the HTML docs in about a minute and a half.

> I noticed that the Apple version is using libxslt 1.1.28 (for context,
> that's the same as Debian Jessie used; Stretch/Buster/Sid are on
> 1.1.29 -- I'm guessing many of you are using that?), whereas MacPorts
> is shipping libxslt 1.1.32. I know next to nothing about these tools
> but I wonder if something we're doing gets horribly slow in future
> libxslt versions that will come down the pipeline on other
> distributions. Or if the MacPorts port is just borked somehow.

This seems vaguely reminiscent of this recent discussion:

https://www.postgresql.org/message-id/flat/8F33D0C7-E12E-4996-990C-3CF0C5ED0437%40filmlance.se

in which the conclusion seemed to be that Apple's build of zlib was considerably faster than another build (of uncertain provenance, mind you, so maybe that is unrelated). I wonder whether Apple are using more aggressive optimization flags than other people. OTOH, while it would not surprise me if Apple put some work into making zlib go fast, it seems less likely that they'd expend effort or risk on xsltproc.

regards, tom lane
On Sat, Nov 25, 2017 at 4:57 PM, Alexander Lakhin <exclusion@gmail.com> wrote:
> Can you show the output of
> make XSLTPROCFLAGS="--profile" -C doc/src/sgml/html
> or
> make XSLTPROCFLAGS="--profile" docs
> ?

Hi Alexander, please see attached.

--
Thomas Munro
http://www.enterprisedb.com
Attachments
I wrote:
> ... I wonder whether Apple are using more
> aggressive optimization flags than other people. OTOH, while it would
> not surprise me if Apple put some work into making zlib go fast, it
> seems less likely that they'd expend effort or risk on xsltproc.

I checked Fedora 26 and found that it's using an identical source version of xsltproc:

$ xsltproc --version
Using libxml 20904, libxslt 10129 and libexslt 817
xsltproc was compiled against libxml 20904, libxslt 10129 and libexslt 817
libxslt 10129 was compiled against libxml 20904
libexslt 817 was compiled against libxml 20904

Its time to build the HTML docs ... 46 seconds. This is on a machine intermediate in speed between the Sierra and High Sierra machines whose timings I quoted before.

So we can definitively discard the idea that Apple sprinkled any magic pixie dust on their xsltproc build. Your MacPorts build is the outlier, leaving us with the theories that it was built at -O0 or there's a performance bug in the newer source releases. I have no doubt that xmlsoft.org would be interested if you can narrow it down to the latter.

regards, tom lane
25.11.2017 07:49, Thomas Munro wrote:
> On Sat, Nov 25, 2017 at 4:57 PM, Alexander Lakhin
> wrote:
>> Can you show the output of make XSLTPROCFLAGS="--profile" -C
>> doc/src/sgml/html or make XSLTPROCFLAGS="--profile" docs ?
> Hi Alexander, please see attached.
Thanks! Just to compare your results with mine:

number  match       name                mode          Calls   Tot 100us  Avg
0       d:appendix                      label.markup  22503   70018988   3111
1                   chunk-all-sections                1394    36361616   26084
2       d:chapter                       label.markup  24918   20792834   834
3                   href.target                       23895   18556477   776
4                   footer.navigation                 1394    5276429    3785
5                   header.navigation                 1394    5095535    3655
6                   gentext.template                  237071  857659     3

vs

number  match       name                mode          Calls   Tot 100us  Avg
0                   gentext.template                  247659  742878     2
1                   chunk-all-sections                1394    705195     505
2                   href.target                       35446   663090     18
I wonder, what version of docbook-xsl are you using?
(I have 1.79.1+dfsg-1).
Can you check with 1.79+ (if yours is older)?
------
Alexander Lakhin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
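The two --profile excerpts compared above can also be lined up programmatically. The following is a rough Python sketch, not part of the thread: it assumes each data row ends with the Calls, Tot 100us and Avg columns and treats everything between the row number and those three fields as the template identifier. The helper names are hypothetical, and the sample rows are copied from the message above.

```python
def parse_profile_rows(text):
    """Parse rows of `xsltproc --profile` output into a dict keyed by template.

    Assumption: the last three whitespace-separated fields of each data row
    are Calls, Tot 100us and Avg; the fields after the row number form the
    template identifier (match pattern, name and/or mode).
    """
    rows = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # skip the header line and anything that is not a data row
        calls, total, avg = (int(f) for f in fields[-3:])
        rows[" ".join(fields[1:-3])] = {"calls": calls, "total": total, "avg": avg}
    return rows


def slowdown(slow, fast):
    """Ratio of total time (slow / fast) for templates present in both profiles."""
    return {name: slow[name]["total"] / fast[name]["total"]
            for name in slow.keys() & fast.keys()}


SLOW = """
0 d:appendix label.markup 22503 70018988 3111
1 chunk-all-sections 1394 36361616 26084
2 d:chapter label.markup 24918 20792834 834
3 href.target 23895 18556477 776
4 footer.navigation 1394 5276429 3785
5 header.navigation 1394 5095535 3655
6 gentext.template 237071 857659 3
"""

FAST = """
0 gentext.template 247659 742878 2
1 chunk-all-sections 1394 705195 505
2 href.target 35446 663090 18
"""

RATIOS = slowdown(parse_profile_rows(SLOW), parse_profile_rows(FAST))
for name, ratio in sorted(RATIOS.items(), key=lambda item: -item[1]):
    print(f"{name:20s} {ratio:6.1f}x")
```

Templates that appear in only one profile (here the "d:"-prefixed ones) are left out of the ratio, which is itself a hint that the two systems are running different stylesheets.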
On Sat, Nov 25, 2017 at 7:18 PM, Alexander Lakhin <exclusion@gmail.com> wrote:
> Thanks! Just to compare your results with my:
>
> number  match       name                mode          Calls   Tot 100us  Avg
> 0       d:appendix                      label.markup  22503   70018988   3111
> 1                   chunk-all-sections                1394    36361616   26084
> 2       d:chapter                       label.markup  24918   20792834   834
> 3                   href.target                       23895   18556477   776
> 4                   footer.navigation                 1394    5276429    3785
> 5                   header.navigation                 1394    5095535    3655
> 6                   gentext.template                  237071  857659     3
>
> vs
>
> number  match       name                mode          Calls   Tot 100us  Avg
> 0                   gentext.template                  247659  742878     2
> 1                   chunk-all-sections                1394    705195     505
> 2                   href.target                       35446   663090     18

Hmm. Well, this is all new to me but I'd have expected the numbers in the "Calls" column to be entirely deterministic. Perhaps that business about conditional use of UnwrapLinks and other things like it change the numbers. It's interesting that "gentext.template" is in the same ballpark on our two systems in terms of calls and CPU time, but the top templates are massive outliers on my system. I have no idea what I'm even looking at really but I couldn't help noticing that templates with match="chapter" and match="appendix" appear in our tree in sgml/stylesheet-speedup-common.xsl with a comment "Performance-optimized versions of some upstream templates from common/ directory". Could it be that whatever performance-enhancing trick they perform doesn't work on 1.1.32, or alternatively they are not being reached so we're falling back to non-optimised versions instead of these?

> I wonder, what version of docbook-xsl are you using?
> (I have 1.79.1+dfsg-1).
> Can you check with 1.79+ (if yours is older)?

docbook-xsl version 1.79.2_1.

--
Thomas Munro
http://www.enterprisedb.com
25.11.2017 11:03, Thomas Munro wrote:
>
> Hmm. Well, this is all new to me but I'd have expected the numbers in
> the "Calls" column to be entirely deterministic.
I think the number of calls depends on the XSL templates, and it seems we
have different templates.
(I couldn't find 'd:appendix' in my docbook-xsl installation, that's why
I asked about the version number.)
Maybe it's a different case then, since your version is newer.
Now I see that 'd:appendix' appears in
https://github.com/docbook/xslt10-stylesheets/blob/master/xsl/html/chunktoc.xsl
> Perhaps that
> business about conditional use of UnwrapLinks and other things like it
> change the numbers. It's interesting that "gentext.template" is in
> the same ballpark on our two systems in terms of calls and CPU time,
> but the top templates are massive outliers on my system. I have no
> idea what I'm even looking at really but I couldn't help noticing that
> templates with match="chapter" and match="appendix" appear in our tree
> in sgml/stylesheet-speedup-common.xsl with a comment
> "Performance-optimized versions of some upstream templates from
> common/ directory". Could it be that whatever performance-enhancing
> trick they perform doesn't work on 1.1.32, or alternatively they are
> not being reached so we're falling back to non-optimised versions
> instead of these?
>
>> I wonder, what version of docbook-xsl are you using?
>> (I have 1.79.1+dfsg-1).
>> Can you check with 1.79+ (if yours is older)?
> docbook-xsl version 1.79.2_1.
I'll try to install 1.79.2 version and check the performance on my side.
------
Alexander Lakhin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
25.11.2017 11:21, Alexander Lakhin wrote:
>>> I wonder, what version of docbook-xsl are you using?
>>> (I have 1.79.1+dfsg-1).
>>> Can you check with 1.79+ (if yours is older)?
>> docbook-xsl version 1.79.2_1.
> I'll try to install 1.79.2 version and check the performance on my side.
I installed docbook-style-xsl-1.79.2-5 in Fedora 27 and didn't notice
a performance drop.
In fact, in this version I see XSL templates without namespaces
("appendix" instead of "d:appendix").
I looked at the spec of the package and found that it's built from
"https://github.com/docbook/xslt10-stylesheets/releases/download/release%2F{%version}/docbook-xsl-nons-%{version}.tar.bz2"
("nons" stands for "no namespace").
Indeed, you can see both variations of the source packages at:
https://github.com/docbook/xslt10-stylesheets/releases/tag/release%2F1.79.2
It seems that your package is built from the "ns" version.
(I couldn't find docbook-xsl-nons in MacPorts.)
If it's the only available version for Mac, it seems we need to adjust
our XSL templates to work with namespaces too.
Best regards,
------
Alexander Lakhin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Thomas Munro <thomas.munro@enterprisedb.com> writes:
> ... I couldn't help noticing that
> templates with match="chapter" and match="appendix" appear in our tree
> in sgml/stylesheet-speedup-common.xsl with a comment
> "Performance-optimized versions of some upstream templates from
> common/ directory". Could it be that whatever performance-enhancing
> trick they perform doesn't work on 1.1.32, or alternatively they are
> not being reached so we're falling back to non-optimised versions
> instead of these?

If you're suspicious of that, you could try removing those parts of stylesheet-speedup-common.xsl and see what happens ...

regards, tom lane
On Sun, Nov 26, 2017 at 4:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
>> ... I couldn't help noticing that
>> templates with match="chapter" and match="appendix" appear in our tree
>> in sgml/stylesheet-speedup-common.xsl with a comment
>> "Performance-optimized versions of some upstream templates from
>> common/ directory". Could it be that whatever performance-enhancing
>> trick they perform doesn't work on 1.1.32, or alternatively they are
>> not being reached so we're falling back to non-optimised versions
>> instead of these?
>
> If you're suspicious of that, you could try removing those parts of
> stylesheet-speedup-common.xsl and see what happens ...

Good idea. Removing them didn't help (though removing them makes Apple xsltproc similarly slow). Adding the "d:" namespace to "chapter" and "appendix", declared in the xsl:stylesheet element as xmlns:d="http://docbook.org/ns/docbook", didn't help either. I suspect that could be made to work with some more tweaking, but I lack the XSL knowledge. I found another way forward though:

On Sun, Nov 26, 2017 at 4:09 AM, Alexander Lakhin <exclusion@gmail.com> wrote:
> It seems that your package is built from "ns" version.
> (I couldn't find docbook-xsl-nons in Macports.)
> If it's the only available version for Mac, it seems we need to adjust our
> XSL templates to work with namespaces too.

Aha, you're right! MacPorts does actually have two different ports (packages): docbook-xsl and docbook-xsl-ns, and the first one should be the no-namespace variant. But I can clearly see that the docbook-xsl package installs stylesheets *with* namespaces. I compared this with a Debian system and found that common/labels.xsl (the file that defines the templates that our stylesheet-speedup-common.xsl seems to want to replace) has the "d:" prefix on "chapter" and "appendix", but doesn't on the Debian system. Presumably this interferes with the interposing technique. Perhaps that is a packaging error that should be reported upstream.

That got me wondering... why does the Apple xsltproc in /usr/bin work then? Where is it even getting docbook-xsl from? I ran it with --profile and saw http://docbook.sourceforge.net instead of file:// URLs, and I could see outgoing connections with netstat. I believe that's because it doesn't find it locally in /etc/xml/catalog, whereas the MacPorts xsltproc looks in /opt/local/etc/xml/catalog where it has been listed by the docbook-xsl package.

So one solution is simply to uninstall the docbook-xsl package. That gets me back to fast documentation builds! Incidentally, uninstalling the docbook-xsl package also works for FreeBSD which currently ships a too-old DocBook version. I believed until now that it couldn't build the PostgreSQL docs, so I'm very happy to discover that it can, but (1) it needs the network (2) it's using HTTP instead of HTTPS so Alice could mess with Bob's documentation.

Thanks both for your help figuring this out. That's quite enough XML for one day.

--
Thomas Munro
http://www.enterprisedb.com
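The comparison of common/labels.xsl on the two systems described above can be automated. Below is a minimal Python sketch, assuming that a "namespaced" stylesheet is one whose template match patterns use a prefix bound to http://docbook.org/ns/docbook (the URI quoted in the message); the function names and the two inline sample stylesheets are hypothetical stand-ins, not copies of the real labels.xsl.

```python
import io
import xml.etree.ElementTree as ET

DOCBOOK_NS = "http://docbook.org/ns/docbook"
XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def docbook_prefixes(xsl_bytes):
    """Return the prefixes that the stylesheet binds to the DocBook namespace."""
    prefixes = set()
    for _event, (prefix, uri) in ET.iterparse(io.BytesIO(xsl_bytes),
                                              events=("start-ns",)):
        if uri == DOCBOOK_NS:
            prefixes.add(prefix)
    return prefixes


def namespaced_matches(xsl_bytes):
    """List template match patterns that use a DocBook-namespace prefix."""
    prefixes = docbook_prefixes(xsl_bytes)
    root = ET.fromstring(xsl_bytes)
    hits = []
    for template in root.iter("{%s}template" % XSL_NS):
        pattern = template.get("match", "")
        if any((p + ":") in pattern for p in prefixes if p):
            hits.append(pattern)
    return hits


# Hypothetical miniature stylesheets standing in for the two labels.xsl variants.
NS_SAMPLE = (b'<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" '
             b'xmlns:d="http://docbook.org/ns/docbook" version="1.0">'
             b'<xsl:template match="d:chapter" mode="label.markup"/>'
             b'</xsl:stylesheet>')
NONS_SAMPLE = (b'<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" '
               b'version="1.0">'
               b'<xsl:template match="chapter" mode="label.markup"/>'
               b'</xsl:stylesheet>')

print(namespaced_matches(NS_SAMPLE))    # patterns carrying the "d:" prefix
print(namespaced_matches(NONS_SAMPLE))  # empty list for the no-namespace variant
```

To check an installed package, one would read the bytes of its common/labels.xsl and pass them to namespaced_matches; a non-empty result suggests the "ns" variant.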
On 11/26/17 17:03, Thomas Munro wrote:
> So one solution is simply to uninstall the docbook-xsl package. That
> gets me back to fast documentation builds! Incidentally, uninstalling
> the docbooks-xsl package also works for FreeBSD which currently ships
> a too-old DocBook version.

This is actually documented: https://www.postgresql.org/docs/devel/static/docguide-toolsets.html

> I believed until now that it couldn't
> build the PostgreSQL docs, so I'm very happy to discover that it can,
> but (1) it needs the network (2) it's using HTTP instead of HTTPS so
> Alice could mess with Bob's documentation.

Good point. I have filed a bug about this.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Nov 26, 2017 at 5:03 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> That got me wondering... why does the Apple xsltproc in /usr/bin work
> then? Where is it even getting docbook-xsl from? I ran it with
> --profile and saw http://docbook.sourceforge.net instead of file:// URLs,
> and I could see outgoing connections with netstat. I believe that's
> because it doesn't find it locally in /etc/xml/catalog, whereas the
> MacPorts xsltproc looks in /opt/local/etc/xml/catalog where it has been
> listed by the docbook-xsl package.
>
> So one solution is simply to uninstall the docbook-xsl package. That
> gets me back to fast documentation builds!

For me, the documentation build fails without docbook-xsl. I wonder why it works for you.

It also fails for me if I follow the instructions in the documentation:

If you use MacPorts, the following will get you set up:
<programlisting>
sudo port install docbook-xml-4.2 docbook-xsl fop
</programlisting>

I have to also 'port install libxslt'. Otherwise, /usr/bin/xsltproc is used, and it won't use files installed by MacPorts packages.

And then it takes, like you reported originally, an insanely long time.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, Mar 3, 2018 at 8:51 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Nov 26, 2017 at 5:03 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> So one solution is simply to uninstall the docbook-xsl package. That
>> gets me back to fast documentation builds!
>
> For me, the documentation build fails without docbook-xsl. I wonder
> why it works for you.

Recently I've been unable to build the documentation intermittently, and I think it's because sourceforge.net has become flakey. Let me try right now...

$ make docs XSLTPROC=/usr/bin/xsltproc
...blah blah blah...
warning: failed to load external entity "http://docbook.sourceforge.net/release/xsl/current/xhtml/chunk.xsl"
...blah blah blah...

So, yeah, it looks like we might need a local docbooks installation currently.

> It also fails for me if I follow the instructions in the documentation:
>
> If you use MacPorts, the following will get you set up:
> <programlisting>
> sudo port install docbook-xml-4.2 docbook-xsl fop
> </programlisting>
>
> I have to also 'port install libxslt'. Otherwise, /usr/bin/xsltproc
> is used, and it won't use files installed by MacPorts packages.
>
> And then it takes, like you reported originally, an insanely long time.

I think we should complain to the MacPorts packager about the namespace vs non-namespace stuff being possibly confused in the packages.

--
Thomas Munro
http://www.enterprisedb.com
On Sat, Mar 3, 2018 at 9:34 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> I think we should complain to the MacPorts packager about the
> namespace vs non-namespace stuff being possibly confused in the
> packages.

https://trac.macports.org/ticket/55946

--
Thomas Munro
http://www.enterprisedb.com
On Sat, Mar 03, 2018 at 09:34:58AM +1300, Thomas Munro wrote:
> Recently I've been unable to build the documentation intermittently,
> and I think it's because sourceforge.net has become flakey. Let me
> try right now...
>
> $ make docs XSLTPROC=/usr/bin/xsltproc
> ...blah blah blah...
> warning: failed to load external entity
> "http://docbook.sourceforge.net/release/xsl/current/xhtml/chunk.xsl"
> ...blah blah blah...

I have been seeing that lately as well on my Debian Linux box. The build failure consists of a mountain of warnings and errors, and you need to dig up to the top to see the real problem. That's annoying :(

--
Michael
Attachments
On Fri, Mar 2, 2018 at 3:34 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> I think we should complain to the MacPorts packager about the
> namespace vs non-namespace stuff being possibly confused in the
> packages.

Any ideas about a workaround for the meantime? Having to wait half an hour for the documentation to build is pretty annoying.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Mar 23, 2018 at 3:27 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Mar 2, 2018 at 3:34 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> I think we should complain to the MacPorts packager about the
>> namespace vs non-namespace stuff being possibly confused in the
>> packages.
>
> Any ideas about a workaround for the meantime? Having to wait half an
> hour for the documentation to build is pretty annoying.

This worked for me:

1. Steal a working installation of docbook-xsl from some other system. Or clone the repo from github.com/docbook and figure out how to build it so that there is an xhtml directory (that was beyond my attention span). I just did this on a Debian system:

tar czvf docbook-xsl.tgz -C /usr/share/xml/docbook/stylesheet/ docbook-xsl

I could give you that file off-list if you want, but it's 2.2MB so I won't post it here. You could also grab the Debian (or other) package file and unpack that. I created a directory ~/docbook-stuff/ and then unpacked that tarball there.

2. Tell xsltproc to rewrite any references to sourceforge.net to use this local stuff instead. Either you can create/edit the system-wide /etc/xml/catalog file (or equivalent under /opt if you're using MacPorts tools), or you can create a new file somewhere and point to it with the environment variable XML_CATALOG_FILES. I went with the env variable approach because I didn't want to mess with system-wide configuration. So I created a file ~/docbook-stuff/catalog.xml with this in it:

<?xml version="1.0" encoding="utf-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- redirect references to docbook.sourceforge.net to local files -->
  <rewriteURI uriStartString="http://docbook.sourceforge.net/release/xsl/current/"
              rewritePrefix="file:///Users/munro/docbook-stuff/docbook-xsl/"/>
  <rewriteSystem systemIdStartString="http://docbook.sourceforge.net/release/xsl/current/"
                 rewritePrefix="file:///Users/munro/docbook-stuff/docbook-xsl/"/>
</catalog>

Obviously that needs to be adjusted to point to wherever you unpacked that tarball.

I did wonder about simply changing our documentation's sourceforge.net references to point to docbook.org instead, apparently the new home (?) of this stuff:

-<xsl:import href="http://docbook.sourceforge.net/release/xsl/current/xhtml/chunk.xsl"/>
+<xsl:import href="https://cdn.docbook.org/release/xsl-nons/current/xhtml/chunk.xsl"/>

(plus similar changes elsewhere). But it didn't seem to be able to fetch stylesheets that way. Perhaps my tools don't like speaking HTTPS. I didn't try to dig any further because I'd rather go and hack on C code today.

--
Thomas Munro
http://www.enterprisedb.com
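Since the catalog above has to be adjusted for each unpack location, it can be generated rather than edited by hand. The following Python sketch is only an illustration of step 2: build_catalog is a hypothetical helper, it assumes an absolute POSIX path, and it reproduces the two rewrite rules from the message for whatever directory is passed in.

```python
from urllib.parse import quote
from xml.sax.saxutils import quoteattr

# URL prefix that the stylesheet imports use, as quoted in the thread.
SOURCEFORGE_PREFIX = "http://docbook.sourceforge.net/release/xsl/current/"


def build_catalog(stylesheet_dir):
    """Return XML catalog text redirecting sourceforge.net URLs to a local copy.

    stylesheet_dir must be an absolute POSIX path to the unpacked docbook-xsl
    directory (hypothetical helper, not part of any PostgreSQL tooling).
    """
    if not stylesheet_dir.startswith("/"):
        raise ValueError("stylesheet_dir must be an absolute path")
    local_prefix = "file://" + quote(stylesheet_dir.rstrip("/")) + "/"
    return (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        '<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">\n'
        "  <!-- redirect references to docbook.sourceforge.net to local files -->\n"
        f"  <rewriteURI uriStartString={quoteattr(SOURCEFORGE_PREFIX)}\n"
        f"              rewritePrefix={quoteattr(local_prefix)}/>\n"
        f"  <rewriteSystem systemIdStartString={quoteattr(SOURCEFORGE_PREFIX)}\n"
        f"                 rewritePrefix={quoteattr(local_prefix)}/>\n"
        "</catalog>\n"
    )


# Example using the location mentioned in the message above.
CATALOG_TEXT = build_catalog("/Users/munro/docbook-stuff/docbook-xsl")
print(CATALOG_TEXT)
```

The printed text would be saved as catalog.xml and referenced through the XML_CATALOG_FILES environment variable, as described in the message; no system-wide file needs to change.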
On 3/22/18 18:56, Thomas Munro wrote:
> I did wonder about simply changing our documentation's sourceforge.net
> references to point to docbook.org instead, apparently the new home
> (?) of this stuff:
>
> -<xsl:import href="http://docbook.sourceforge.net/release/xsl/current/xhtml/chunk.xsl"/>
> +<xsl:import href="https://cdn.docbook.org/release/xsl-nons/current/xhtml/chunk.xsl"/>

That would affect those who are currently relying on catalog entries redirecting the old URLs to their local copy. I think we will at least need to wait for a new release with new catalogs before doing this.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Mar 3, 2018 at 3:53 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> On Sat, Mar 3, 2018 at 9:34 AM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> I think we should complain to the MacPorts packager about the
>> namespace vs non-namespace stuff being possibly confused in the
>> packages.
>
> https://trac.macports.org/ticket/55946

This was fixed yesterday. After "sudo port update" and "sudo port upgrade outdated" I see "docbook-xsl @1.79.2_3" being installed and then I get:

$ time make docs
...
real 1m47.777s
user 1m43.627s
sys 0m2.958s

--
Thomas Munro
http://www.enterprisedb.com