Re: Doc: typo in config.sgml
From | Peter Eisentraut |
---|---|
Subject | Re: Doc: typo in config.sgml |
Date | |
Msg-id | 7491a14c-2215-46f0-87fe-ce30ae9eb4f6@eisentraut.org |
In response to | Re: Doc: typo in config.sgml (Bruce Momjian <bruce@momjian.us>) |
Responses | Re: Doc: typo in config.sgml |
List | pgsql-hackers |
On 02.11.24 14:18, Bruce Momjian wrote:
> On Sat, Nov 2, 2024 at 12:02:12PM +0900, Tatsuo Ishii wrote:
>>> Yes, we _allow_ LATIN1 characters in the SGML docs, but I replaced the
>>> LATIN1 characters we had with HTML entities, so there are none
>>> currently.
>>>
>>> I think it is too easy for non-Latin1 UTF8 to creep into our SGML docs
>>> so I added a cron job on my server to alert me when non-ASCII characters
>>> appear.
>>
>> So you convert LATIN1 characters to HTML entities so that it's easier
>> to detect non-LATIN1 characters in the SGML docs? If my
>> understanding is correct, it can be also achieved by using some tools
>> like:
>>
>> iconv -t ISO-8859-1 -f UTF-8 release-17.sgml
>>
>> If there are some non-LATIN1 characters in release-17.sgml,
>> it will complain like:
>>
>> iconv: illegal input sequence at position 175
>>
>> An advantage of this is, we don't need to convert each LATIN1
>> character to HTML entities and make the sgml file authors' life a
>> little bit easier.
>
> I might have misread the feedback. I know people didn't want a Makefile
> rule to prevent it, but I thought converting the few UTF8's we had was
> acceptable. Let me think some more and come up with a patch.

The question of encoding characters as entities is orthogonal to the issue of only allowing Unicode characters that have a mapping to Latin 1. This patch seems to confuse these two issues, and I don't think it actually fixed the second one, which is the one that was complained about. I don't think anyone actually complained about the first one, which is the one that was actually patched.

I think the iconv approach is an idea worth checking out.

It's also not necessarily true that the set of characters provided by the built-in PDF fonts is exactly the set of characters in Latin 1. It appears to be close enough, but I'm not sure, and I haven't found any authoritative information on that.

Another approach for a fix would be to get FOP to produce the required warnings or errors more reliably. I know it has a bunch of logging settings (ultimately via log4j), so there might be some possibilities.
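
For illustration only, the check that the quoted iconv command performs could be sketched in Python roughly as follows. This is a minimal sketch, not anything proposed in the thread: the function names `find_non_latin1` and `check_files` are hypothetical, and it assumes that "has a mapping to Latin 1" means a code point in the range U+0000 to U+00FF. It says nothing about which characters the built-in PDF fonts actually provide.

```python
def find_non_latin1(text):
    """Return (line, column, character) for each character without a Latin-1 mapping.

    Assumption: ISO-8859-1 covers exactly the code points U+0000..U+00FF.
    """
    problems = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 0xFF:
                problems.append((line_no, col_no, ch))
    return problems


def check_files(paths):
    """Print one diagnostic per offending character; return True if all files are clean."""
    clean = True
    for path in paths:
        # The SGML sources are assumed to be stored as UTF-8.
        with open(path, encoding="utf-8") as f:
            text = f.read()
        for line_no, col_no, ch in find_non_latin1(text):
            print(f"{path}:{line_no}:{col_no}: U+{ord(ch):04X} has no Latin-1 mapping")
            clean = False
    return clean
```

Unlike iconv, which stops at the first illegal sequence and reports a byte position, this sketch reports every offending character with its line and column, for example via `check_files(["release-17.sgml"])`.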