Re: GB18030-2022 Support in PostgreSQL
From | John Naylor |
---|---|
Subject | Re: GB18030-2022 Support in PostgreSQL |
Date | |
Msg-id | CANWCAZbBEUuby3pejOq6L0b3OhEa3B9XQz=EziYAYkNpOODsig@mail.gmail.com |
In response to | Re: GB18030-2022 Support in PostgreSQL (Chao Li <li.evan.chao@gmail.com>) |
Responses | Re: GB18030-2022 Support in PostgreSQL |
List | pgsql-hackers |
On Wed, Aug 13, 2025 at 3:08 PM Chao Li <li.evan.chao@gmail.com> wrote:
> Attached is the new patch. It downloads the UCM file in make:

> After regenerating the map files, there is no change found in the map files.

I can confirm, thanks.

When we split a patch into multiple patches, it's customary to include all of them, since that process may result in unwelcome artifacts to sort out. (When the first step has architectural questions or a change in behavior, we may treat it as independent, possibly with a separate thread, but that's not the case here.)

I do have some comments already, though:

-my $in_file = "gb-18030-2000.xml";
-
+my $in_file = "gb-18030-2000.ucm";

-while (<$in>)
-{
+while (<$in>) {

-# The lines we care about in the source file look like
+# The lines we care about in the source file look like:

These are spurious changes, which we try to avoid.

- next if (!m/<a u="([0-9A-F]+)" b="([0-9A-F ]+)"/);
+ if (/^<U([0-9A-Fa-f]+)>\s+((?:\\x[0-9A-Fa-f]{2})+)\s*\|(\d+)/) {

This change in style caused extra whitespace-only churn. That obscures what the actual changes are.

+ # Match lines like: <UXXXX> \xYY[\xYY...] |n, and use only (|0) mappings

This is missing an explanation of why we skip non-zero mappings. Code-wise, this only matters for the output in the follow-on patch for 2022, but one of these patches needs to include a brief explanation. I did not like the detailed description that was present in one of the earlier 2022 patches that told how many characters were flagged a certain way -- that's irrelevant detail and will likely get out of date in some future version anyway.

+# and n is a flag indicating the type of mapping having
+# a single value of 0.

This seems weird when combined with the logic to filter out non-zero mappings. We need to think about when and where to show relevant information.

+ next if ($flag ne '0'); # non-0 flags

This comment is just repeating what the code is doing, and it's very obvious what it's doing.

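
For reference, the filtering discussed above can be sketched roughly as follows. This is a Python illustration, not the Perl from the patch: the regex is adapted from the one quoted above, while the function name `parse_ucm_lines` and the sample lines are hypothetical. In ICU's UCM format, |0 marks a round-trip mapping and non-zero flags mark fallback or other one-way mappings; that is presumably the reason for skipping them, though the thread does not spell it out.

```python
import re

# Adapted from the regex quoted in the patch: <UXXXX> \xYY[\xYY...] |n
UCM_LINE = re.compile(
    r'^<U([0-9A-Fa-f]+)>\s+((?:\\x[0-9A-Fa-f]{2})+)\s*\|(\d+)'
)


def parse_ucm_lines(lines):
    """Return (code point, encoded bytes) pairs for |0 mappings only.

    Hypothetical helper for illustration; not part of the patch.
    """
    mappings = []
    for line in lines:
        m = UCM_LINE.match(line)
        if m is None:
            continue  # header, comment, or blank line
        ucs_hex, byte_escapes, flag = m.groups()
        if flag != '0':
            # Assumption: only round-trip (|0) entries belong in the map.
            continue
        encoded = bytes(
            int(h, 16)
            for h in re.findall(r'\\x([0-9A-Fa-f]{2})', byte_escapes)
        )
        mappings.append((int(ucs_hex, 16), encoded))
    return mappings


# Hypothetical sample lines, not copied from the real UCM file.
sample = [
    "# comment line",
    "<U0041> \\x41 |0",
    "<U0080> \\x81\\x30\\x81\\x30 |0",
    "<U1234> \\xA1\\xA1 |1",
]

for code_point, encoded in parse_ucm_lines(sample):
    print(f"U+{code_point:04X} -> {encoded.hex()}")
```

Only the two |0 lines survive; the |1 line is dropped without any special handling, which is the behavior a brief comment in the real script would need to justify.
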
BTW, it sounds like your proposed Makefile changes are needed for the follow-on patch with .map changes to work at all, is that right?

https://www.postgresql.org/message-id/1CA8625F-AA41-4ED2-B60F-E28AC71F37DC@highgo.com

--
John Naylor
Amazon Web Services