Re: GB18030-2022 Support in PostgreSQL
From | John Naylor |
---|---|
Subject | Re: GB18030-2022 Support in PostgreSQL |
Date | |
Msg-id | CANWCAZZ129LpH3Z+i1q+aE-X6fNNg0FYF1fRK0pd2AEpSM8hmw@mail.gmail.com |
In response to | Re: GB18030-2022 Support in PostgreSQL (John Naylor <johncnaylorls@gmail.com>) |
Responses |
Re: GB18030-2022 Support in PostgreSQL
Re: GB18030-2022 Support in PostgreSQL |
List | pgsql-hackers |
On Tue, Aug 12, 2025 at 9:09 AM Chao Li <li.evan.chao@gmail.com> wrote:

[bringing this back to the original thread]

> So, I compared 2000 ucm with 2005 ucm also compared 2005 ucm with 2022 ucm. Then I found that some changed in 2005 is reverted in 2022, that why diff between 2000 and 2022 is small. For example, the following mappings

Yes, this was mentioned in the "disruptive changes" document linked in my first email in this thread:

"The 2005 edition included 6 characters with double mappings. The 2022 edition removes the double mappings. The 2005 edition included 9 characters from the CJK Compatibility Ideographs block. In Unicode/10646, these all have canonical decomposition mappings to characters in the URO. In the 2022 edition, these nine compatibility characters are removed."

> So, for how to create patch 2, I think we have 3 options:
>
> 1. As planned, update to the latest version of 2000 ucm, then skip 2005 and directly upgrade to 2022 in patch 3. This way, we just honor 2000 ucm regardless that the change is actually introduced by 2005.
>
> 2. Skip the latest version of 2000 ucm and upgrade to 2005 ucm. This way will clearly show the upgrade path 2000->2005->2022. Downside is that 2005 introduced some changes that are reverted in 2022, which will cause some unnecessary changes in map files.
>
> 3. Skip patch 2, directly go to patch 3. So that, patch 3 will include changes introduced by both 2005 and 2022. This way makes minimum changes to map files.

#3 is what I had in mind to begin with unless we found some reason not to. Minimizing churn is a lucky side effect that reinforces that choice.

Before getting to that, I thought I'd bring this up to the community:

    +# Copyright (C) 2000-2009, International Business Machines Corporation and others.
    +# All Rights Reserved.

The previous XML file didn't contain a copyright notice -- does anyone want to make a case for not checking unicode-org's source file into our tree because of this?

The 2022 update changes it to

    # Copyright (C) 2016 and later: Unicode, Inc. and others.
    # License & terms of use: http://www.unicode.org/copyright.html
    # Copyright (C) 2000-2012, International Business Machines Corporation and others.
    # All Rights Reserved.

...and the above links to https://www.unicode.org/license.txt

--
John Naylor
Amazon Web Services