Re: Built-in CTYPE provider

From Peter Eisentraut
Subject Re: Built-in CTYPE provider
Date
Msg-id 67df0672-5bc0-4b2b-b9e0-00e12bdca601@eisentraut.org
In response to Re: Built-in CTYPE provider  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Built-in CTYPE provider
Re: Built-in CTYPE provider
List pgsql-hackers
On 12.01.24 03:02, Jeff Davis wrote:
> New version attached. Changes:
> 
>   * Named collation object PG_C_UTF8, which seems like a good idea to
> prevent name conflicts with existing collations. The locale name is
> still C.UTF-8, which still makes sense to me because it matches the
> behavior of the libc locale of the same name so closely.

I am catching up on this thread.  The discussions have been very 
complicated, so maybe I didn't get it all.

The patches look pretty sound, but I'm questioning how useful this 
feature is and where you plan to take it.

Earlier in the thread, the aim was summarized as

 > If the Postgres default was bytewise sorting+locale-agnostic
 > ctype functions directly derived from Unicode data files,
 > as opposed to libc/$LANG at initdb time, the main
 > annoyance would be that "ORDER BY textcol" would no
 > longer be the human-favored sort.

I think that would be a terrible direction to take, because it would
regress the default sort order from "correct" to "useless", quite apart
from the overall message this would send about how much PostgreSQL cares
about locales, Unicode, and the like.
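
To illustrate the difference at stake, here is a minimal Python sketch that sorts the same words once by their UTF-8 bytes and once with a rough stand-in for a human-favored order. It does not involve PostgreSQL, libc, or ICU. The helper names `bytewise_key` and `rough_human_key` are hypothetical, and the second key (strip accents, fold case) is only a crude approximation, not the Unicode Collation Algorithm.

```python
import unicodedata


def bytewise_key(s: str) -> bytes:
    # Compare strings by their raw UTF-8 encoding, as a bytewise sort would.
    return s.encode("utf-8")


def rough_human_key(s: str) -> str:
    # Crude approximation of a linguistic order (an assumption for this
    # sketch, NOT UCA): decompose, drop combining marks, then fold case.
    decomposed = unicodedata.normalize("NFKD", s)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


words = ["banana", "Zebra", "apple", "Äpfel"]

bytewise = sorted(words, key=bytewise_key)
rough_human = sorted(words, key=rough_human_key)

print(bytewise)     # ['Zebra', 'apple', 'banana', 'Äpfel']
print(rough_human)  # ['Äpfel', 'apple', 'banana', 'Zebra']
```

In the bytewise result, every uppercase ASCII letter sorts before every lowercase one, and any non-ASCII character sorts after all of ASCII, which is the kind of ordering the paragraph above objects to as a default.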

Maybe you don't intend for this to be the default provider?  But then 
who would really use it?  I mean, sure, some people would, but how would 
you even explain, in practice, the particular niche of users or use cases?

Maybe if this new provider were called "minimal", the name would
describe its purpose better.

I could see a use for this builtin provider if it also included the 
default UCA collation (what COLLATE UNICODE does now).  Then it would 
provide a "common" default behavior out of the box, and if you want more 
fine-tuning, you can go to ICU.  There would still be some questions 
about making sure the builtin behavior and the ICU behavior are 
consistent (different Unicode versions, stock UCA vs CLDR, etc.).  But 
for practical purposes, it might work.

There would still be a risk with that approach, since it would 
permanently marginalize ICU functionality, in the sense that only some 
locales would need ICU, and so we might not pay the same amount of 
attention to the ICU functionality.

I would be curious what your overall vision is here.  Is switching the
default to ICU still your goal?  Or do you want the builtin provider to 
be the default?  Or something else?
