Re: Move defaults toward ICU in 16?

Поиск
Список
Период
Сортировка
От Jeff Davis
Тема Re: Move defaults toward ICU in 16?
Дата
Msg-id 5ffeafe843d43a975f31479fbd211ab88404841b.camel@j-davis.com
обсуждение исходный текст
Ответ на Re: Move defaults toward ICU in 16?  (Andres Freund <andres@anarazel.de>)
Ответы Re: Move defaults toward ICU in 16?  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
On Wed, 2023-02-08 at 18:22 -0800, Andres Freund wrote:
> On 2023-02-08 12:16:46 -0800, Jeff Davis wrote:
> > On Thu, 2023-02-02 at 18:10 -0500, Tom Lane wrote:
> > > Yeah.  I would be resistant to making ICU a required dependency,
> > > but it doesn't seem unreasonable to start moving towards it being
> > > our default collation support.
> >
> > Patch attached.
>
> Unfortunately this fails widely on CI, with both compile time and
> runtime

New patches attached.

  0001: build defaults to requiring ICU
  0002: initdb defaults to using ICU (if server built with ICU)

One CI test is failing: "Windows - Server 2019, VS 2019 - Meson &
ninja"; if I apply Andres patch (
https://github.com/anarazel/postgres/commit/dde7c68 ), then it works.

I ran into one annoyance with pg_upgrade, which is that a v15 cluster
initialized with the defaults requires that the v16 cluster is
initialized with --locale-provider=libc, because otherwise the old and
new cluster will have mismatching template databases. Simple to fix
once you see the error, but I wonder how many initdb scripts might be
broken? I suppose it's just the cost of changing a default? Would an
environment variable help for cases where it's difficult to pass that
extra option down through a script?

I also considered posting another patch to change the default for
CREATE COLLATION, but there are a few issues I'm not sure about. Should
the default be based on whether ICU support is available? Or the
datlocprovider for the current database? And/or some kind of
compatibility GUC?

Notes on the tests I needed to fix, in case they are interesting or
point to some kind of larger problem:

 * ecpg has a test that involves setting the client_encoding to LATIN1
which required a compatible server encoding so it was setting
ENCODING=SQL_ASCII, which ICU doesn't support. The ecpg test did not
look particularly sensitive to the locale, so I changed it to use
client_encoding=SQL_ASCII instead, so that the server encoding doesn't
matter.
 * citext has a test involving Turkish characters, which works for all
libc locales, but in ICU the test only works in Turkish locales. I skip
the test if datlocprovider='i', because citext doesn't seem very
important in an ICU world.
 * unaccent is broken if the database provider is ICU and LC_CTYPE=C,
because the t_isspace() (etc.) functions do not properly handle ICU.
Probably some other things are broken with that combination, but only
this test seems to exercise it. I just skipped the test for that broken
combination, but perhaps it should be fixed in the future.
 * initdb was being built with ICU as a dependency in meson, but not
autoconf. I assume it's fine to link ICU into initdb, so I changed the
Makefile.
 * I changed a couple tests to initialize with --locale-provider=libc.
They were testing that creating a database with the ICU provider but no
ICU locale fails, and that's easiest to test if the template is libc.
 * The CI test CompilerWarnings:mingw_cross_warning was failing because
ICU is not available. I added --without-icu in the .cirrus.yml file and
it works.

--
Jeff Davis
PostgreSQL Contributor Team - AWS



Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrey Borodin
Дата:
Сообщение: UUID v7
Следующее
От: Andres Freund
Дата:
Сообщение: Re: UUID v7