Re: ICU for global collation
| От | Julien Rouhaud |
|---|---|
| Тема | Re: ICU for global collation |
| Дата | |
| Msg-id | 20220817165359.mcco6ufe2vchokqo@jrouhaud обсуждение исходный текст |
| Ответ на | Re: ICU for global collation (Marina Polyakova <m.polyakova@postgrespro.ru>) |
| Ответы |
Re: ICU for global collation
|
| Список | pgsql-hackers |
Hi,
On Mon, Aug 15, 2022 at 03:06:32PM +0300, Marina Polyakova wrote:
>
> 1.1) It looks like there's a bug in the function get_db_infos
> (src/bin/pg_upgrade/info.c), where the version of the old cluster is always
> checked:
>
> if (GET_MAJOR_VERSION(old_cluster.major_version) < 1500)
> snprintf(query + strlen(query), sizeof(query) - strlen(query),
> "'c' AS datlocprovider, NULL AS daticulocale, ");
> else
> snprintf(query + strlen(query), sizeof(query) - strlen(query),
> "datlocprovider, daticulocale, ");
>
> With the simple patch
>
> diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
> index df374ce4b362b4c6c87fc1fd0e476e5d6d353d9e..53ea348e211d3ac38334292bc16cb814bc13bb87
> 100644
> --- a/src/bin/pg_upgrade/info.c
> +++ b/src/bin/pg_upgrade/info.c
> @@ -319,7 +319,7 @@ get_db_infos(ClusterInfo *cluster)
>
> snprintf(query, sizeof(query),
> "SELECT d.oid, d.datname, d.encoding, d.datcollate, d.datctype, ");
> - if (GET_MAJOR_VERSION(old_cluster.major_version) < 1500)
> + if (GET_MAJOR_VERSION(cluster->major_version) < 1500)
> snprintf(query + strlen(query), sizeof(query) - strlen(query),
> "'c' AS datlocprovider, NULL AS daticulocale, ");
> else
>
> I got the expected error during the upgrade:
>
> locale providers for database "template1" do not match: old "libc", new
> "icu"
> Failure, exiting
Good catch. There's unfortunately not a lot of regression tests for
ICU-initialized clusters. I'm wondering if the build-farm client could be
taught about the locale provider rather than assuming libc. Clearly that
wouldn't have caught this issue, but it should still increase the coverage a
bit (I'm thinking of the recent problem with the abbreviated keys).
> 1.2) It looks like the mentioned asserion in dbcommands.c conflicts with the
> following lines earlier:
>
> if (dbiculocale == NULL)
> dbiculocale = src_iculocale;
>
> The following patch works for me:
>
> diff --git a/src/backend/commands/dbcommands.c
> b/src/backend/commands/dbcommands.c
> index b31a30550b025d48ba3cc250dc4c15f41f9a80be..17a2942341e528c01182fb6d4878580f2706bec9
> 100644
> --- a/src/backend/commands/dbcommands.c
> +++ b/src/backend/commands/dbcommands.c
> @@ -1048,6 +1048,8 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> errmsg("ICU locale must be specified")));
> }
> + else
> + dbiculocale = NULL;
>
> if (dblocprovider == COLLPROVIDER_ICU)
> check_icu_locale(dbiculocale);
I think it would be better to do that in the various variable initialization.
Maybe switch the dblocprovider and dbiculocale initialization, and only
initialize dbiculocale if dblocprovider is COLLPROVIDER_ICU.
> 2) CREATE DATABASE does not always require the icu locale unlike initdb and
> createdb:
>
> $ createdb mydb --locale en_US.UTF-8 --template template0 --locale-provider
> icu
> createdb: error: database creation failed: ERROR: ICU locale must be
> specified
>
> $ psql -c "CREATE DATABASE mydb LOCALE \"en_US.UTF-8\" TEMPLATE template0
> LOCALE_PROVIDER icu" postgres
> CREATE DATABASE
>
> I'm wondering if this is not a fully-supported feature (because createdb
> creates an SQL command with LC_COLLATE and LC_CTYPE options instead of
> LOCALE option) or is it a bug in CREATE DATABASE?.. From
> src/backend/commands/dbcommands.c:
>
> if (dblocprovider == COLLPROVIDER_ICU && !dbiculocale)
> {
> if (dlocale && dlocale->arg)
> dbiculocale = defGetString(dlocale);
> }
This discrepancy between createdb and CREATE DATABASE looks like an oversight,
as createdb indeed interprets --locale as:
if (locale)
{
if (lc_ctype)
pg_fatal("only one of --locale and --lc-ctype can be specified");
if (lc_collate)
pg_fatal("only one of --locale and --lc-collate can be specified");
lc_ctype = locale;
lc_collate = locale;
}
AFAIK the fallback in the CREATE DATABASE case is expected as POSIX locale
names should be accepted by icu, so this should work for createdb too.
В списке pgsql-hackers по дате отправления: