Обсуждение: BUG #17170: Invalid collation created with provider icu and initdb' locale C

Поиск
Список
Период
Сортировка

BUG #17170: Invalid collation created with provider icu and initdb' locale C

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      17170
Logged by:          Andrew Bille
Email address:      andrewbille@gmail.com
PostgreSQL version: 14beta3
Operating system:   centos-7
Description:

Hi!
When using postgres built "--with-icu" I get an invalid collation with the
following script:
initdb -D db --locale=C
createdb testdb
psql testdb -c "CREATE COLLATION test (provider=icu,
locale='en_US.utf8');"
psql testdb -c "SELECT 'a' < 'b' COLLATE test;"
ERROR:  collation "test" for encoding "SQL_ASCII" does not exist
LINE 1: SELECT 'a' < 'b' COLLATE test;
                         ^
or 
psql testdb -c "DROP COLLATION test;"
ERROR:  collation "test" for encoding "SQL_ASCII" does not exist

Reproduced on REL_10_STABLE..REL_14_STABLE.

Thanks!


Re: BUG #17170: Invalid collation created with provider icu and initdb' locale C

От
Tom Lane
Дата:
PG Bug reporting form <noreply@postgresql.org> writes:
> When using postgres built "--with-icu" I get an invalid collation with the
> following script:
> initdb -D db --locale=C
> createdb testdb
> psql testdb -c "CREATE COLLATION test (provider=icu,
> locale='en_US.utf8');"
> psql testdb -c "SELECT 'a' < 'b' COLLATE test;"
> ERROR:  collation "test" for encoding "SQL_ASCII" does not exist
> LINE 1: SELECT 'a' < 'b' COLLATE test;
>                          ^

That error doesn't seem inappropriate; ICU collations don't work
in SQL_ASCII encoding.

> psql testdb -c "DROP COLLATION test;"
> ERROR:  collation "test" for encoding "SQL_ASCII" does not exist

That, however, is a bit annoying.  If you can create the collation
you should be able to drop it.

The immediate issue is that DROP COLLATION uses the same lookup
rule as other code paths, so it fails to see inapplicable
pg_collation entries at all.  So one approach to a fix is to
relax its lookup rule.  But I wonder if we should have prevented
the CREATE in the first place, instead.

Unfortunately, either approach is likely to still have some
edge-case odd behaviors :-(.  This whole business of having the
database encoding be implicitly part of the lookup rule is just
prone to that.

            regards, tom lane



Re: BUG #17170: Invalid collation created with provider icu and initdb' locale C

От
Tom Lane
Дата:
I wrote:
> The immediate issue is that DROP COLLATION uses the same lookup
> rule as other code paths, so it fails to see inapplicable
> pg_collation entries at all.  So one approach to a fix is to
> relax its lookup rule.  But I wonder if we should have prevented
> the CREATE in the first place, instead.

After a bit of experimentation, I'm inclined to the idea that
we should have rejected the CREATE COLLATION in the first place.
It's not very useful to create a collation that you're not going
to be able to use, and I think it's more user-friendly to say so
up front than to leave the user guessing about why subsequent
references don't work.

There is arguably one use-case that this fix shuts off: manually
adding ICU-based collations to template0, in the expectation
that they could get cloned into other databases where they'd be
useful.  But that seems like a mighty thin argument, especially
since you could initdb with an ICU-compatible encoding if you
want to do that.

            regards, tom lane

diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index 5a2ba56cec..9b8a259eb2 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -215,7 +215,19 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
     if (!fromEl)
     {
         if (collprovider == COLLPROVIDER_ICU)
+        {
+            /*
+             * While an ICU collation can support various encodings, only
+             * allow one to be created when the current database's encoding is
+             * supported.  Otherwise we get surprising behaviors like not
+             * being able to drop the collation.
+             */
+            if (!is_encoding_supported_by_icu(GetDatabaseEncoding()))
+                ereport(ERROR,
+                        (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                         errmsg("current database's encoding is not supported with this provider")));
             collencoding = -1;
+        }
         else
         {
             collencoding = GetDatabaseEncoding();