Обсуждение: Q: error on updating collation version information
Dear all,
following an ICU upgrade, collations in a stock Debian PG 15.1
cluster now have divergent version information in pg_collations.
Now
gnumed_v22=> ALTER COLLATION pg_catalog."br_FR@euro" REFRESH VERSION;
ERROR: collation "pg_catalog.br_FR@euro" for encoding "UTF8" does not exist
despite
gnumed_v22=> select *, pg_encoding_to_char(collencoding) from pg_collation where collname = 'br_FR@euro';
-[ RECORD 1 ]-------+-----------
oid | 12413
collname | br_FR@euro
collnamespace | 11
collowner | 10
collprovider | c
collisdeterministic | t
collencoding | 16
collcollate | br_FR@euro
collctype | br_FR@euro
colliculocale |
collversion | 2.35
pg_encoding_to_char | LATIN9
However, note the UTF8 vs LATIN9.
The manual sayeth:
Some (less frequently used) encodings are not supported
by ICU. When the database encoding is one of these, ICU
collation entries in pg_collation are ignored. Attempting
to use one will draw an error along the lines of
“collation "de-x-icu" for encoding "WIN874" does not
exist”.
which sounds somewhat related.
The database encoding is UTF8. That br_FR@euro.LATIN9 had
_not_ been added manually. It is also not actively used in my
database(s).
What is the canonical advice on the way forward here ? Is
the _suggested_ solution to delete the collation or am I
missing to see the "proper" approach to fixing it ?
Thanks,
Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Am Sun, Dec 04, 2022 at 01:22:02PM +0100 schrieb Karsten Hilbert:
> gnumed_v22=> ALTER COLLATION pg_catalog."br_FR@euro" REFRESH VERSION;
> ERROR: collation "pg_catalog.br_FR@euro" for encoding "UTF8" does not exist
The OS (libc) does seem to know that collation:
@hermes:~$ locale -a | grep br_FR
br_FR
br_FR@euro
br_FR.utf8
Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Am Sun, Dec 04, 2022 at 01:22:02PM +0100 schrieb Karsten Hilbert: > following an ICU upgrade, collations in a stock Debian PG 15.1 > cluster now have divergent version information in pg_collations. Correction: this is following a libc upgrade 2.35 -> 2.36 Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
On 12/4/22 04:35, Karsten Hilbert wrote:
> Am Sun, Dec 04, 2022 at 01:22:02PM +0100 schrieb Karsten Hilbert:
>
>> following an ICU upgrade, collations in a stock Debian PG 15.1
>> cluster now have divergent version information in pg_collations.
>
> Correction: this is following a libc upgrade 2.35 -> 2.36
So to be clear this database is not using ICU, but collations from libc?
How was the database installed?
In first post you had:
gnumed_v22=> select *, pg_encoding_to_char(collencoding) from
pg_collation where collname = 'br_FR@euro';
-[ RECORD 1 ]-------+-----------
oid | 12413
collname | br_FR@euro
collnamespace | 11
collowner | 10
collprovider | c
collisdeterministic | t
collencoding | 16
collcollate | br_FR@euro
collctype | br_FR@euro
colliculocale |
collversion | 2.35
pg_encoding_to_char | LATIN9
where collprovider c means libc and collversion 2.35.
Not exactly sure how that interacts with from here:
https://www.postgresql.org/docs/current/catalog-pg-collation.html
collversion text
Provider-specific version of the collation. This is recorded when the
collation is created and then checked when it is used, to detect changes
in the collation definition that could lead to data corruption.
>
> Karsten
> --
> GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
>
>
--
Adrian Klaver
adrian.klaver@aklaver.com
Am Sun, Dec 04, 2022 at 10:09:47AM -0800 schrieb Adrian Klaver:
> >>following an ICU upgrade, collations in a stock Debian PG 15.1
> >>cluster now have divergent version information in pg_collations.
> >
> >Correction: this is following a libc upgrade 2.35 -> 2.36
>
> So to be clear this database is not using ICU, but collations from libc?
Sorry for the confusion.
This database carries collations from _both_ libc and ICU in
pg_collations.
The collation in question (br_FR@euro) is _not_ in use (as in
being depended on by any in-database object).
> How was the database installed?
stock Debian
apt-get install postgresql-15 (which gives 15.1)
followed by
CREATE DATABASE "gnumed_v22" with owner = "redacted :-)" template = "template1" encoding = 'unicode';
as "postgres".
> In first post you had:
>
> gnumed_v22=> select *, pg_encoding_to_char(collencoding) from pg_collation where
> collname = 'br_FR@euro';
> -[ RECORD 1 ]-------+-----------
> oid | 12413
> collname | br_FR@euro
> collnamespace | 11
> collowner | 10
> collprovider | c
> collisdeterministic | t
> collencoding | 16
> collcollate | br_FR@euro
> collctype | br_FR@euro
> colliculocale |
> collversion | 2.35
> pg_encoding_to_char | LATIN9
>
> where collprovider c means libc and collversion 2.35.
Yeah, that's when I figured that I misspoke about the ICU upgrade.
Yes, there was an ICU upgrade, and yes, it did affect
collations. Those I was able to fix up (the "reindex /
revalidate constraint / refresh collation version" dance).
There also was a libc upgrade which also affected locales.
Most of them were fixable by that dance but some popped up
(such as br_FR@euro) to not be "correctable" showing the
"does not exist for encoding" error.
Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Is this to be expected ?
PG 15.1 on Debian:
gnumed_v22=# select *, pg_collation_actual_version(oid), pg_encoding_to_char(collencoding) from pg_collation where
collname= 'zh_TW';
-[ RECORD 1 ]---------------+------------
oid | 12985
collname | zh_TW
collnamespace | 11
collowner | 10
collprovider | c
collisdeterministic | t
collencoding | 4
collcollate | zh_TW.euctw
collctype | zh_TW.euctw
colliculocale |
collversion | 2.35
pg_collation_actual_version | 2.36
pg_encoding_to_char | EUC_TW
-[ RECORD 2 ]---------------+------------
oid | 12986
collname | zh_TW
collnamespace | 11
collowner | 10
collprovider | c
collisdeterministic | t
collencoding | 6
collcollate | zh_TW.utf8
collctype | zh_TW.utf8
colliculocale |
collversion | 2.36
pg_collation_actual_version | 2.36
pg_encoding_to_char | UTF8
gnumed_v22=# begin;
BEGIN
gnumed_v22=*# alter collation pg_catalog."zh_TW" refresh version ;
NOTICE: version has not changed
ALTER COLLATION
gnumed_v22=*# alter collation pg_catalog."zh_TW.utf8" refresh version ;
NOTICE: version has not changed
ALTER COLLATION
gnumed_v22=*# alter collation pg_catalog."zh_TW.euctw" refresh version ;
ERROR: collation "pg_catalog.zh_TW.euctw" for encoding "UTF8" does not exist
gnumed_v22=!#
As far as I can tell the documentation asserts that since the
database encoding is UTF8 the pg_catalog."zh_TW.euctw" will
be ignored by the server for all practical purposes.
Does this mean it is impossible to "correct" its version
information ?
And if so, that is expected to be non-harmful and is not
expected to trigger nag messages ?
Thanks,
Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Karsten Hilbert wrote: > The database encoding is UTF8. That br_FR@euro.LATIN9 had > _not_ been added manually. It is also not actively used in my > database(s). br_FR@euro.LATIN9 cannot be used actively in an UTF-8 database because it's for a different encoding than the database. It was probably available in the OS at initdb time. Every available locale (see locale -a) gets imported into the template databases, and then into the other databases, since CREATE DATABASE copies a template without filtering out the locales that are incompatible with the target database. There's no need to upgrade anything in the OS to get the ALTER COLLATION... REFRESH error you mention. It's sufficient to have a mix of collations for different encodings and try to refresh collations whose encoding are not compatible with the current database. > What is the canonical advice on the way forward here ? Is > the _suggested_ solution to delete the collation or am I > missing to see the "proper" approach to fixing it ? ISTM that dropping that collation from any non-LATIN9 database is the more sensible action. Maybe it could be discussed as a possible improvement to have ALTER COLLATION... REFRESH ignore the database encoding and still refresh the version number. Best regards, -- Daniel Vérité https://postgresql.verite.pro/ Twitter: @DanielVerite
Dear all, I managed to drop the "special" collations default, C, and POSIX with OIDs 100, 950, 951. Is there a way to recreate them (short of restoring a backup) ? Naive attempts with create collation do not seem to work out. Thanks, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
On Wed, 2022-12-21 at 15:24 +0100, Karsten Hilbert wrote:
> I managed to drop the "special" collations default, C, and
> POSIX with OIDs 100, 950, 951.
>
> Is there a way to recreate them (short of restoring a backup)
> ? Naive attempts with create collation do not seem to work
> out.
I would definitely go for the backup, but here is how you can
create these three rows in PostgreSQL v15:
INSERT INTO pg_collation
(oid, collname, collnamespace, collowner, collprovider,
collisdeterministic, collencoding, collcollate, collctype)
VALUES
(100, 'default', 11, 10, 'd', TRUE, -1, NULL, NULL),
(950, 'C', 11, 10, 'c', TRUE, -1, 'C', 'C'),
(951, 'POSIX', 11, 10, 'c', TRUE, -1, 'POSIX', 'POSIX');
Yours,
Laurenz Albe
Am Wed, Dec 21, 2022 at 03:41:24PM +0100 schrieb Laurenz Albe: > I would definitely go for the backup, but here is how you can > create these three rows in PostgreSQL v15: > > INSERT INTO pg_collation > (oid, collname, collnamespace, collowner, collprovider, > collisdeterministic, collencoding, collcollate, collctype) > VALUES > (100, 'default', 11, 10, 'd', TRUE, -1, NULL, NULL), > (950, 'C', 11, 10, 'c', TRUE, -1, 'C', 'C'), > (951, 'POSIX', 11, 10, 'c', TRUE, -1, 'POSIX', 'POSIX'); Many thanks ! I wasn't so sure whether inserting appropriate rows would be equivalent to create collation... (pg_collation might have been a view projecting inner workings of the server engine). Thanks, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Am Wed, Dec 21, 2022 at 03:46:48PM +0100 schrieb Karsten Hilbert: > I wasn't so sure whether inserting appropriate > rows would be equivalent to create collation... For that matter, is DELETE FROM pg_collation ... equivalent to DROP COLLATION ? Thanks, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Karsten Hilbert <Karsten.Hilbert@gmx.net> writes:
> For that matter, is DELETE FROM pg_collation ... equivalent
> to DROP COLLATION ?
There's also entries in pg_depend and pg_shdepend to worry
about.
For these built-in collations, as of v15 there are no such
entries, but prior versions had explicit "pin" entries.
regards, tom lane