Обсуждение: DROP COLLATION vs pg_collation question
Dear list members, maybe a naive question but I was unable to find an answer in the fine manual (sv_SE being an example) Does running DROP COLLATION IF EXISTS pg_catalog."sv_SE" also remove the corresponding row from pg_collation (assuming nothing depends on collation sv_SE) ? Experimentation seems to hint that way but I was unable to confirm. Thanks Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
On Tue, 2024-06-11 at 23:15 +0200, Karsten Hilbert wrote: > maybe a naive question but I was unable to find an answer in > the fine manual (sv_SE being an example) > > Does running > > DROP COLLATION IF EXISTS pg_catalog."sv_SE" > > also remove the corresponding row from pg_collation (assuming > nothing depends on collation sv_SE) ? > > Experimentation seems to hint that way but I was unable to > confirm. Yes, that will delete a row from "pg_collation". System catalogs like "pg_collation" is where the database metadata are actually stored. Note that with DROP COLLATION you can only remove collations that belong to the encoding of your current database. Yours, Laurenz Albe
> > DROP COLLATION IF EXISTS pg_catalog."...." > > Yes, that will delete a row from "pg_collation". Many thanks. > Note that with DROP COLLATION you can only remove collations > that belong to the encoding of your current database. A-ha ! Can that bit be found anywhere in the docs ? IOW, the following code is exactly useless ? (because of the "collencoding <> _db_encoding" business ;-) create function gm.remove_unneeded_collations() returns void language plpgsql security definer as ' DECLARE _rec record; _db_name text; _db_encoding integer; BEGIN SELECT pg_catalog.current_database() INTO _db_name; SELECT encoding INTO _db_encoding FROM pg_database WHERE datname = _db_name; RAISE NOTICE ''database [%]: removing collations for encodings other than the database encoding [%]'', _db_name,pg_catalog.pg_encoding_to_char(_db_encoding); FOR _rec IN ( SELECT oid, collnamespace, collname, collencoding FROM pg_collation WHERE oid > 1000 AND collencoding IS NOT NULL AND collencoding <> -1 AND collencoding <> _db_encoding ) LOOP RAISE NOTICE ''dropping collation #% "%.%" (encoding: %)'', _rec.oid, _rec.collnamespace::regnamespace, _rec.collname,pg_catalog.pg_encoding_to_char(_rec.collencoding); BEGIN EXECUTE ''DROP COLLATION IF EXISTS '' || _rec.collnamespace::regnamespace || ''."'' || _rec.collname || ''"''; EXCEPTION WHEN undefined_object THEN RAISE NOTICE ''collation does not seem to exist (perhaps for the DB encoding ?)''; END; END LOOP; END;'; The reason for this being the wish to reduce the risk surface for locale version information changes at the OS level by removing collations not relevant to a given database. Thanks, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
On Wed, 2024-06-12 at 21:13 +0200, Karsten Hilbert wrote: > > Note that with DROP COLLATION you can only remove collations > > that belong to the encoding of your current database. > > A-ha ! Can that bit be found anywhere in the docs ? src/backend/catalog/namespace.c: /* * get_collation_oid - find a collation by possibly qualified name * * Note that this will only find collations that work with the current * database's encoding. */ Oid get_collation_oid(List *collname, bool missing_ok) Yours, Laurenz Albe
Am Thu, Jun 13, 2024 at 09:49:46AM +0200 schrieb Laurenz Albe: > > > Note that with DROP COLLATION you can only remove collations > > > that belong to the encoding of your current database. > > src/backend/catalog/namespace.c: > > /* > * get_collation_oid - find a collation by possibly qualified name > * > * Note that this will only find collations that work with the current > * database's encoding. > */ > Oid > get_collation_oid(List *collname, bool missing_ok) Thanks. Are collations per-database or per-cluster objects ? I am asking because I seem to not be enabled to 1) use a collation that's intended for an encoding different from the database encoding -> makes sense 2) remove a collation that's intended for an encoding different from the database encoding -> so far so good, ignore them, but 3) update collation version information in pg_collations for collations intended for an encoding different from the database encoding (ALTER COLLATION ... REFRESH VERSION fails) which in effect would mean that -- upon change of collation versions in the underlying operating system (ICU update, libc update) -- one would have to live with outdated version information in pg_collations short of dump/sed/restore or some such ? I'm pretty sure I am overlooking something. Best, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
On Fri, 2024-06-14 at 22:08 +0200, Karsten Hilbert wrote: > Are collations per-database or per-cluster objects ? Each database has its own "pg_collation" catalog table. So they are local to the database, but the collations themselves are defined by an external library, so the implementation is shared. > I am asking because I seem to not be enabled to > > 3) update collation version information in pg_collations for > collations intended for an encoding different from the > database encoding (ALTER COLLATION ... REFRESH VERSION fails) > > which in effect would mean that -- upon change of collation > versions in the underlying operating system (ICU update, libc > update) -- one would have to live with outdated version > information in pg_collations short of dump/sed/restore or > some such ? That should not happen. What error do you get when you ALTER COLLATION ... REFRESH VERSION Does the following give you the same error? ALTER DATABASE ... REFRESH COLLATION VERSION Yours, Laurenz Albe
Am Sun, Jun 16, 2024 at 06:53:31AM +0200 schrieb Laurenz Albe: > On Fri, 2024-06-14 at 22:08 +0200, Karsten Hilbert wrote: > > Are collations per-database or per-cluster objects ? > > Each database has its own "pg_collation" catalog table. > > So they are local to the database, I would have thought so, thanks for confirming. > but the collations themselves > are defined by an external library, so the implementation is shared. Which in turn means I cannot at all truly _remove_ collations from a cluster at the SQL level, only make them invisible (and thereby not-to-be-used) inside a particular database by removing them from pg_collations via DROP COLLATION, right ? > > 3) update collation version information in pg_collations for > > collations intended for an encoding different from the > > database encoding (ALTER COLLATION ... REFRESH VERSION fails) > > > > which in effect would mean that -- upon change of collation > > versions in the underlying operating system (ICU update, libc > > update) -- one would have to live with outdated version > > information in pg_collations short of dump/sed/restore or > > some such ? > > That should not happen. What error do you get when you > > ALTER COLLATION ... REFRESH VERSION The error I got was to the effect of insufficient permissions (the connected user wasn't the owner of the collation). ALTERing as superuser updated collation version information just fine, so PEBKAC. For the record: Yes, collation versions CAN be updated regardless of whether a given collation applies to the database's encoding. > Does the following give you the same error? > > ALTER DATABASE ... REFRESH COLLATION VERSION That did not show the same error because I ran it as database owner (?). The database owner does not own the collations, however, which made the above fail. Thanks, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
On Sun, 2024-06-16 at 11:27 +0200, Karsten Hilbert wrote: > > the collations themselves > > are defined by an external library, so the implementation is shared. > > Which in turn means I cannot at all truly _remove_ collations > from a cluster at the SQL level, only make them invisible > (and thereby not-to-be-used) inside a particular database by > removing them from pg_collations via DROP COLLATION, right ? As far as PostgreSQL is concerned, you can remove them. You cannot remove the C library, but a PostgreSQL user can only use the collation if there is a collation defined in PostgreSQL. (Actually, that's not quite true: in CREATE DATABASE, you can use ICU collations. That does not depend on the collations defined in pg_collation.) Also, there is nothing that keeps a user from running CREATE COLLATION to create a collation in a schema where the user can CREATE objects. I wouldn't try too hard to make it impossible for users to use a collation you don't want. Dropping the collations is good enough to keep a user from using the wrong collation by mistake. Yours, Laurenz Albe
Laurenz Albe <laurenz.albe@cybertec.at> writes: > On Sun, 2024-06-16 at 11:27 +0200, Karsten Hilbert wrote: >> Which in turn means I cannot at all truly _remove_ collations >> from a cluster at the SQL level, only make them invisible >> (and thereby not-to-be-used) inside a particular database by >> removing them from pg_collations via DROP COLLATION, right ? > As far as PostgreSQL is concerned, you can remove them. It's really kind of moot, since you can't change the encoding of an existing database. So any pg_collation entries that are for an incompatible encoding cannot be used for anything in that database, and they might as well not be there. The reason they are there is merely an implementation detail: CREATE DATABASE clones those catalogs from the single copy of pg_collation in template0, which therefore had better include all collations that might be needed. regards, tom lane
Am Sun, Jun 16, 2024 at 04:38:49PM -0400 schrieb Tom Lane: > It's really kind of moot, since you can't change the encoding > of an existing database. So any pg_collation entries that are > for an incompatible encoding cannot be used for anything in that > database, and they might as well not be there. The reason they > are there is merely an implementation detail: CREATE DATABASE clones > those catalogs from the single copy of pg_collation in template0, > which therefore had better include all collations that might be > needed. I see, and since any database can be used as a template for more databases, which can be create with an encoding different from the template, it doesn't really make too much sense to be able to remove even pg_collation entries. So, DROP COLLATION is somewhat of a smoking gun pointed at my foot :-) Thanks, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Am Tue, Jun 18, 2024 at 03:02:56PM +0200 schrieb Karsten Hilbert: > I see, and since any database can be used as a template for > more databases, which can be create with an encoding > different from the template, Proving myself wrong: root@hermes:~/tmp# sudo -u postgres psql -e -f /tmp/db.sql Output format is wrapped. drop database if exists new_tmpl; DROP DATABASE create database new_tmpl with encoding = 'utf8'; CREATE DATABASE You are now connected to database "new_tmpl" as user "postgres". select current_database(); current_database ------------------ new_tmpl (1 row) drop database if exists new_latin1; psql:/tmp/db.sql:8: HINWEIS: Datenbank »new_latin1« existiert nicht, wird übersprungen DROP DATABASE create database new_latin1 with template = new_tmpl encoding = 'latin1' locale = 'de_DE@latin1'; psql:/tmp/db.sql:9: FEHLER: neue Kodierung (LATIN1) ist inkompatibel mit der Kodierung der Template-Datenbank (UTF8) TIP: Verwenden Sie die gleiche Kodierung wie die Template-Datenbank oder verwenden Sie template0 als Template. Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
On 6/18/24 06:32, Karsten Hilbert wrote: > Am Tue, Jun 18, 2024 at 03:02:56PM +0200 schrieb Karsten Hilbert: > >> I see, and since any database can be used as a template for >> more databases, which can be create with an encoding >> different from the template, > > Proving myself wrong: > > > root@hermes:~/tmp# sudo -u postgres psql -e -f /tmp/db.sql > Output format is wrapped. > drop database if exists new_tmpl; > DROP DATABASE > create database new_tmpl with encoding = 'utf8'; > CREATE DATABASE > You are now connected to database "new_tmpl" as user "postgres". > select current_database(); > current_database > ------------------ > new_tmpl > (1 row) > > drop database if exists new_latin1; > psql:/tmp/db.sql:8: HINWEIS: Datenbank »new_latin1« existiert nicht, wird übersprungen > DROP DATABASE > create database new_latin1 with template = new_tmpl encoding = 'latin1' locale = 'de_DE@latin1'; > psql:/tmp/db.sql:9: FEHLER: neue Kodierung (LATIN1) ist inkompatibel mit der Kodierung der Template-Datenbank (UTF8) > TIP: Verwenden Sie die gleiche Kodierung wie die Template-Datenbank oder verwenden Sie template0 als Template. That is covered here: https://www.postgresql.org/docs/16/manage-ag-templatedbs.html "Another common reason for copying template0 instead of template1 is that new encoding and locale settings can be specified when copying template0, whereas a copy of template1 must use the same settings it does. This is because template1 might contain encoding-specific or locale-specific data, while template0 is known not to." Substitute <db_name other then template0> for template1 and you can get the error you received. > > Karsten > -- > GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B > > -- Adrian Klaver adrian.klaver@aklaver.com