Обсуждение: Improving collation-dependent indexes in system catalogs

Поиск
Список
Период
Сортировка

Improving collation-dependent indexes in system catalogs

От
Tom Lane
Дата:
Awhile back we noticed that a couple of system catalogs had
acquired indexes on "text" columns, which were unsafe because
their sort order was collation-dependent, so that cloning
template0 with a different database collation could yield
broken indexes.  We fixed this in commit 0b28ea79 with a
bit of a hack: we said it's okay to have such indexes as long
as they use the text_pattern_ops opclass, making them
collation insensitive.

While fooling with the idea of making type "name" collation
aware, it occurred to me that there's a better, more general
answer, which is to insist that collation-aware system catalog
columns must be marked with C collation.  This rule would apply
without modification to both "text" and "name" columns.  In the
wake of commit 5e0928005, it also means that pg_statistic data
for such a column would port safely across a database collation
change, which up to now it does not.  And I think we could have
the bootstrap code apply the rule automatically, making for one
less way to screw up when changing catalog definitions.

Thoughts, objections?

            regards, tom lane


Re: Improving collation-dependent indexes in system catalogs

От
Tom Lane
Дата:
I wrote:
> While fooling with the idea of making type "name" collation
> aware, it occurred to me that there's a better, more general
> answer, which is to insist that collation-aware system catalog
> columns must be marked with C collation.  This rule would apply
> without modification to both "text" and "name" columns.  In the
> wake of commit 5e0928005, it also means that pg_statistic data
> for such a column would port safely across a database collation
> change, which up to now it does not.  And I think we could have
> the bootstrap code apply the rule automatically, making for one
> less way to screw up when changing catalog definitions.

Concretely, this ...

            regards, tom lane

diff --git a/src/backend/access/common/scankey.c b/src/backend/access/common/scankey.c
index 781516c..5be4fe8 100644
*** a/src/backend/access/common/scankey.c
--- b/src/backend/access/common/scankey.c
*************** ScanKeyEntryInitialize(ScanKey entry,
*** 64,72 ****
   * It cannot handle NULL arguments, unary operators, or nondefault operators,
   * but we need none of those features for most hardwired lookups.
   *
!  * We set collation to DEFAULT_COLLATION_OID always.  This is appropriate
!  * for textual columns in system catalogs, and it will be ignored for
!  * non-textual columns, so it's not worth trying to be more finicky.
   *
   * Note: CurrentMemoryContext at call should be as long-lived as the ScanKey
   * itself, because that's what will be used for any subsidiary info attached
--- 64,72 ----
   * It cannot handle NULL arguments, unary operators, or nondefault operators,
   * but we need none of those features for most hardwired lookups.
   *
!  * We set collation to C_COLLATION_OID always.  This is the correct value
!  * for all collation-aware columns in system catalogs, and it will be ignored
!  * for other column types, so it's not worth trying to be more finicky.
   *
   * Note: CurrentMemoryContext at call should be as long-lived as the ScanKey
   * itself, because that's what will be used for any subsidiary info attached
*************** ScanKeyInit(ScanKey entry,
*** 83,89 ****
      entry->sk_attno = attributeNumber;
      entry->sk_strategy = strategy;
      entry->sk_subtype = InvalidOid;
!     entry->sk_collation = DEFAULT_COLLATION_OID;
      entry->sk_argument = argument;
      fmgr_info(procedure, &entry->sk_func);
  }
--- 83,89 ----
      entry->sk_attno = attributeNumber;
      entry->sk_strategy = strategy;
      entry->sk_subtype = InvalidOid;
!     entry->sk_collation = C_COLLATION_OID;
      entry->sk_argument = argument;
      fmgr_info(procedure, &entry->sk_func);
  }
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 7caab64..8e42255 100644
*** a/src/backend/bootstrap/bootstrap.c
--- b/src/backend/bootstrap/bootstrap.c
*************** DefineAttr(char *name, char *type, int a
*** 744,749 ****
--- 744,758 ----
              attrtypes[attnum]->attndims = 0;
      }

+     /*
+      * If a system catalog column is collation-aware, force it to use C
+      * collation, so that its behavior is independent of the database's
+      * collation.  This is essential to allow template0 to be cloned with a
+      * different database collation.
+      */
+     if (OidIsValid(attrtypes[attnum]->attcollation))
+         attrtypes[attnum]->attcollation = C_COLLATION_OID;
+
      attrtypes[attnum]->attstattarget = -1;
      attrtypes[attnum]->attcacheoff = -1;
      attrtypes[attnum]->atttypmod = -1;
diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index 8e2a248..115e4c6 100644
*** a/src/backend/catalog/genbki.pl
--- b/src/backend/catalog/genbki.pl
*************** my $GenbkiNextOid = $FirstGenbkiObjectId
*** 167,172 ****
--- 167,175 ----
  my $BOOTSTRAP_SUPERUSERID =
    Catalog::FindDefinedSymbolFromData($catalog_data{pg_authid},
      'BOOTSTRAP_SUPERUSERID');
+ my $C_COLLATION_OID =
+   Catalog::FindDefinedSymbolFromData($catalog_data{pg_collation},
+     'C_COLLATION_OID');
  my $PG_CATALOG_NAMESPACE =
    Catalog::FindDefinedSymbolFromData($catalog_data{pg_namespace},
      'PG_CATALOG_NAMESPACE');
*************** sub morph_row_for_pgattr
*** 693,699 ****

      # set attndims if it's an array type
      $row->{attndims} = $type->{typcategory} eq 'A' ? '1' : '0';
!     $row->{attcollation} = $type->{typcollation};

      if (defined $attr->{forcenotnull})
      {
--- 696,705 ----

      # set attndims if it's an array type
      $row->{attndims} = $type->{typcategory} eq 'A' ? '1' : '0';
!
!     # collation-aware catalog columns must use C collation
!     $row->{attcollation} = $type->{typcollation} != 0 ?
!         $C_COLLATION_OID : 0;

      if (defined $attr->{forcenotnull})
      {
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index b31fd5a..fdbd6a0 100644
*** a/src/backend/utils/cache/catcache.c
--- b/src/backend/utils/cache/catcache.c
***************
*** 22,27 ****
--- 22,28 ----
  #include "access/tuptoaster.h"
  #include "access/valid.h"
  #include "access/xact.h"
+ #include "catalog/pg_collation.h"
  #include "catalog/pg_operator.h"
  #include "catalog/pg_type.h"
  #include "miscadmin.h"
*************** CatalogCacheInitializeCache(CatCache *ca
*** 1014,1021 ****
          /* Fill in sk_strategy as well --- always standard equality */
          cache->cc_skey[i].sk_strategy = BTEqualStrategyNumber;
          cache->cc_skey[i].sk_subtype = InvalidOid;
!         /* Currently, there are no catcaches on collation-aware data types */
!         cache->cc_skey[i].sk_collation = InvalidOid;

          CACHE4_elog(DEBUG2, "CatalogCacheInitializeCache %s %d %p",
                      cache->cc_relname,
--- 1015,1022 ----
          /* Fill in sk_strategy as well --- always standard equality */
          cache->cc_skey[i].sk_strategy = BTEqualStrategyNumber;
          cache->cc_skey[i].sk_subtype = InvalidOid;
!         /* If a catcache key requires a collation, it must be C collation */
!         cache->cc_skey[i].sk_collation = C_COLLATION_OID;

          CACHE4_elog(DEBUG2, "CatalogCacheInitializeCache %s %d %p",
                      cache->cc_relname,
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index 2359b4c..f0c52d9 100644
*** a/src/include/catalog/indexing.h
--- b/src/include/catalog/indexing.h
*************** DECLARE_UNIQUE_INDEX(pg_default_acl_oid_
*** 310,319 ****
  DECLARE_UNIQUE_INDEX(pg_db_role_setting_databaseid_rol_index, 2965, on pg_db_role_setting using btree(setdatabase
oid_ops,setrole oid_ops)); 
  #define DbRoleSettingDatidRolidIndexId    2965

! DECLARE_UNIQUE_INDEX(pg_seclabel_object_index, 3597, on pg_seclabel using btree(objoid oid_ops, classoid oid_ops,
objsubidint4_ops, provider text_pattern_ops)); 
  #define SecLabelObjectIndexId                3597

! DECLARE_UNIQUE_INDEX(pg_shseclabel_object_index, 3593, on pg_shseclabel using btree(objoid oid_ops, classoid oid_ops,
providertext_pattern_ops)); 
  #define SharedSecLabelObjectIndexId            3593

  DECLARE_UNIQUE_INDEX(pg_extension_oid_index, 3080, on pg_extension using btree(oid oid_ops));
--- 310,319 ----
  DECLARE_UNIQUE_INDEX(pg_db_role_setting_databaseid_rol_index, 2965, on pg_db_role_setting using btree(setdatabase
oid_ops,setrole oid_ops)); 
  #define DbRoleSettingDatidRolidIndexId    2965

! DECLARE_UNIQUE_INDEX(pg_seclabel_object_index, 3597, on pg_seclabel using btree(objoid oid_ops, classoid oid_ops,
objsubidint4_ops, provider text_ops)); 
  #define SecLabelObjectIndexId                3597

! DECLARE_UNIQUE_INDEX(pg_shseclabel_object_index, 3593, on pg_shseclabel using btree(objoid oid_ops, classoid oid_ops,
providertext_ops)); 
  #define SharedSecLabelObjectIndexId            3593

  DECLARE_UNIQUE_INDEX(pg_extension_oid_index, 3080, on pg_extension using btree(oid oid_ops));
*************** DECLARE_UNIQUE_INDEX(pg_policy_polrelid_
*** 333,339 ****
  DECLARE_UNIQUE_INDEX(pg_replication_origin_roiident_index, 6001, on pg_replication_origin using btree(roident
oid_ops));
  #define ReplicationOriginIdentIndex 6001

! DECLARE_UNIQUE_INDEX(pg_replication_origin_roname_index, 6002, on pg_replication_origin using btree(roname
text_pattern_ops));
  #define ReplicationOriginNameIndex 6002

  DECLARE_UNIQUE_INDEX(pg_partitioned_table_partrelid_index, 3351, on pg_partitioned_table using btree(partrelid
oid_ops));
--- 333,339 ----
  DECLARE_UNIQUE_INDEX(pg_replication_origin_roiident_index, 6001, on pg_replication_origin using btree(roident
oid_ops));
  #define ReplicationOriginIdentIndex 6001

! DECLARE_UNIQUE_INDEX(pg_replication_origin_roname_index, 6002, on pg_replication_origin using btree(roname
text_ops));
  #define ReplicationOriginNameIndex 6002

  DECLARE_UNIQUE_INDEX(pg_partitioned_table_partrelid_index, 3351, on pg_partitioned_table using btree(partrelid
oid_ops));
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index 6072f6b..52cd7b9 100644
*** a/src/test/regress/expected/opr_sanity.out
--- b/src/test/regress/expected/opr_sanity.out
*************** ORDER BY 1;
*** 2060,2077 ****
  -- a representational error in pg_index, but simply wrong catalog design.
  -- It's bad because we expect to be able to clone template0 and assign the
  -- copy a different database collation.  It would especially not work for
! -- shared catalogs.  Note that although text columns will show a collation
! -- in indcollation, they're still okay to index with text_pattern_ops,
! -- so allow that case.
  SELECT indexrelid::regclass, indrelid::regclass, iclass, icoll
  FROM (SELECT indexrelid, indrelid,
               unnest(indclass) as iclass, unnest(indcollation) as icoll
        FROM pg_index
        WHERE indrelid < 16384) ss
! WHERE icoll != 0 AND iclass !=
!     (SELECT oid FROM pg_opclass
!      WHERE opcname = 'text_pattern_ops' AND opcmethod =
!            (SELECT oid FROM pg_am WHERE amname = 'btree'));
   indexrelid | indrelid | iclass | icoll
  ------------+----------+--------+-------
  (0 rows)
--- 2060,2073 ----
  -- a representational error in pg_index, but simply wrong catalog design.
  -- It's bad because we expect to be able to clone template0 and assign the
  -- copy a different database collation.  It would especially not work for
! -- shared catalogs.  Collation-sensitive indexes should have "C" collation.
  SELECT indexrelid::regclass, indrelid::regclass, iclass, icoll
  FROM (SELECT indexrelid, indrelid,
               unnest(indclass) as iclass, unnest(indcollation) as icoll
        FROM pg_index
        WHERE indrelid < 16384) ss
! WHERE icoll != 0 AND
!     icoll != (SELECT oid FROM pg_collation WHERE collname = 'C');
   indexrelid | indrelid | iclass | icoll
  ------------+----------+--------+-------
  (0 rows)
diff --git a/src/test/regress/sql/opr_sanity.sql b/src/test/regress/sql/opr_sanity.sql
index 91c68f4..5cf4df1 100644
*** a/src/test/regress/sql/opr_sanity.sql
--- b/src/test/regress/sql/opr_sanity.sql
*************** ORDER BY 1;
*** 1333,1348 ****
  -- a representational error in pg_index, but simply wrong catalog design.
  -- It's bad because we expect to be able to clone template0 and assign the
  -- copy a different database collation.  It would especially not work for
! -- shared catalogs.  Note that although text columns will show a collation
! -- in indcollation, they're still okay to index with text_pattern_ops,
! -- so allow that case.

  SELECT indexrelid::regclass, indrelid::regclass, iclass, icoll
  FROM (SELECT indexrelid, indrelid,
               unnest(indclass) as iclass, unnest(indcollation) as icoll
        FROM pg_index
        WHERE indrelid < 16384) ss
! WHERE icoll != 0 AND iclass !=
!     (SELECT oid FROM pg_opclass
!      WHERE opcname = 'text_pattern_ops' AND opcmethod =
!            (SELECT oid FROM pg_am WHERE amname = 'btree'));
--- 1333,1344 ----
  -- a representational error in pg_index, but simply wrong catalog design.
  -- It's bad because we expect to be able to clone template0 and assign the
  -- copy a different database collation.  It would especially not work for
! -- shared catalogs.  Collation-sensitive indexes should have "C" collation.

  SELECT indexrelid::regclass, indrelid::regclass, iclass, icoll
  FROM (SELECT indexrelid, indrelid,
               unnest(indclass) as iclass, unnest(indcollation) as icoll
        FROM pg_index
        WHERE indrelid < 16384) ss
! WHERE icoll != 0 AND
!     icoll != (SELECT oid FROM pg_collation WHERE collname = 'C');

Re: Improving collation-dependent indexes in system catalogs

От
Alvaro Herrera
Дата:
On 2018-Dec-15, Tom Lane wrote:

> I wrote:
> > While fooling with the idea of making type "name" collation
> > aware, it occurred to me that there's a better, more general
> > answer, which is to insist that collation-aware system catalog
> > columns must be marked with C collation.

> Concretely, this ...

Looks sane in a quick once-over.

I notice that some information_schema view columns end up with C
collation after this patch, and others remain with default collation.
Is that sensible?  (I think the only two cases where this might matter
at all are information_schema.parameters.parameter_name,
information_schema.routines.external_name and
information_schema.foreign_servers.foreign_server_type.)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Improving collation-dependent indexes in system catalogs

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> I notice that some information_schema view columns end up with C
> collation after this patch, and others remain with default collation.
> Is that sensible?  (I think the only two cases where this might matter
> at all are information_schema.parameters.parameter_name,
> information_schema.routines.external_name and
> information_schema.foreign_servers.foreign_server_type.)

Yeah.  Looking closer at that, there are no collation-sensitive indexes
in information_schema (if there were, the existing opr_sanity test would
have caught 'em).  But there are collation-sensitive table columns, which
do have pg_statistic entries, and those entries are at least nominally
broken by copying them into a database with a different default collation.

We could think of two ways to deal with that.  One is to plaster
COLLATE "C" on each textual table column in the information_schema.
A more aggressive approach is to attach COLLATE "C" to each of the
domain types that information_schema defines, which fixes the table
columns a fortiori, and also causes all of the exposed information_schema
view columns to acquire database-independent collations.

I tried both ways, as in the attached patches below (each meant to be
applied on top of my patch upthread), and they both pass check-world.

A possible advantage of the second approach is that it could end up
allowing comparisons on information_schema view columns to be translated
to indexable comparisons on the underlying "name" columns, which would
be a pleasant outcome.  On the other hand, people might be annoyed by
the semantics change, if they'd previously been doing that with the
expectation of getting database-collation-based comparisons.

I'm not sure whether the SQL standard says anything that either patch
would be violating.  I see that it does say that these domains have
CHARACTER SET SQL_TEXT, and that the collation of that character set
is implementation-defined, so I think we could get away with changing
so far as spec compliance is concerned.

            regards, tom lane

diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index 6227a8f3..742e2b6 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -1576,13 +1576,13 @@ GRANT SELECT ON sequences TO PUBLIC;
  */

 CREATE TABLE sql_features (
-    feature_id          character_data,
-    feature_name        character_data,
-    sub_feature_id      character_data,
-    sub_feature_name    character_data,
-    is_supported        yes_or_no,
-    is_verified_by      character_data,
-    comments            character_data
+    feature_id          character_data COLLATE "C",
+    feature_name        character_data COLLATE "C",
+    sub_feature_id      character_data COLLATE "C",
+    sub_feature_name    character_data COLLATE "C",
+    is_supported        yes_or_no COLLATE "C",
+    is_verified_by      character_data COLLATE "C",
+    comments            character_data COLLATE "C"
 );

 -- Will be filled with external data by initdb.
@@ -1599,11 +1599,11 @@ GRANT SELECT ON sql_features TO PUBLIC;
 -- clause 9.1.

 CREATE TABLE sql_implementation_info (
-    implementation_info_id      character_data,
-    implementation_info_name    character_data,
+    implementation_info_id      character_data COLLATE "C",
+    implementation_info_name    character_data COLLATE "C",
     integer_value               cardinal_number,
-    character_value             character_data,
-    comments                    character_data
+    character_value             character_data COLLATE "C",
+    comments                    character_data COLLATE "C"
 );

 INSERT INTO sql_implementation_info VALUES ('10003', 'CATALOG NAME', NULL, 'Y', NULL);
@@ -1628,13 +1628,13 @@ GRANT SELECT ON sql_implementation_info TO PUBLIC;
  */

 CREATE TABLE sql_languages (
-    sql_language_source         character_data,
-    sql_language_year           character_data,
-    sql_language_conformance    character_data,
-    sql_language_integrity      character_data,
-    sql_language_implementation character_data,
-    sql_language_binding_style  character_data,
-    sql_language_programming_language character_data
+    sql_language_source         character_data COLLATE "C",
+    sql_language_year           character_data COLLATE "C",
+    sql_language_conformance    character_data COLLATE "C",
+    sql_language_integrity      character_data COLLATE "C",
+    sql_language_implementation character_data COLLATE "C",
+    sql_language_binding_style  character_data COLLATE "C",
+    sql_language_programming_language character_data COLLATE "C"
 );

 INSERT INTO sql_languages VALUES ('ISO 9075', '1999', 'CORE', NULL, NULL, 'DIRECT', NULL);
@@ -1651,11 +1651,11 @@ GRANT SELECT ON sql_languages TO PUBLIC;
  */

 CREATE TABLE sql_packages (
-    feature_id      character_data,
-    feature_name    character_data,
-    is_supported    yes_or_no,
-    is_verified_by  character_data,
-    comments        character_data
+    feature_id      character_data COLLATE "C",
+    feature_name    character_data COLLATE "C",
+    is_supported    yes_or_no COLLATE "C",
+    is_verified_by  character_data COLLATE "C",
+    comments        character_data COLLATE "C"
 );

 INSERT INTO sql_packages VALUES ('PKG000', 'Core', 'NO', NULL, '');
@@ -1678,11 +1678,11 @@ GRANT SELECT ON sql_packages TO PUBLIC;
  */

 CREATE TABLE sql_parts (
-    feature_id      character_data,
-    feature_name    character_data,
-    is_supported    yes_or_no,
-    is_verified_by  character_data,
-    comments        character_data
+    feature_id      character_data COLLATE "C",
+    feature_name    character_data COLLATE "C",
+    is_supported    yes_or_no COLLATE "C",
+    is_verified_by  character_data COLLATE "C",
+    comments        character_data COLLATE "C"
 );

 INSERT INTO sql_parts VALUES ('1', 'Framework (SQL/Framework)', 'NO', NULL, '');
@@ -1705,9 +1705,9 @@ INSERT INTO sql_parts VALUES ('14', 'XML-Related Specifications (SQL/XML)', 'YES

 CREATE TABLE sql_sizing (
     sizing_id       cardinal_number,
-    sizing_name     character_data,
+    sizing_name     character_data COLLATE "C",
     supported_value cardinal_number,
-    comments        character_data
+    comments        character_data COLLATE "C"
 );

 INSERT INTO sql_sizing VALUES (34,    'MAXIMUM CATALOG NAME LENGTH', 63, NULL);
@@ -1753,10 +1753,10 @@ GRANT SELECT ON sql_sizing TO PUBLIC;

 CREATE TABLE sql_sizing_profiles (
     sizing_id       cardinal_number,
-    sizing_name     character_data,
-    profile_id      character_data,
+    sizing_name     character_data COLLATE "C",
+    profile_id      character_data COLLATE "C",
     required_value  cardinal_number,
-    comments        character_data
+    comments        character_data COLLATE "C"
 );

 GRANT SELECT ON sql_sizing_profiles TO PUBLIC;
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index 52cd7b9..6dca1b7 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -2060,7 +2060,18 @@ ORDER BY 1;
 -- a representational error in pg_index, but simply wrong catalog design.
 -- It's bad because we expect to be able to clone template0 and assign the
 -- copy a different database collation.  It would especially not work for
--- shared catalogs.  Collation-sensitive indexes should have "C" collation.
+-- shared catalogs.
+SELECT relname, attname, attcollation
+FROM pg_class c, pg_attribute a
+WHERE c.oid = attrelid AND c.oid < 16384 AND
+    c.relkind != 'v' AND  -- we don't care about columns in views
+    attcollation != 0 AND
+    attcollation != (SELECT oid FROM pg_collation WHERE collname = 'C');
+ relname | attname | attcollation
+---------+---------+--------------
+(0 rows)
+
+-- Double-check that collation-sensitive indexes have "C" collation, too.
 SELECT indexrelid::regclass, indrelid::regclass, iclass, icoll
 FROM (SELECT indexrelid, indrelid,
              unnest(indclass) as iclass, unnest(indcollation) as icoll
diff --git a/src/test/regress/sql/opr_sanity.sql b/src/test/regress/sql/opr_sanity.sql
index 5cf4df1..64eca7e 100644
--- a/src/test/regress/sql/opr_sanity.sql
+++ b/src/test/regress/sql/opr_sanity.sql
@@ -1333,7 +1333,16 @@ ORDER BY 1;
 -- a representational error in pg_index, but simply wrong catalog design.
 -- It's bad because we expect to be able to clone template0 and assign the
 -- copy a different database collation.  It would especially not work for
--- shared catalogs.  Collation-sensitive indexes should have "C" collation.
+-- shared catalogs.
+
+SELECT relname, attname, attcollation
+FROM pg_class c, pg_attribute a
+WHERE c.oid = attrelid AND c.oid < 16384 AND
+    c.relkind != 'v' AND  -- we don't care about columns in views
+    attcollation != 0 AND
+    attcollation != (SELECT oid FROM pg_collation WHERE collname = 'C');
+
+-- Double-check that collation-sensitive indexes have "C" collation, too.

 SELECT indexrelid::regclass, indrelid::regclass, iclass, icoll
 FROM (SELECT indexrelid, indrelid,
diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index 6227a8f3..0fbcfa8 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -208,7 +208,7 @@ CREATE DOMAIN cardinal_number AS integer
  * CHARACTER_DATA domain
  */

-CREATE DOMAIN character_data AS character varying;
+CREATE DOMAIN character_data AS character varying COLLATE "C";


 /*
@@ -216,7 +216,7 @@ CREATE DOMAIN character_data AS character varying;
  * SQL_IDENTIFIER domain
  */

-CREATE DOMAIN sql_identifier AS character varying;
+CREATE DOMAIN sql_identifier AS character varying COLLATE "C";


 /*
@@ -243,7 +243,7 @@ CREATE DOMAIN time_stamp AS timestamp(2) with time zone
  * YES_OR_NO domain
  */

-CREATE DOMAIN yes_or_no AS character varying(3)
+CREATE DOMAIN yes_or_no AS character varying(3) COLLATE "C"
     CONSTRAINT yes_or_no_check CHECK (value IN ('YES', 'NO'));


diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index 52cd7b9..6dca1b7 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -2060,7 +2060,18 @@ ORDER BY 1;
 -- a representational error in pg_index, but simply wrong catalog design.
 -- It's bad because we expect to be able to clone template0 and assign the
 -- copy a different database collation.  It would especially not work for
--- shared catalogs.  Collation-sensitive indexes should have "C" collation.
+-- shared catalogs.
+SELECT relname, attname, attcollation
+FROM pg_class c, pg_attribute a
+WHERE c.oid = attrelid AND c.oid < 16384 AND
+    c.relkind != 'v' AND  -- we don't care about columns in views
+    attcollation != 0 AND
+    attcollation != (SELECT oid FROM pg_collation WHERE collname = 'C');
+ relname | attname | attcollation
+---------+---------+--------------
+(0 rows)
+
+-- Double-check that collation-sensitive indexes have "C" collation, too.
 SELECT indexrelid::regclass, indrelid::regclass, iclass, icoll
 FROM (SELECT indexrelid, indrelid,
              unnest(indclass) as iclass, unnest(indcollation) as icoll
diff --git a/src/test/regress/sql/opr_sanity.sql b/src/test/regress/sql/opr_sanity.sql
index 5cf4df1..64eca7e 100644
--- a/src/test/regress/sql/opr_sanity.sql
+++ b/src/test/regress/sql/opr_sanity.sql
@@ -1333,7 +1333,16 @@ ORDER BY 1;
 -- a representational error in pg_index, but simply wrong catalog design.
 -- It's bad because we expect to be able to clone template0 and assign the
 -- copy a different database collation.  It would especially not work for
--- shared catalogs.  Collation-sensitive indexes should have "C" collation.
+-- shared catalogs.
+
+SELECT relname, attname, attcollation
+FROM pg_class c, pg_attribute a
+WHERE c.oid = attrelid AND c.oid < 16384 AND
+    c.relkind != 'v' AND  -- we don't care about columns in views
+    attcollation != 0 AND
+    attcollation != (SELECT oid FROM pg_collation WHERE collname = 'C');
+
+-- Double-check that collation-sensitive indexes have "C" collation, too.

 SELECT indexrelid::regclass, indrelid::regclass, iclass, icoll
 FROM (SELECT indexrelid, indrelid,