Обсуждение: Two questions about "pg_constraint"

Поиск
Список
Период
Сортировка

Two questions about "pg_constraint"

От
Bryn Llewellyn
Дата:
*Question 1: why does "pg_constraint" have a "connamespace" column?*

I created this temporary view (using PG 14.4):

create temporary view all_constraints(t_owner, t_schema, t_name, c_name, same) as
select
  r.rolname,
  s.nspname,
  c.relname,
  x.conname,
  (x.connamespace = c.relnamespace)
from
  pg_class c
  inner join
  pg_roles r
  on c.relowner = r.oid
  inner join
  pg_namespace s
  on c.relnamespace = s.oid
  inner join pg_constraint x
  on c.oid = x.conrelid
where c.relkind = 'r';

I created three tables, each with a user-created constraint. The tables also have implicitly created primary key constraints.

Then I did this:

select count(*) from all_constraints;

It said that the count is over a hundred. (All but the rows for my three tables are for rows for tables in the "pg_catalog" schema.)

Then I did this:

select exists(select 1 from all_constraints where not same)::text;

It said "false".

Over one hundred seems to be a fair sample size. So it seems to be reasonable to assume that "pg_constraint.connamespace = pg_class.relnamespace" is always true. Ordinary common-sense analysis of the query suggests this too. If the hypothesis is right, then "connamespace" is simply a derived value. And this would be a departure from usual table design practice.

What do you think?

*Question 2: what happened to the column "consrc"?*

The PG 11 account of "pg_constraint"

describes "consrc" (text) thus:

« If a check constraint, a human-readable representation of the expression »

Ad hoc queries in my PG 11.9 env show results like « (v = lower(v)) » in this column for my tables. This is useful information. But the PG 14 version of "pg_constraint" has no such column (and nor does the doc mention it). Is this information now exposed somewhere else?

Re: Two questions about "pg_constraint"

От
Adrian Klaver
Дата:
On 8/24/22 13:11, Bryn Llewellyn wrote:
> *Question 1: why does "pg_constraint" have a "connamespace" column?*
> 

> What do you think?
> 
> *Question 2: what happened to the column "consrc"?*
> 
> The PG 11 account of "pg_constraint"
> https://www.postgresql.org/docs/11/catalog-pg-constraint.html 
> <https://www.postgresql.org/docs/11/catalog-pg-constraint.html>
> 
> describes "consrc" (text) thus:
> 
> « If a check constraint, a human-readable representation of the expression »
> 
> Ad hoc queries in my PG 11.9 env show results like « (v = lower(v)) » in 
> this column for my tables. This is useful information. But the PG 14 
> version of "pg_constraint" has no such column (and nor does the doc 
> mention it). Is this information now exposed somewhere else?
> 

It was in 11 but not later, so lets look at release notes from 12 --> 14

Here in 12:

"

Remove obsolete pg_constraint.consrc column (Peter Eisentraut)

This column has been deprecated for a long time, because it did not 
update in response to other catalog changes (such as column renamings). 
The recommended way to get a text version of a check constraint's 
expression from pg_constraint is pg_get_expr(conbin, conrelid). 
pg_get_constraintdef() is also a useful alternative.
"

-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Two questions about "pg_constraint"

От
Adrian Klaver
Дата:
On 8/24/22 13:17, Adrian Klaver wrote:
> On 8/24/22 13:11, Bryn Llewellyn wrote:

>> Ad hoc queries in my PG 11.9 env show results like « (v = lower(v)) » 
>> in this column for my tables. This is useful information. But the PG 
>> 14 version of "pg_constraint" has no such column (and nor does the doc 
>> mention it). Is this information now exposed somewhere else?
>>
> 
> It was in 11 but not later, so lets look at release notes from 12 --> 14

Should have been ... not 14.




-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Two questions about "pg_constraint"

От
Adrian Klaver
Дата:
On 8/24/22 13:11, Bryn Llewellyn wrote:
> *Question 1: why does "pg_constraint" have a "connamespace" column?*
> 
> I created this temporary view (using PG 14.4):
> 
> *create temporary view all_constraints(t_owner, t_schema, t_name, 
> c_name, same) as
> select
>    r.rolname,
>    s.nspname,
>    c.relname,
>    x.conname,
>    (x.connamespace = c.relnamespace)
> from
>    pg_class c
>    inner join
>    pg_roles r
>    on c.relowner = r.oid
>    inner join
>    pg_namespace s
>    on c.relnamespace = s.oid
>    inner join pg_constraint x
>    on c.oid = x.conrelid
> where c.relkind = 'r';
> *
> 
> I created three tables, each with a user-created constraint. The tables 
> also have implicitly created primary key constraints.
> 
> Then I did this:
> 
> *select count(*) from all_constraints;
> *
> It said that the count is over a hundred. (All but the rows for my three 
> tables are for rows for tables in the "pg_catalog" schema.)
> 
> Then I did this:
> 
> *select exists(select 1 from all_constraints where not same)::text;
> *
> It said "false".
> 
> Over one hundred seems to be a fair sample size. So it seems to be 
> reasonable to assume that "pg_constraint.connamespace = 
> pg_class.relnamespace" is always true. Ordinary common-sense analysis of 
> the query suggests this too. If the hypothesis is right, then 
> "connamespace" is simply a derived value. And this would be a departure 
> from usual table design practice.
> 
> What do you think?
> 

create table c1 (id integer, constraint pk1 primary key(id));
CREATE TABLE

create table c2 (id integer, constraint pk1 primary key(id));
ERROR:  relation "pk1" already exists


create table test.c2 (id integer, constraint pk1 primary key(id));
CREATE TABLE

  select conname, connamespace from pg_constraint where conname = 'pk1';
  conname | connamespace
---------+--------------
  pk1     |         2200
  pk1     |        59706


From:

https://www.postgresql.org/docs/current/catalog-pg-constraint.html

conname name

Constraint name (not necessarily unique!)

So connamespace makes it unique.

-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Two questions about "pg_constraint"

От
Bryn Llewellyn
Дата:
adrian.klaver@aklaver.com wrote:

bryn@yugabyte.com wrote:

Question 2: what happened to the column "consrc"?

It was in 11 but not later, so let's look at release notes...

HERE:
«
Remove obsolete pg_constraint.consrc column (Peter Eisentraut)

This column has been deprecated for a long time, because it did not update in response to other catalog changes (such as column renamings). The recommended way to get a text version of a check constraint's expression from pg_constraint is pg_get_expr(conbin, conrelid). pg_get_constraintdef() is also a useful alternative.
»

Thanks, Adrian. I should have tried Google for "remove pg_constraint consrc". This finds the quote as the top hit.

I confirmed that "pg_get_expr(conbin, conrelid)" shows « (v = lower(v)) » for my example table.

Re: Two questions about "pg_constraint"

От
Tom Lane
Дата:
Bryn Llewellyn <bryn@yugabyte.com> writes:
> *Question 1: why does "pg_constraint" have a "connamespace" column?*

You appear to be assuming that every pg_constraint entry is tied to
a table.  This isn't so.

(1) That catalog also carries check constraints for domains, which
are tied to types instead.  Yeah, you could imagine some rule like
"look in either pg_class or pg_type to find the schema", but it'd
be really painful.

(2) The SQL standard describes "assertions", which are global
check constraints that can affect multiple tables.  We don't
support those, and very possibly never will, but the pg_constraint
catalog is set up to support them.  Presumably they'd be stored
with conrelid and contypid both zero, so there would be no other
place to find out the assertion's schema.

I'm not entirely convinced that putting these two (or three) sorts
of objects in the same catalog was a great design.  However, that's
what we've got and changing it seems like more trouble than it'd be
worth.

            regards, tom lane



Re: Two questions about "pg_constraint"

От
Adrian Klaver
Дата:
On 8/24/22 13:43, Bryn Llewellyn wrote:
>> /adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com> wrote:/
>>

> 
> Thanks, Adrian. I should have tried Google for "remove pg_constraint 
> consrc". This finds the quote as the top hit.

To see all release notes together go here:

https://bucardo.org/postgres_all_versions

Then you can search on the page.

> 
> I confirmed that "pg_get_expr(conbin, conrelid)" shows « (v = lower(v)) 
> » for my example table.


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Greg Sabino Mullane ? Re: Two questions about "pg_constraint"

От
Bryn Llewellyn
Дата:

Thanks, Adrian. I should have tried Google for "remove pg_constraint consrc". This finds the quote as the top hit.

To see all release notes together go here:

https://bucardo.org/postgres_all_versions

Then you can search on the page.

I confirmed that "pg_get_expr(conbin, conrelid)" shows « (v = lower(v)) » for my example table.

Thanks again, Andrian. That's gold-dust. I see that the page says:

« This page was generated on August 11, 2022 by a script (version 1.30) by Greg Sabino Mullane, and contains information for 461 versions of Postgres. »

And I went on to find this:


Do you see posts to this list, Greg? If so, then thank you very much!

Re: Two questions about "pg_constraint"

От
Bryn Llewellyn
Дата:
tgl@sss.pgh.pa.us wrote:

bryn@yugabyte.com writes:

Question 1: why does "pg_constraint" have a "connamespace" column?

You appear to be assuming that every pg_constraint entry is tied to a table. This isn't so.

I see how this might have appeared to be the case. But I did, very much, know about the possibility to define a constraint on a domain. (It's a very valuable feature and I've used it a lot.)

(1) That catalog also carries check constraints for domains, which are tied to types instead. Yeah, you could imagine some rule like "look in either pg_class or pg_type to find the schema", but it'd be really painful.

Yes, I knew that too. The documentation of the catalog tables is terse to a fault—and I, at least, have to do empirical tests to try to discern the proper mental model. I'm probably misunderstanding some things. Having said this, the results from these query seem to be informative. I did it in a "play" database where I'd created both a table with a column check constraint and a domain with a value constraint.

select count(*) from pg_constraint where (conrelid <> 0 and contypid =   0); -->> 107 rows

select count(*) from pg_constraint where (conrelid =  0 and contypid <>  0); -->> 3 rows

select exists(
    select 1 from pg_constraint
    where not (
        (conrelid =  0 and contypid <> 0) or
        (conrelid <> 0 and contypid =  0)
      )
  )::text as "conrelid and contypid both non-zero"; -->> false

My hypothesis (of which I've been unable to find a statement in the doc) is that this shows a reliable invariant. In other words, it seems that a constraint has an arc-FK relationship to its parent EITHER in "pg_class" OR in "pg_type". I think that you imply this with your "look in either pg_class or pg_type to find the schema". However, I don't feel this lookup to be painful. I anyway want a view that lists all the schema objects that implement my database app.

I've copied my definition of my "schema_objects" view at the end. It unions these (aliased) columns:

oid, name, schema_oid, owner_oid, kind

from "pg_class", "pg_type", and "pg_proc". (These, so far, are sufficient for my present purpose.) It also adds the column "namespace" to materialize which catalog table is the source:

pg_class → 'relations', pg_type → 'types', and pg_proc → 'subprograms'

Then it joins this union with "pg_namespace" and "pg_roles" to get the human-readable names for these.

The upshot is that my "schema_objects" view has a surrogate PK, "oid", and the unique business key "(name, schema, namespace)".

I don't know what better name to use for my "namespace" column. It's taken up by the name of the view for schemas (which should, in my book, be called "pg_schema"—in the singular or plural according to the usual coin-toss. My name captures the meaning and accords with the fact that I can have an object whose "name" is 'x' three times in "schema" called 's1' distinguished from each other by the fact that each has a different "namespace" thus: 'relations', 'types' and 'subprograms'.

I'm confused by these outcomes:

- when you create a composite type manually, you automatically get an entry with its name and schema BOTH in "pg_class" AND in "pg_type".

- when you create a table or a view manually, you again automatically get an entry with its name and schema BOTH in "pg_class" AND in "pg_type".

Where can I read about the mental model for this?

Anyway, with my "schema objects" in place, I can create a "constrants" view thus:

create view a.constraints(c_name, c_kind, c_expr, t_name, t_schema, t_namespace, t_kind) as
with c(conname, contype, expr, sch_obj_id) as (
  select
    conname,
    contype,
    pg_get_expr(conbin, conrelid),
    case contypid
      when 0 then conrelid
      else        contypid
    end
  from pg_constraint)
select
  c.conname,
  c.contype,
  c.expr,
  s.name,
  s.schema,
  s.namespace,
  s.kind
from
  a.schema_objects s
  inner join
  c
  on sch_obj_id = s.oid;

Then this query:

select c_name, c_kind, c_expr, t_name, t_schema, t_namespace, t_kind from a.constraints
where t_schema = any(array['s1', 's2', 's3'])
order by c_name, t_name, t_schema, t_namespace;

shows this result:

  c_name  | c_kind |     c_expr     | t_name | t_schema | t_namespace |     t_kind     
----------+--------+----------------+--------+----------+-------------+----------------
 d_chk    | c      | s2.fnc(VALUE)  | dom    | s3       | types       | domain
 itm_pkey | p      |                | itm    | s1       | relations   | ordinary-table
 t_chk    | c      | (v = lower(v)) | tab    | s1       | relations   | ordinary-table
 tab_pkey | p      |                | tab    | s1       | relations   | ordinary-table

for the test objects that I created. So (from the user P.o.V.) a constraint is uniquely identified by its name, its kind, and the identity of the schema object that it hangs off. This seems to me to be just what I want.

(2) The SQL standard describes "assertions", which are global check constraints that can affect multiple tables. We don't support those, and very possibly never will, but the pg_constraint catalog is set up to support them. Presumably they'd be stored with conrelid and contypid both zero, so there would be no other place to find out the assertion's schema.

Well yes, I wish that PG did support SQL assertions. I expect that you know that there's a vote to see how popular he idea would be for Oracle Database:


But I've no idea what might come of that.

I'm not entirely convinced that putting these two (or three) sorts of objects in the same catalog was a great design. However, that's what we've got and changing it seems like more trouble than it'd be worth.

I don't know what you mean by "in that same catalog". I thought that "catalog" was the prose name for all those tables in the "pg_catalog" schema of which we've focused on a few here: "pg_class", "pg_type", "pg_proc", "pg_namespace", "pg_roles", and "pg_constraint".

It would seem to me that a SQL assertion must be a (first class) schema object (i.e. with, as the user would see it, the unique business key "(name, schema, namespace)" and with its very own "owner"—and then all sorts of other properties.

I can't see how it could belong in "pg_constraint" because it doesn't "hang off" a single schema object (like existing kinds of constraint do) but, rather, spans any number of tables which (I'd hope) could have various owners. I'd assume there be a new "assertion on" privilege that would have to be granted to the owner of the to-be-created assertion on each of the tables that the new assertion would span.

So, to follow the pattern, there should be a new dedicated "pg_assertion" table. This would have no impact on what the existing "pg_constraint" would record because you couldn't have a constraint on an assertion.

--------------------------------------------------------------------------------

create view a.schema_objects(oid, name, schema, namespace, owner, kind) as
  with o(oid, name, schema_oid, namespace, owner_oid, kind) as
    (
      select
        oid,
        relname,
        relnamespace,
        'relations',
        relowner,
        case relkind
          when 'r' then 'ordinary-table'
          when 'v' then 'view'
          when 'i' then 'index'
          when 'S' then 'sequence'
          when 'c' then 'composite-type'
          else          'other'
        end
      from pg_class
    union all
      select
        oid,
        typname,
        typnamespace,
        'types',
        typowner,
        case typtype
          when 'b' then 'base type'
          when 'c' then 'composite-type'
          when 'd' then 'domain'
          when 'e' then 'enum'
          else          'other'
        end
      from pg_type
      -- 'b' (here) is automtically gerenerated 'base type'
      -- has no info. value for the present use-case
      where typtype <> 'b'
    union all
      select
        oid,
        proname,
        pronamespace,
        'subprograms',
        proowner,
        case prokind
          when 'f' then 'function'
          when 'p' then 'procedure'
          else          'other'
        end
      from pg_proc
    )
select
  o.oid,
  o.name,
  s.nspname,
  namespace,
  r.rolname,
  o.kind
from
  o
  inner join
  pg_namespace s
  on o.schema_oid = s.oid
  inner join pg_roles r
  on o.owner_oid = r.oid;

Re: Two questions about "pg_constraint"

От
Bryn Llewellyn
Дата:
adrian.klaver@aklaver.com wrote:


Question 1: why does "pg_constraint" have a "connamespace" column?

create table c1 (id integer, constraint pk1 primary key(id));

create table c2 (id integer, constraint pk1 primary key(id));
ERROR:  relation "pk1" already exists

create table test.c2 (id integer, constraint pk1 primary key(id));

select conname, connamespace from pg_constraint where conname = 'pk1';
conname | connamespace
---------+--------------
pk1     |         2200
pk1     |        59706

[From the doc] — conname name — Constraint name (not necessarily unique!) So connamespace makes it unique.

I'll assume that, in Adrian's example, bare "create table c1" puts it into the public schema. But it could be, equivalently, that the creating user has a search path with some other schema (but not the schema 'test’) in first-to-be-searched position.

The example implicitly brings indexes into the picture. The explanation of the "index_parameters" rule in the “create table” syntax starts off thus:

« index_parameters in UNIQUE, PRIMARY KEY, and EXCLUDE constraints are... »

and it goes on to mention some choices that you can make about the indexes that necessarily back the PK and unique constraints. (The notions apply at "alter table" time too.) However, it says nothing to indicate that the user can specify the name (or the schema) for this backing index. I can't find where the rule is stated for how the index is (as it must be) automatically named. But it seems, from experiment, that the name for the index that backs a PK or a unique constraint is simply the name that the user gave the constraint—or the generated name of the constraint if it wasn't named.

The schema-and-name combination (a.k.a. the qualified name) for an index must be unique in "pg_rel". So "create table c2 ... constraint pk1) fails because there already is a backing index called "pk1" in place in the schema in question. That's why the error message says « relation "pk1" already exists » and not « constraint "pk1" already exists ».

Here's a contrived example to emphasize this point:

create table s1.irrelevant_name_1(
  k int,
  v1 int,
  v2 int,
  constraint irrelevant_name_2 primary key(k));

create index c on s1.irrelevant_name_1(v1);

It seems odd that an index must be in the same schema as the table upon whose columns (or expressions) it is created. The "create index" account documents this restriction. An ad hoc test shows that the index must also have the same owner as its table. I can't find where this restriction is documented. I don't suppose that these restrictions pose a practical problem.

Now do this:

alter table s1.irrelevant_name_1 add
  constraint c unique(v2);

It fails with the error "relation "c" already exists" (and not "constraint "c" already exists"). And sure enough, there is no existing constraint called 'c'.

With me so far?

Now try this:

create table s1.irrelevant_name_1(
  k int primary key,
  v text,
  constraint c check(v = lower(v)));

create domain s1.irrelevant_name_2 as int[]
  constraint c check(cardinality(value) > 1);

No errors occur. B.t.w., the table and the domain must have different names because of the secondary effect that the name of a table causes a row BOTH in "pg_class" AND in "pg_type".

Now do Adrian's query:

select conname, connamespace from pg_constraint where conname = 'c';

This was my result. (Your oid values will vary.)

 conname | connamespace 
---------+--------------
 c       |      2114282
 c       |      2114282

We see two constraints with the same name "in" the same schema. I double quoted "in" because it's the wrong way to talk about this. Constraints (just like triggers) are not schema-objects. But the things that "pg_class", "pg_type", and "pg_proc" list _are_ schema-objects.

A schema-object's business unique key is its name, the name of the schema in which it's located, and the name of the catalog table where it's listed. (Elsewhere, I called this third fact the "namespace" because its the scope within which the name must be unique. But you might not like this use of "namespace".)

Anyway, "in" is used correctly when we say that a schema-object is in a schema. A schema-object also has (must have) an owner. But this isn't part of its business unique key.

In contrast, a constraint (and a trigger) are not in a schema. Nor do they have an owner. Rather, each hangs off a schema object (and must do this if it's to exist). The business unique key of each of a constraint and a trigger is its own name together with the business unique key of what it hangs off. You might argue that it inherits the schema and the owner of the schema-object off which it hangs. But I think that this is the wrong way to think about it. That's why I argue that "pg_constraint" should not have a "connamespace" column—just as it already doesn't have a "conowner" column.

The fact that "pg_constraint"  does have a "connamespace" column is an example of what I believe the argot of our trade calls a "transitive dependency" (the criterion which, added to the criteria for 2NF elevates it to 3NF). "\d pg constraint" says that the table has a unique constraint on "(conrelid, contypid, conname)". And (as I read the doc for "conrelid" and "contypid") is does imply the rule that I observed: either one is non zero and the other is zero or vice versa (and so not both zero and not both non-zero). So "(conrelid, contypid)" (i.e. just part of the unique key) does indeed translate to the unique identifier of the item (table or domain) that the constraint hangs off. When I know the value of (conrelid, contypid),  I know the oid of the schema that this schema object is in. Ergo, "connamespace" depends upon the partial key "(conrelid, contypid)"—i.e. this is a transitive dependency.

This takes me back to the query that I used as the background to pose my « why does "pg_constraint" have a "connamespace" column? » question. Here it is—in an extended form. And here it does indeed use "(conrelid, contypid)" to look up the schema oid in the appropriate one of "pg_class" or "pg_type".

It helps to establish the query, for re-use, as a temporary view.

create temporary view constraint_facts(
  conname,
  schema_object_oid,
  "catalog table",
  "schema_oid from pg_class/pg_type",
  same)
as
with
  c1(conname, schema_object_oid, c_tab, connamespace) as (
  select
    conname,
    case contypid
      when 0 then conrelid
      else        contypid
    end,
    case contypid
      when 0 then 'pg_class'
      else        'pg_type'
    end,
    connamespace
  from pg_constraint),

  c2(conname, schema_object_oid, c_tab, schema_oid, connamespace) as (
  select
    conname,
    schema_object_oid,
    c_tab,
    case c_tab
      when 'pg_class' then (select relnamespace  from pg_class where oid = schema_object_oid)
      when 'pg_type'  then (select typnamespace  from pg_type  where oid = schema_object_oid)
    end,
    connamespace
  from c1)

select
  conname,
  schema_object_oid,
  c_tab,
  schema_oid,
  (schema_oid = connamespace)
from c2;

Now use it to look at our two constraints called 'c':

select conname, schema_object_oid, "catalog table", "schema_oid from pg_class/pg_type", same::text
from constraint_facts
where conname = 'c'
order by conname, schema_object_oid, "catalog table";

This is the result:

 conname | schema_object_oid | catalog table | schema_oid from pg_class/pg_type | same 
---------+-------------------+---------------+----------------------------------+------
 c       |           2114297 | pg_class      |                          2114282 | true
 c       |           2114306 | pg_type       |                          2114282 | true

Now try it on all the entire contents of "pg_constraint"—one's own stuff together with the hundred or so rows that implement the PG system:

select exists(select 1 from constraint_facts where not same)::text;

The answer is "false". Please tell me if you can see a typo in my query, a fault in the reasoning the led to how I spelled it, or a flaw in my overall analysis. I realize, of course that the "false" result here merely shows that the hypothesis that the query tests is not disproved. I don't think that it can be proved. Rather, its truth, or otherwise, comes from the prose statement of the requirements and the design of the entire set of catalog tables—before any programming was done.

But it does seem to me that the "false" outcome is consistent with the common sense analysis that leads to the conclusion that "pg_constraint" is not in 3NF.

Two more things. First, the results from my "constrainst" view (that uses my "schema_objects" view). I showed these in this turn in the present thread:


select c_name, t_name, t_schema, t_namespace, t_kind, c_kind, c_expr from a.constraints
where c_name = 'c'
order by t_name, t_schema, t_namespace
;

This is the result:

 c_name |      t_name       | t_schema | t_namespace |     t_kind     | c_kind |          c_expr          
--------+-------------------+----------+-------------+----------------+--------+--------------------------
 c      | irrelevant_name_1 | s1       | relations   | ordinary-table | c      | (v = lower(v))
 c      | irrelevant_name_2 | s1       | types       | domain         | c      | (cardinality(VALUE) > 1)

The first four columns are each constraint's business unique key. And the last three are some of its properties or of what it hangs off.

And finally, a little test to confirm that the constraint kind is not part of a constraint's business unique key. Make sure that schema 's1' is empty when you try it.

create table s1.t(
  k int,
  v int,
  constraint c primary key(k),
  constraint c unique(v));

It fails because (as I'd say it) it attempts to create two constrains whose "(c_name, t_name, t_schema, t_namespace)" values are the same—or, briefly, « constraint "c" already exists ». In fact, the error message spells it with "relation" and not with "constraint". I think that this is an error.

Re: Two questions about "pg_constraint"

От
Christophe Pettus
Дата:

> On Aug 25, 2022, at 21:43, Bryn Llewellyn <bryn@yugabyte.com> wrote:
> [...]

I've read this a few times, and I am having trouble understanding what behavior you were expecting out of PostgreSQL,
andwhat behavior you received that you didn't think was correct.  If it is "pg_constraint has a column connamespace,
andthat appears to be a denormalization since a constraint is always in the same schema as the table it is owned by," I
believeTom explained the reason for that. 

If that's not what is concerning you, can you summarize it in a sentence two?




Re: Two questions about "pg_constraint"

От
Bryn Llewellyn
Дата:
xof@thebuild.com wrote:

bryn@yugabyte.com wrote:
[...]

I've read this a few times, and I am having trouble understanding what behavior you were expecting out of PostgreSQL, and what behavior you received that you didn't think was correct. If it is "pg_constraint has a column connamespace, and that appears to be a denormalization since a constraint is always in the same schema as the table it is owned by”, I believe Tom explained the reason for that. If that's not what is concerning you, can you summarize it in a sentence [or] two?

Thanks for your interest in this, Christophe. I’ll start again from scratch. I’m copying David because he said something relevant to the present discussion in an earlier separate thread.

But, first, a caveat. I write, here, as if I'm 100% confident of the correctness of everything I say. I've found that this is an effective way to provoke correction when I'm wrong. And I look forward to correction from you (all) here.

I’ll be as brief as I can. But I’m afraid that I cannot condense the problem statement beyond my points #1 and #2 below. And then I need a preamble before the problem statement and a discussion after it.

Notice that I don't think that there's any wrong behavior here. Rather, there's just that annoying transitive dependency and, worse for me, what appears to be some wrong ways to talk about stuff from the user's PoV where only business unique keys matter and the surrogate oid values vanish from the picture.

For example, Christophe used the phrase "a constraint is always in the same schema as..." (and didn't mention domains). And David said "the trigger owner is the context in which the trigger function is executed". Neither a constraint nor a trigger is "in" a schema or "has" an owner. In both cases, these are derived (a.k.a. transitive) properties.

Here, now, is my problem statement:

«
1. The "pg_constraint" column, "connamespace", is a transitive dependency. Therefore, tautologically, it need not be there. (So I’m not convinced by what Tom said here.) Given that "pg_constraint" does have its "connamespace" column, it may just as well have a "conowner"column as a second transitive dependency. On the other hand, "pg_trigger" has neither a "tgnamespace" column nor a "tgowner" column. But, if you (all) think that transitive dependencies are a nice usability feature, and you (all) can guarantee that the derived values are always in step with their source, then I suppose that no harm is done—beyond possibly leading the novice to assume a faulty mental. Whatever is decided, the documentation should make it clear.

2. The larger point is that I want to draw a distinction between a (primary) schema object, like a table, a domain, or a function and a secondary object like a constraint or a trigger. The business unique key of a schema object is its schema-qualified name together with the catalog table ("pg_class", "pg_type", "pg_proc",...) that defines the uniqueness scope for its name. A schema object also has an owner—but its name is not part of its business unique key. In contrast, the business unique key of a secondary object is its own name together with the business unique key of the object off which it hangs. And a secondary object cannot exist without hanging off a schema object.
»

I have to use more words now to illustrate, and dramatize, my point. First, run my demo. Just copy-and-paste the code at the end and paste it into a ".sql" script. Then execute it in psql. You can run it time and again. It finishes by producing this output:

 conname | connamespace |  n   
---------+--------------+------
 c       |      2127201 | 2000

 tgname |  n   
--------+------
 t      | 1000

So I've created 2K constraints, all called 'c', and all "in" (as Christophe and Adrian would have it) the same schema. And I've created 1K triggers, all called 't', and all "having" (as David would have it) the same owner.

Look at the definitions of the temporary views whose results sets I showed above. Each constraint has a unique value for the tuple "(conname, conrelid, contypid)"—as it must have because my "constrains" view selects from just the single table "pg_constraint" and this places a uniqueness requirement on this tuple. It's only to be expected that I can have any number of constraints, all with the same "conname" and "connamespace" because the definition of "pg_constraint" places no uniqueness requirement on either of these, neither separately nor jointly.

Similarly, each trigger has a unique value for the tuple "(tgname, tgrelid)"—as it must have because my "triggers" view selects from just the single table "pg_trigger" and this places a uniqueness requirement on this tuple. It's only to be expected that I can have any number of constraints, all with the same "tgname" because the definition of "pg_trigger" places no uniqueness requirement on this.

With a bit more typing, I could materialize both a schema and an owner for each constraint and each trigger. (I showed the SQL for doing this for a constraint earlier in this thread. And I've done it for myself for triggers.) The situation is a little harder for constraints than for triggers because a constraint can hang off either a table or a domain while a DML trigger can hang off just a table. This leads to the arc-FK (implied, if not declared) from "pg_constraint" either to "pg_class" or to "pg_type". The documentation for "pg_constraint" on the topic:

«
conrelid (oid references pg_class.oid): The table this constraint is on; 0 if not a table constraint
contypid (oid references pg_type.oid): The domain this constraint is on; 0 if not a domain constraint
»

together with the requirement that "(conname, conrelid, contypid)" is unique (and some empirical testing) makes it clear. Thanks, Tom, for asking me to think about this. It implies more elaborate SQL to get the facts for the schema object that a constraint hangs off than it does to get those facts for a trigger. But it's not (as Tom claimed it would be) daunting. I expect that my SQL could be improved. But it wasn't hard to write what I did.

Anyway... the structure of the "pg_constraint" and "pg_trigger" tables, their unique indexes, and where their implied oid-FKs point is fully consistent with my mental model, thus:

- A schema object is uniquely identified by its qualified name and its namespace ('relations', types', 'subprograms',...).

- A constraint (as a secondary object) is uniquely identified by its own name and the identity of the schema object off which it hangs

- A trigger, too, (as a secondary object) is also uniquely identified by its own name and the identity of the schema object off which it hangs

- Because a schema object lives in a schema and has an owner, you might like to think that a constraint, or a trigger, each lives in a schema and has an owner. But I think that it's both unnecessary and sometimes unhelpful to elide the transitive step here. (This elision led Adrian to say that a constraint name must be unique within a schema. And it led David to say that the owner of a trigger plays a part in determining the "current_role" that a "security invoker" trigger function sees when the trigger fires.)

--------------------------------------------------------------------------------
-- Boring code to guarantee a fresh start and error-free re-runs.

\c postgres postgres
set client_min_messages = warning;
drop database if exists bryn;
create database bryn owner postgres;
revoke all on database bryn from public;

\c bryn postgres
set client_min_messages = warning;

do $body$
begin
  begin
    drop owned by u1 cascade;
  exception
    when undefined_object then null;
  end;
  drop schema public;
  create schema s1 authorization postgres;
  revoke usage on schema s1 from public;
  drop role if exists u1;
  create role u1 login password 'p';
  grant connect on database bryn to u1;
  grant create on database bryn to u1;
  grant temporary on database bryn to u1;
  grant usage on schema s1 to u1;
  grant create on schema s1 to u1;
  alter user u1 set search_path = pg_catalog, pg_temp;
end;
$body$;
--------------------------------------------------------------------------------
\c bryn u1
/*
  The interesting code starts here. It simply creates lots of tables and
  domains, all in the single schema 's1'. Each table has a trigger called 't'.
  And both each table and each domain has a constraint called 'c'.
*/;

create function s1.f()
  returns trigger
  security definer
  language plpgsql
as $body$
begin
  return old;
end;
$body$;

do $body$
declare
  no_of_iterations constant int not null := 1000;

  cr_table constant text not null := '
    create table s1.%I(
      k int primary key,
      v text,
      constraint c check(v = lower(v)))';

  cr_trigger constant text not null := '
    create trigger t
      after delete
      on s1.%I
      for each statement
      execute function s1.f()';

  cr_domain constant text not null := '
    create domain s1.%I as int[]
      constraint c check(cardinality(value) > 1)';
begin
  for j in 1..no_of_iterations loop
    execute format(cr_table,   'a'||j::text);
    execute format(cr_trigger, 'a'||j::text);
    execute format(cr_domain,  'b'||j::text);
  end loop;
end;
$body$;

create temporary view constraints as
select conname, conrelid, contypid, connamespace
from pg_constraint where conname = 'c';

create temporary view triggers as
select tgname, tgrelid
from pg_trigger where tgname = 't';

select conname, connamespace, count(*) as n
from constraints
group by conname, connamespace;

select tgname, count(*) as n
from triggers
group by tgname;

Re: Two questions about "pg_constraint"

От
Christophe Pettus
Дата:

> On Aug 26, 2022, at 15:33, Bryn Llewellyn <bryn@yugabyte.com> wrote:
> [...]

I'm still not clear on what you are proposing.  Are you proposing a change to PostgreSQL to remove the "connamespace"
columnfrom the "pg_constraint" table, since it can be derived from other tables? 


Re: Two questions about "pg_constraint"

От
Bryn Llewellyn
Дата:
xof@thebuild.com wrote:

bryn@yugabyte.com wrote:
[...]

I'm still not clear on what you are proposing. Are you proposing a change to PostgreSQL to remove the "connamespace" column from the "pg_constraint" table, since it can be derived from other tables?

No, I’m not proposing any code change. I'm simply seeking the proper understanding in this space. This started off with what I thought were two simple questions.

One was about the "pg_constrain.consrc" column that vanished at some release boundary between PG Version 11, where it was present, and PG 14, where it wasn't. Adrian answered that promptly. And I learned from his reply, and what it referred to, that the Version 11 implementation was faulty and that a robust alternative way to get the answer that the "consrc" column aimed to give was available. I learned, too, a little bit more about how to navigate the PG docs. So "case closed" (and closed very quickly) there.

My other question was about the "connamespace" column. It seemed to me, both at first and still now, that this is a clear instance of a transitive dependency. And I tried to ask only if I'd understood it right. A quick, straightforward, reply, thus:

« Yes, "connamespace" does indeed hold a derived value and could be removed. However, such a change would have only a tiny practical benefit. And the effort to remove it would not be small. Therefore it will simply live on. »

would have been perfect. I would have learned that my analysis was correct. And, if I saw no reason to reference the redundant column, then I simply wouldn't. In other words, it would've been "case closed" there too.

However, my question was met by an assertion that the column was needed.

Adrian showed an experiment that was meant to show that, in the "pg_constraint" table, the "(conname, connamespace)" tuple is a unique key. But you can see that this isn't the case from `\d` on that table, the docs, and from the experiment that I showed.

Tom said that "conname" was necessary because you can have a constraint on each of a table and a domain. But `\d`, the docs, and another experiment that I showed demonstrate that this isn't the case either.

That's why this thread has gone from turn to turn.

Nobody has yet agreed that "connamespace" is indeed a transitive dependency. You, Christophe, _seem_ to agree because you did say "it can be derived from other tables". But that's not quite like saying, simply and directly, that my analysis is sound.

This leaves me, then, in doubt. I fear that I must be missing something. And this fear has now become my main concern.

See what I mean, now?

Re: Two questions about "pg_constraint"

От
Christophe Pettus
Дата:

> On Aug 26, 2022, at 18:47, Bryn Llewellyn <bryn@yugabyte.com> wrote:
> No, I’m not proposing any code change.

Thanks for clarifying.



Re: Two questions about "pg_constraint"

От
Tom Lane
Дата:
Bryn Llewellyn <bryn@yugabyte.com> writes:
> My other question was about the "connamespace" column. It seemed to me,
> both at first and still now, that this is a clear instance of a
> transitive dependency.

I think a more productive way to think about it is that it's
denormalization for efficiency; specifically to let constraints
be looked up by name+namespace without having to get other
catalogs involved.  (SET CONSTRAINTS is one thing that requires
that, and I think there are others.)

            regards, tom lane



Re: Two questions about "pg_constraint"

От
Bryn Llewellyn
Дата:
That's the answer I was seeking. So it's case closed for both of my « Two questions about "pg_constraint" ». Thanks, Tom. And thanks to the others who've contributed to this thread.



My other question was about the "connamespace" column. It seemed to me, both at first and still now, that this is a clear instance of a transitive dependency.

I think a more productive way to think about it is that it's denormalization for efficiency; specifically to let constraints be looked up by name+namespace without having to get other catalogs involved.  (SET CONSTRAINTS is one thing that requires that, and I think there are others.)