Обсуждение: domain for WITHOUT OVERLAPS

Поиск
Список
Период
Сортировка

domain for WITHOUT OVERLAPS

От
jian he
Дата:
hi.

While working on domain IS JSON, I found out that
WITHOUT OVERLAPS does not support for domain too.
but it does support user-defined range types (via CREATE TYPE).

after looking around:
check_exclusion_or_unique_constraint->ExecWithoutOverlapsNotEmpty
ExecWithoutOverlapsNotEmpty typtype should be domain's basetype's typtype
otherwise it will fallback to:
            elog(ERROR, "WITHOUT OVERLAPS column \"%s\" is not a range
or multirange",
                 NameStr(attname));

That means we need to cheaply get the domain basetype's
pg_type.typtype in lookup_type_cache.
so I added a new char field: TypeCacheEntry.domainBaseTyptype.

it should be working:
For primary key, unique constraint: pg_index.indclass is
the same for domain and basetype.
no need to worry too much about pg_index.indcollation because range type is not
collateable.
For foreign key: pg_constraint (conpfeqop, conppeqop, conffeqop) are the same
for domain and basetype.


--
jian
https://www.enterprisedb.com/

Вложения

Re: domain for WITHOUT OVERLAPS

От
Paul A Jungwirth
Дата:
On Tue, Dec 2, 2025 at 11:39 PM jian he <jian.universality@gmail.com> wrote:
>
> While working on domain IS JSON, I found out that
> WITHOUT OVERLAPS does not support for domain too.
> but it does support user-defined range types (via CREATE TYPE).
>
> after looking around:
> check_exclusion_or_unique_constraint->ExecWithoutOverlapsNotEmpty
> ExecWithoutOverlapsNotEmpty typtype should be domain's basetype's typtype
> otherwise it will fallback to:
>             elog(ERROR, "WITHOUT OVERLAPS column \"%s\" is not a range
> or multirange",
>                  NameStr(attname));
>
> That means we need to cheaply get the domain basetype's
> pg_type.typtype in lookup_type_cache.
> so I added a new char field: TypeCacheEntry.domainBaseTyptype.

Thanks for the bug report and fix! Have you created a commitfest entry
for this? I didn't find one.

+CREATE DOMAIN d_textrange1 AS int4range CHECK (VALUE <> '[10,10]');
+CREATE DOMAIN d_textrange2 AS textrange2 CHECK (VALUE <> '[c,c]');
+CREATE DOMAIN d_textrange2c AS d_textrange2;

The first domain should be called d_int4range or d_int4range1, right?

Also let's name them like int4range_d so that we can use int4_d_range
for a range over a domain.

Please use closed/open notation to match the rest of the ranges in the
file: '[10,11)' and '[c,d)'.

I think there are these cases to consider:

- WITHOUT OVERLAPS on a rangetype whose subtype has a domain
- WITHOUT OVERLAPS on a rangetype with a domain on itself
- WITHOUT OVERLAPS on a multirangetype whose subtype has a domain
- WITHOUT OVERLAPS on a multirangetype whose rangetype has a domain
- WITHOUT OVERLAPS on a multirangetype with a domain on itself

For instance we could set things up like so:

-- range+multirange over a domain:
create domain int4_d as integer check (value <> 10);
create type int4_d_range as range (subtype = int4_d);

-- domain on a range:
create domain int4range_d as int4range check (value <> '[10,11)');

-- domain on a multirange:
create domain int4multirange_d as int4multirange check (value <> '{[10,11)}');

Then we have this:

```
[v19devel:5432][426675] postgres=# select oid, typname, typtype,
typbasetype, typtypmod from pg_type where oid >= 20000 order by oid ;
  oid  |      typname       | typtype | typbasetype | typtypmod
-------+--------------------+---------+-------------+-----------
 24595 | _int4_d            | b       |           0 |        -1
 24596 | int4_d             | d       |          23 |        -1
 24598 | _int4_d_range      | b       |           0 |        -1
 24599 | int4_d_multirange  | m       |           0 |        -1
 24600 | _int4_d_multirange | b       |           0 |        -1
 24601 | int4_d_range       | r       |           0 |        -1
 24608 | _int4range_d       | b       |           0 |        -1
 24609 | int4range_d        | d       |        3904 |        -1
 24611 | _int4multirange_d  | b       |           0 |        -1
 24612 | int4multirange_d   | d       |        4451 |        -1
```

Note that creating a domain on a range does not create a corresponding
multirange. Maybe that is a bug. It means we can't test the 4th case
above.

It looks like domains on the subtype work, but not directly on a range
or multirange:

```
[v19devel:5432][426675] postgres=# create table t1 (id int4range,
valid_at int4_d_range, primary key (id, valid_at without overlaps));
CREATE TABLE
[v19devel:5432][426675] postgres=# create table t2 (id int4range,
valid_at int4range_d, primary key (id, valid_at without overlaps));
ERROR:  column "valid_at" in WITHOUT OVERLAPS is not a range or multirange type
LINE 1: ...ate table t2 (id int4range, valid_at int4range_d, primary ke...
                                                             ^
[v19devel:5432][426675] postgres=# create table t3 (id int4range,
valid_at int4_d_multirange, primary key (id, valid_at without
overlaps));
CREATE TABLE
[v19devel:5432][426675] postgres=# create table t4 (id int4range,
valid_at int4_multirange_d, primary key (id, valid_at without
overlaps));
ERROR:  type "int4_multirange_d" does not exist
LINE 1: create table t4 (id int4range, valid_at int4_multirange_d, p...
```

It would be good to have tests for all of those. For instance:

```
-- domain on a multirange:

CREATE DOMAIN textmultirange2_d AS textmultirange2 CHECK (VALUE <> '{[c,d)}');
CREATE TABLE temporal_mltrng4 (
  id d_int4range1,
  valid_at textmultirange2_d,
  CONSTRAINT temporal_rng4_pk PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
);

INSERT INTO temporal_mltrng4 VALUES ('[1,2)', '{[c,d)}'); --domain
constraint violation
INSERT INTO temporal_mltrng4 VALUES ('[1,2)', '{[a,g)}');
INSERT INTO temporal_mltrng4 VALUES ('[1,2)', '{[b,c)}'); --error
INSERT INTO temporal_mltrng4 VALUES ('[2,3)', '{[B,C)}'), ('[2,2]',
'{[A,C)}'); --error

CREATE TABLE temporal_mltrngfk (parent_id d_int4range1, id int4range,
valid_at textmultirange2_d);
ALTER TABLE temporal_mltrngfk
  ADD CONSTRAINT temporal_mltrngfk_fk
    FOREIGN KEY (parent_id, PERIOD valid_at)
    REFERENCES temporal_mltrng4;

INSERT INTO temporal_mltrngfk VALUES ('[1,2)', '[2,3)', '{[d,e)}');
TABLE temporal_mltrng4;
TABLE temporal_mltrngfk;

UPDATE temporal_mltrng4 SET valid_at = '{[c,d)}'; --error
UPDATE temporal_mltrng4 SET valid_at = '{[a,h)}';
UPDATE temporal_mltrng4 SET valid_at = '{[f,g)}'; --error

DROP TABLE temporal_mltrng4, temporal_mltrngfk;
```

When I try that, it looks like your patch fixes multiranges too.

Yours,

--
Paul              ~{:-)
pj@illuminatedcomputing.com



Re: domain for WITHOUT OVERLAPS

От
jian he
Дата:
On Thu, Dec 4, 2025 at 2:39 AM Paul A Jungwirth
<pj@illuminatedcomputing.com> wrote:
>
> I think there are these cases to consider:
>
> - WITHOUT OVERLAPS on a rangetype whose subtype has a domain
> - WITHOUT OVERLAPS on a rangetype with a domain on itself
> - WITHOUT OVERLAPS on a multirangetype whose subtype has a domain
> - WITHOUT OVERLAPS on a multirangetype whose rangetype has a domain
> - WITHOUT OVERLAPS on a multirangetype with a domain on itself
>
hi.

please check the attached polished test. (no C code change).

I think the test I am missing is
> - WITHOUT OVERLAPS on a multirangetype whose rangetype has a domain
because it expects CREATE DOMAIN also to create a multirange type for rangetype.
but currently we didn't do it.


--
jian
https://www.enterprisedb.com

Вложения

Re: domain for WITHOUT OVERLAPS

От
Paul A Jungwirth
Дата:
On Wed, Dec 10, 2025 at 8:23 PM jian he <jian.universality@gmail.com> wrote:
> polished tests for
> > - WITHOUT OVERLAPS on a rangetype whose subtype has a domain
> > - WITHOUT OVERLAPS on a rangetype with a domain on itself
> > - WITHOUT OVERLAPS on a multirangetype whose subtype has a domain
> > - WITHOUT OVERLAPS on a multirangetype whose rangetype has a domain
> > - WITHOUT OVERLAPS on a multirangetype with a domain on itself
>
> no tests for
> > - WITHOUT OVERLAPS on a multirangetype whose rangetype has a domain
> now.
>
> because the expected behavior assumes that CREATE DOMAIN also generates a
> corresponding multirange type for the domain range type, which is not true.
>
> --example, we expect create domain also create a multirange type for xxint4
> CREATE DOMAIN xxint4 AS int4range CHECK (VALUE <> '[10,11)');
> SELECT typname FROM pg_type WHERE typname ~* 'xxint4';

I'm not sure whether creating a domain on a range should create a
multirange or not. I asked the list on another thread. Since it does
not create a new range constructor either (or an entry in pg_range),
maybe not creating a new multirange is correct. Anyway I think we can
fix the WITHOUT OVERLAPS issue without dealing with that.


> diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
> index 6a347698edf..e27ca1466f2 100644
> --- a/src/backend/utils/cache/typcache.c
> +++ b/src/backend/utils/cache/typcache.c
> @@ -944,6 +944,8 @@ lookup_type_cache(Oid type_id, int flags)
>          typentry->domainBaseTypmod = -1;
>          typentry->domainBaseType =
>              getBaseTypeAndTypmod(type_id, &typentry->domainBaseTypmod);
> +        typentry->domainBaseTyptype =
> +            get_typtype(typentry->domainBaseType);
>      }
>      if ((flags & TYPECACHE_DOMAIN_CONSTR_INFO) &&
>          (typentry->flags & TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS) == 0 &&

Is there any performance concern about adding this lookup?
From what I can tell we don't use TYPECACHE_DOMAIN_BASE_INFO very
often, so it is probably okay.
We can get here in the executor from ExecEvalWholeRowVar, but that
seems acceptable to me.


> +--
> +-- tests for range over domain, multirange over a domain, custom range type over
> +-- domain.
> +--
> +CREATE DOMAIN int4_d as integer check (value <> 10);
> +CREATE TYPE int4_d_range as range (subtype = int4_d);
> +CREATE DOMAIN int4multirange_d as int4multirange check (value <> '{[10,11)}');
> +CREATE DOMAIN d_int4range1 AS int4range CHECK (VALUE <> '[10,11)');
> +CREATE DOMAIN d_textrange2 AS textrange2 CHECK (VALUE <> '[c,d)');
> +CREATE DOMAIN d_textrange2c AS d_textrange2;
> +CREATE DOMAIN textmultirange2_d AS textmultirange2 CHECK (VALUE <> '{[c,d)}');

The comment doesn't seem to match. It mentions three scenarios, but I'm seeing:

- int4_d_range is a range over a domain.
- int4multirange_d is a domain over a multirange.
- d_int4range1 is a domain over a range.
- d_textrange2 is a domain over a custom range.
- d_textrange2c is a domain over a domain, with no extra constraint.
What is this one for?
- textmultirange2_d is a domain over a custom multirange.

I think the naming is confusing. Sometimes you prepend "d_" and
sometimes you append "_d".
Consistency helps here because I can infer what the name means without
having to look it up.
What do you think of always appending "_d"?

Why "d_int4range1"? I don't see a "d_int4range2".


> +CREATE TABLE temporal_rng4mrng (
> +  id  d_int4range1,
> +  valid_at int4multirange_d,
> +  CONSTRAINT temporal_rng4_pk PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
> +);
> +INSERT INTO temporal_rng4mrng VALUES ('[1,2)', '{[10,11)}'); --error
> +ERROR:  value for domain int4multirange_d violates check constraint "int4multirange_d_check"
> +INSERT INTO temporal_rng4mrng VALUES ('[1,2)', '{[10,13)}'), ('[1,2)', '{[2,13)}'); --error
> +ERROR:  conflicting key value violates exclusion constraint "temporal_rng4_pk"
> +DETAIL:  Key (id, valid_at)=([1,2), {[2,13)}) conflicts with existing key (id, valid_at)=([1,2), {[10,13)}).
> +DROP TABLE temporal_rng4mrng;

Okay, so we test a domain over a multirange here.

I don't understand the table name temporal_rng4mrng. It might be a
typo for temporal_mltrng4?


> +CREATE TABLE temporal_rng4 (
> +  id  d_int4range1,
> +  id1 int4_d_range,
> +  valid_at d_textrange2c,
> +  CONSTRAINT temporal_rng4_pk PRIMARY KEY (id, id1, valid_at WITHOUT OVERLAPS)
> +);
> +INSERT INTO temporal_rng4 VALUES ('[1,2)', '[1,2)', '[c,d)'); --error
> +ERROR:  value for domain d_textrange2c violates check constraint "d_textrange2_check"
> +INSERT INTO temporal_rng4 VALUES ('[1,2)', '[10,12)', '[a,g)'); --error
> +ERROR:  value for domain int4_d violates check constraint "int4_d_check"
> +LINE 1: INSERT INTO temporal_rng4 VALUES ('[1,2)', '[10,12)', '[a,g)...
> +                                                   ^

Here we test a domain over a custom rangetype (d_textrange2c).
Does using domains for the non-WITHOUT OVERLAPS parts add anything to this test?
Or should we have separate tests where those are in the WITHOUT
OVERLAPS position?

We get a line number for the second failure above but not the first.
Is that something we can fix?


> +CREATE TABLE temporal_rngfk (
> +  parent_id d_int4range1,
> +  id int4range,
> +  id1 int4_d_range,
> +  valid_at d_textrange2);
> +ALTER TABLE temporal_rngfk
> +  ADD CONSTRAINT temporal_rngfk_fk
> +    FOREIGN KEY (parent_id, id1, PERIOD valid_at)
> +    REFERENCES temporal_rng4;
> +INSERT INTO temporal_rngfk VALUES ('[1,2)', '[2,3)', '[9,11)', '[d,e)');
> +TABLE temporal_rng4;
> +  id   |  id1   | valid_at
> +-------+--------+----------
> + [1,2) | [9,11) | [a,g)
> +(1 row)
> +
> +TABLE temporal_rngfk;
> + parent_id |  id   |  id1   | valid_at
> +-----------+-------+--------+----------
> + [1,2)     | [2,3) | [9,11) | [d,e)
> +(1 row)
> +
> +UPDATE temporal_rng4 SET valid_at = '[c,d)'; --error
> +ERROR:  value for domain d_textrange2c violates check constraint "d_textrange2_check"
> +UPDATE temporal_rng4 SET valid_at = '[a,h)';
> +UPDATE temporal_rng4 SET valid_at = '[f,g)'; --error
> +ERROR:  update or delete on table "temporal_rng4" violates foreign key constraint "temporal_rngfk_fk" on table
"temporal_rngfk"
> +DETAIL:  Key (id, id1, valid_at)=([1,2), [9,11), [a,h)) is still referenced from table "temporal_rngfk".
> +DROP TABLE temporal_rng4, temporal_rngfk;

Here we test foreign keys with a domain over a custom rangetype.
Again I'm not sure it adds anything to add int4_d_range.

What about a domain referencing a non-domain and a non-domain
referencing a domain (in the WITHOUT OVERLAPS position)?
We allow that for regular FK parts, so we should allow it for the
WITHOUT OVERLAPS part as well.


> +-- domain on a multirange
> +CREATE TABLE temporal_mltrng4 (
> +  id d_int4range1,
> +  valid_at textmultirange2_d,
> +  CONSTRAINT temporal_rng4_pk PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
> +);
> +INSERT INTO temporal_mltrng4 VALUES ('[1,2)', '{[c,d)}'); --error
> +ERROR:  value for domain textmultirange2_d violates check constraint "textmultirange2_d_check"
> +INSERT INTO temporal_mltrng4 VALUES ('[1,2)', '{[a,g)}');
> +INSERT INTO temporal_mltrng4 VALUES ('[1,2)', '{[b,c)}'); --error
> +ERROR:  conflicting key value violates exclusion constraint "temporal_rng4_pk"
> +DETAIL:  Key (id, valid_at)=([1,2), {[b,c)}) conflicts with existing key (id, valid_at)=([1,2), {[a,g)}).
> +INSERT INTO temporal_mltrng4 VALUES ('[2,3)', '{[B,C)}'), ('[2,3)', '{[A,C)}'); --error
> +ERROR:  conflicting key value violates exclusion constraint "temporal_rng4_pk"
> +DETAIL:  Key (id, valid_at)=([2,3), {[A,C)}) conflicts with existing key (id, valid_at)=([2,3), {[B,C)}).
> +CREATE TABLE temporal_mltrngfk (parent_id d_int4range1, id int4range,
> +valid_at textmultirange2_d);
> +ALTER TABLE temporal_mltrngfk
> +  ADD CONSTRAINT temporal_mltrngfk_fk
> +    FOREIGN KEY (parent_id, PERIOD valid_at)
> +    REFERENCES temporal_mltrng4;
> +INSERT INTO temporal_mltrngfk VALUES ('[1,2)', '[2,3)', '{[d,e)}');
> +TABLE temporal_mltrng4;
> +  id   | valid_at
> +-------+----------
> + [1,2) | {[a,g)}
> +(1 row)
> +
> +TABLE temporal_mltrngfk;
> + parent_id |  id   | valid_at
> +-----------+-------+----------
> + [1,2)     | [2,3) | {[d,e)}
> +(1 row)
> +
> +UPDATE temporal_mltrng4 SET valid_at = '{[c,d)}'; --error
> +ERROR:  value for domain textmultirange2_d violates check constraint "textmultirange2_d_check"
> +UPDATE temporal_mltrng4 SET valid_at = '{[a,h)}';
> +UPDATE temporal_mltrng4 SET valid_at = '{[f,g)}'; --error
> +ERROR:  update or delete on table "temporal_mltrng4" violates foreign key constraint "temporal_mltrngfk_fk" on table
"temporal_mltrngfk"
> +DETAIL:  Key (id, valid_at)=([1,2), {[a,h)}) is still referenced from table "temporal_mltrngfk".
> +DROP TABLE temporal_mltrng4, temporal_mltrngfk;

This is similar to the test above using table temporal_rng4mrng.
The only difference is that the multirange uses a custom range type.
That's great, but maybe the comment can express that, and the two test
groups could be adjacent?


> @@ -309,6 +419,27 @@ CREATE TABLE temporal_rng3 (
>  );
>  ALTER TABLE temporal_rng3 DROP CONSTRAINT temporal_rng3_uq;
>  DROP TABLE temporal_rng3;
> +CREATE TABLE temporal_rng4 (
> +  id d_int4range1,
> +  valid_at d_textrange2c,
> +  CONSTRAINT temporal_rng4_pk UNIQUE (id, valid_at WITHOUT OVERLAPS)
> +);
> +INSERT INTO temporal_rng4 VALUES ('[1,2)', NULL), (NULL, '[1,2)');
> +INSERT INTO temporal_rng4 VALUES ('[1,2)', '[c,d)'); --error
> +ERROR:  value for domain d_textrange2c violates check constraint "d_textrange2_check"
> +INSERT INTO temporal_rng4 VALUES ('[1,2)', '[a,d)');
> +INSERT INTO temporal_rng4 VALUES ('[1,2)', '[b,c)'); --error
> +ERROR:  conflicting key value violates exclusion constraint "temporal_rng4_pk"
> +DETAIL:  Key (id, valid_at)=([1,2), [b,c)) conflicts with existing key (id, valid_at)=([1,2), [a,d)).
> +INSERT INTO temporal_rng4 VALUES ('[10,11)', NULL); --error
> +ERROR:  value for domain d_int4range1 violates check constraint "d_int4range1_check"
> +INSERT INTO temporal_rng4 VALUES ('[2,3)', '[B,C)'), ('[2,3)', '[A,C)'); --error
> +ERROR:  conflicting key value violates exclusion constraint "temporal_rng4_pk"
> +DETAIL:  Key (id, valid_at)=([2,3), [A,C)) conflicts with existing key (id, valid_at)=([2,3), [B,C)).
> +DROP TABLE temporal_rng4;
> +DROP TYPE int4_d_range;
> +DROP DOMAIN d_int4range1, d_textrange2c, d_textrange2,
> +            textmultirange2_d, int4multirange_d, int4_d;
>  DROP TYPE textrange2;
>  --
>  -- test ALTER TABLE ADD CONSTRAINT

Here we test a domain over a custom rangetype. Great! But again a
comment explaining the intent would be good.

The order of the new test groups seems odd to me. I would have
expected to go from simple to complex, e.g. domain over range, domain
over custom range, domain over multirange, domain over custom
multirange.

Also testing a range over a domain (in WITHOUT OVERLAPS position) would be good.

Yours,

--
Paul              ~{:-)
pj@illuminatedcomputing.com



Re: domain for WITHOUT OVERLAPS

От
Paul A Jungwirth
Дата:
On Mon, Dec 22, 2025 at 7:25 PM Paul A Jungwirth
<pj@illuminatedcomputing.com> wrote:
>
> On Wed, Dec 10, 2025 at 8:23 PM jian he <jian.universality@gmail.com> wrote:
> > +-- tests for range over domain, multirange over a domain, custom range type over
> > +-- domain.
> > +--
> > +CREATE DOMAIN int4_d as integer check (value <> 10);
> > +CREATE TYPE int4_d_range as range (subtype = int4_d);
> > +CREATE DOMAIN int4multirange_d as int4multirange check (value <> '{[10,11)}');
> > +CREATE DOMAIN d_int4range1 AS int4range CHECK (VALUE <> '[10,11)');
> > +CREATE DOMAIN d_textrange2 AS textrange2 CHECK (VALUE <> '[c,d)');
> > +CREATE DOMAIN d_textrange2c AS d_textrange2;
> > +CREATE DOMAIN textmultirange2_d AS textmultirange2 CHECK (VALUE <> '{[c,d)}');
>
> The comment doesn't seem to match. It mentions three scenarios, but I'm seeing:
>
> - int4_d_range is a range over a domain.
> - int4multirange_d is a domain over a multirange.
> - d_int4range1 is a domain over a range.
> - d_textrange2 is a domain over a custom range.
> - d_textrange2c is a domain over a domain, with no extra constraint.
> What is this one for?
> - textmultirange2_d is a domain over a custom multirange.

Another variation I'm not seeing tested yet is a multirange over a
domain. In other words what ranges call the "subtype" is a domain.
Such a multirange is already created by these lines:

> > +CREATE DOMAIN int4_d as integer check (value <> 10);
> > +CREATE TYPE int4_d_range as range (subtype = int4_d);

For example we can use it:

[v19devel:5432][454113] postgres=# select '{[1,2)}'::int4_d_multirange;
 int4_d_multirange
-------------------
 {[1,2)}
(1 row)

[v19devel:5432][454113] postgres=# select '{[10,12)}'::int4_d_multirange;
ERROR:  value for domain int4_d violates check constraint "int4_d_check"
LINE 1: select '{[10,12)}'::int4_d_multirange;
               ^

So it seems worthwhile to test WITHOUT OVERLAPS with that variation as well.

Yours,

--
Paul              ~{:-)
pj@illuminatedcomputing.com