Обсуждение: Unexpected modification of check constraint definition

Поиск
Список
Период
Сортировка

Unexpected modification of check constraint definition

От
Stuart Campbell
Дата:
Hi there,

I noticed that check constraint definitions are sometimes rewritten/normalized on input, and I was hoping to understand that a little better.

For instance, if I create this table with a check constraint:

create table foo (
  val varchar,
  constraint val_valid check (val in ('a','b','c'))
);

and then dump the schema with pg_dump, it looks more like this:

CREATE TABLE public.foo (
    val character varying,
    CONSTRAINT val_valid CHECK (((val)::text = ANY ((ARRAY['a'::character varying, 'b'::character varying, 'c'::character varying])::text[])))
);

However, if I then recreate the schema from that dump, and then dump with pg_dump a second time, it ends up different again:

CREATE TABLE public.foo (
    val character varying,
    CONSTRAINT val_valid CHECK (((val)::text = ANY (ARRAY[('a'::character varying)::text, ('b'::character varying)::text, ('c'::character varying)::text])))
);

I'm working in a Ruby on Rails application where the schema is periodically dumped to a structure.sql file on disk. So, it would be convenient if the constraint definition was "stable" (otherwise, there's unnecessary noise in our version control history)

Is it expected that the second form is rewritten into the third form? It seems a bit odd to see all the type casting going on, but maybe there is a good reason for that. (Maybe this is an issue with using varchar instead of text?)

Regards,
Stuart

This communication and any attachments may contain confidential information and are intended to be viewed only by the intended recipients. If you have received this message in error, please notify the sender immediately by replying to the original message and then delete all copies of the email from your systems.


Re: Unexpected modification of check constraint definition

От
Laurenz Albe
Дата:
On Wed, 2026-01-07 at 21:32 +1100, Stuart Campbell wrote:
> I noticed that check constraint definitions are sometimes rewritten/normalized on input, and I was hoping to
understandthat a little better. 
>
> For instance, if I create this table with a check constraint:
>
> create table foo (
>   val varchar,
>   constraint val_valid check (val in ('a','b','c'))
> );
>
> and then dump the schema with pg_dump, it looks more like this:
>
> CREATE TABLE public.foo (
>     val character varying,
>     CONSTRAINT val_valid CHECK (((val)::text = ANY ((ARRAY['a'::character varying, 'b'::character varying,
'c'::charactervarying])::text[]))) 
> );
>
> However, if I then recreate the schema from that dump, and then dump with pg_dump a second time, it ends up different
again:
>
> CREATE TABLE public.foo (
>     val character varying,
>     CONSTRAINT val_valid CHECK (((val)::text = ANY (ARRAY[('a'::character varying)::text, ('b'::character
varying)::text,('c'::character varying)::text]))) 
> );
>
> I'm working in a Ruby on Rails application where the schema is periodically dumped to a structure.sql
> file on disk. So, it would be convenient if the constraint definition was "stable" (otherwise, there's
> unnecessary noise in our version control history)
>
> Is it expected that the second form is rewritten into the third form? It seems a bit odd to see all
> the type casting going on, but maybe there is a good reason for that. (Maybe this is an issue with
> using varchar instead of text?)

Yes, using "varchar" is definitely part of why it is so odd.
There is no equality operator for "varchar", so you need an (implicit) cast to "text".
That implicit cast is made explicit when the parsed binary form of the constraint expression is
reverse engineered to a string during "pg_dump".

I'd say that the change you mention "just happened", but you can never rely on these expressions
being rendered in a fixed way - this can change any time.  Look at this artificial example:

  CREATE TABLE test (col varchar CONSTRAINT con CHECK (col <> 'y'));

  SELECT pg_get_constraintdef(oid) FROM pg_constraint WHERE conname = 'con';

          pg_get_constraintdef
  ════════════════════════════════════
   CHECK (((col)::text <> 'y'::text))

  CREATE TYPE public.text AS enum ('x');
  SET search_path = public, pg_catalog;

  SELECT pg_get_constraintdef(oid) FROM pg_constraint WHERE conname = 'con';

                     pg_get_constraintdef
  ══════════════════════════════════════════════════════════
   CHECK (((col)::pg_catalog.text <> 'y'::pg_catalog.text))

The proper solution is not to rely on the way that such expressions are rendered.
Rather than relying on a tool like pg_dump to create your "structure.sql", write it
by hand an maintain in in a version control system.

But perhaps it is good enoulh if you define the constraint by casting to "text"
early: CHECK (val::text IN ('a','b','c'))

> This communication and any attachments may contain confidential information and are intended to be
> viewed only by the intended recipients.

Got it, I won't forward your mail... err...

Yours,
Laurenz Albe



Re: Unexpected modification of check constraint definition

От
Adrian Klaver
Дата:
On 1/7/26 02:32, Stuart Campbell wrote:
> Hi there,
> 

> I'm working in a Ruby on Rails application where the schema is 
> periodically dumped to a structure.sql file on disk. So, it would be 
> convenient if the constraint definition was "stable" (otherwise, there's 
> unnecessary noise in our version control history)
> 
> Is it expected that the second form is rewritten into the third form? It 
> seems a bit odd to see all the type casting going on, but maybe there is 
> a good reason for that. (Maybe this is an issue with using varchar 
> instead of text?)

https://www.postgresql.org/docs/current/datatype-character.html

"text is PostgreSQL's native string data type, in that most built-in 
functions operating on strings are declared to take or return text not 
character varying. For many purposes, character varying acts as though 
it were a domain over text."


When you did the dump/restore cycles where they from and to the same 
Postgres version/instance?

> 
> Regards,
> Stuart
> 
> This communication and any attachments may contain confidential 
> information and are intended to be viewed only by the intended 
> recipients. If you have received this message in error, please notify 
> the sender immediately by replying to the original message and then 
> delete all copies of the email from your systems.
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Unexpected modification of check constraint definition

От
Stuart Campbell
Дата:
On Wed, Jan 7, 2026 at 11:57 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
Yes, using "varchar" is definitely part of why it is so odd.
There is no equality operator for "varchar", so you need an (implicit) cast to "text".

Got it. That seems like a possible reason to prefer text over varchar.
 
That implicit cast is made explicit when the parsed binary form of the constraint expression is
reverse engineered to a string during "pg_dump".

Makes sense.
 
I'd say that the change you mention "just happened", but you can never rely on these expressions
being rendered in a fixed way - this can change any time.

That seems reasonable. What mostly seemed unexpected was that the parsed expression changed the second time. i.e. original expression -> rewritten expression with explicit type casting -> rewritten expression with (even more!) explicit type casting.
 
But perhaps it is good enoulh if you define the constraint by casting to "text"
early: CHECK (val::text IN ('a','b','c'))

Sounds good, I may try that. Thanks!
 
> This communication and any attachments may contain confidential information and are intended to be
> viewed only by the intended recipients.

Got it, I won't forward your mail... err...

Heh, yes... sorry about that. I can't control how that message is included from this email address. Rest assured, you and everyone else here are the intended recipients :-)

Regards,
Stuart

This communication and any attachments may contain confidential information and are intended to be viewed only by the intended recipients. If you have received this message in error, please notify the sender immediately by replying to the original message and then delete all copies of the email from your systems.


Re: Unexpected modification of check constraint definition

От
Stuart Campbell
Дата:
On Thu, Jan 8, 2026 at 2:24 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
https://www.postgresql.org/docs/current/datatype-character.html

"text is PostgreSQL's native string data type, in that most built-in
functions operating on strings are declared to take or return text not
character varying. For many purposes, character varying acts as though
it were a domain over text."

Thanks. It seems like I should generally prefer to use text over varchar. (I've read advice along those lines elsewhere.)
 
When you did the dump/restore cycles where they from and to the same
Postgres version/instance?

Yes. In the example I provided, that was all from the same Postgres 16.4 instance. 

Regards,
Stuart

This communication and any attachments may contain confidential information and are intended to be viewed only by the intended recipients. If you have received this message in error, please notify the sender immediately by replying to the original message and then delete all copies of the email from your systems.


Re: Unexpected modification of check constraint definition

От
Tom Lane
Дата:
Stuart Campbell <stuart.campbell@ridewithvia.com> writes:
> On Wed, Jan 7, 2026 at 11:57 PM Laurenz Albe <laurenz.albe@cybertec.at>
> wrote:
>> That implicit cast is made explicit when the parsed binary form of the
>> constraint expression is
>> reverse engineered to a string during "pg_dump".

> Makes sense.

For context, in pg_dump we are quite concerned that the dumped form of
the expression be interpreted the same way when reloaded, so that's
why implicit casts get made explicit.  (The worry is mainly that the
destination server could have a different set of available operators or
casting rules, so it might opt for some different interpretation of
an underspecified expression.)  This isn't a perfect rule, because the
parser may behave subtly differently when faced with different input,
but it's the best we've been able to do.

> That seems reasonable. What mostly seemed unexpected was that the parsed
> expression changed the second time. i.e. original expression -> rewritten
> expression with explicit type casting -> rewritten expression with (even
> more!) explicit type casting.

Yeah, this is actually a pretty complicated case, because the parser
decides that it ought to convert the IN to an "= ANY(ARRAY)"
construct.  That opens up the question of whether to cast at the level
of the individual array elements, or at the level of the array as a
whole.  The heuristics for that get affected by what casts were in the
original input.

            regards, tom lane