Обсуждение: Domains versus arrays versus typmods

Поиск

Список

Период

Сортировка

Domains versus arrays versus typmods

От

Tom Lane

Дата:

19 октября 2010 г., 19:14:16

In bug #5717, Richard Huxton complains that a domain declared likeCREATE DOMAIN mynums numeric(4,2)[1];
doesn't work properly, ie, the typmod isn't enforced in places where
it reasonably ought to be.  I dug into this a bit, and found that there
are more worms in this can than I first expected.

In the first place, plpgsql seems a few bricks shy of a load, in that
its code to handle assignment to an array element doesn't appear to
be considering arrays with typmod at all: it just passes typmod -1 to
exec_simple_cast_value() in the PLPGSQL_DTYPE_ARRAYELEM case in
exec_assign_value().  Nonetheless, if you try a case like
declare  x numeric(4,2)[1];begin  x[1] := $1;

you'll find the typmod *is* enforced.  It turns out that the reason that
it works is that after constructing the modified array value,
exec_assign_value() calls itself recursively to do the actual assignment
to the array variable --- and that call sees that the array variable has
a typmod, so it applies the cast *at the array level*, ie it
disassembles the array, re-coerces each element, and builds a new array.
So it's horridly inefficient but it works.

That could and should be improved, but it seems to be just a matter
of local changes in pl_exec.c, and it's only an efficiency hack anyway.
The big problem is that none of this happens when the variable is
declared as a domain, and I believe that that is a significantly more
wide-ranging issue.

The real issue as I see it is that it's possible to subscript an array
without realizing that the array's type is really a domain that
incorporates a typmod specification.  This happens because of a hack
we did to simplify implementation of domains over arrays: we expose the
array element type as typelem of the domain as well.  For example, given
Richard's declaration we have

regression=# select typname,oid,typelem,typbasetype,typtypmod from pg_type where typname = 'mynums';typname |  oid  |
typelem| typbasetype | typtypmod 
 
---------+-------+---------+-------------+-----------mynums  | 92960 |    1700 |        1231 |    262150
(1 row)

If we were to wrap another domain around that, we'd have

regression=# create domain mynums2 as mynums;
CREATE DOMAIN
regression=# select typname,oid,typelem,typbasetype,typtypmod from pg_type where typname = 'mynums2';typname |  oid  |
typelem| typbasetype | typtypmod 
 
---------+-------+---------+-------------+-----------mynums2 | 92976 |    1700 |       92960 |        -1
(1 row)

and at this point it's completely unobvious from inspection of the
pg_type entry that any typmod needs to be applied when subscripting.

So I'm suspicious that there are a boatload of bugs related to this kind
of domain declaration, not just the one case in plpgsql.

I think that what we ought to do about it is to stop exposing typelem
in domains' pg_type rows.  If you want to subscript a domain value, you
should have to drill down to its base type with getBaseTypeAndTypmod,
which would also give you the correct typmod to apply.  If we set
typelem to zero in domain pg_type rows, it shouldn't take too long to
find any places that are missing this consideration --- the breakage
will be obvious rather than subtle.

This catalog definitional change obviously shouldn't be back-patched,
but possibly the ensuing code changes could be, since the typelem change
is just to expose where things are wrong and wouldn't be a prerequisite
for corrected code to behave correctly.  Or we could decide that this is
a corner case we're not going to try to fix in back branches.  It's been
wrong since day 0, so there's certainly an argument that it's not
important enough to risk back-patching.

Comments?
        regards, tom lane

Re: Domains versus arrays versus typmods

От

Robert Haas

Дата:

19 октября 2010 г., 21:47:28

On Tue, Oct 19, 2010 at 6:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> In bug #5717, Richard Huxton complains that a domain declared like
>        CREATE DOMAIN mynums numeric(4,2)[1];
> doesn't work properly, ie, the typmod isn't enforced in places where
> it reasonably ought to be.  I dug into this a bit, and found that there
> are more worms in this can than I first expected.
>
> In the first place, plpgsql seems a few bricks shy of a load, in that
> its code to handle assignment to an array element doesn't appear to
> be considering arrays with typmod at all: it just passes typmod -1 to
> exec_simple_cast_value() in the PLPGSQL_DTYPE_ARRAYELEM case in
> exec_assign_value().  Nonetheless, if you try a case like
>
>        declare
>          x numeric(4,2)[1];
>        begin
>          x[1] := $1;
>
> you'll find the typmod *is* enforced.  It turns out that the reason that
> it works is that after constructing the modified array value,
> exec_assign_value() calls itself recursively to do the actual assignment
> to the array variable --- and that call sees that the array variable has
> a typmod, so it applies the cast *at the array level*, ie it
> disassembles the array, re-coerces each element, and builds a new array.
> So it's horridly inefficient but it works.
>
> That could and should be improved, but it seems to be just a matter
> of local changes in pl_exec.c, and it's only an efficiency hack anyway.
> The big problem is that none of this happens when the variable is
> declared as a domain, and I believe that that is a significantly more
> wide-ranging issue.
>
> The real issue as I see it is that it's possible to subscript an array
> without realizing that the array's type is really a domain that
> incorporates a typmod specification.  This happens because of a hack
> we did to simplify implementation of domains over arrays: we expose the
> array element type as typelem of the domain as well.  For example, given
> Richard's declaration we have
>
> regression=# select typname,oid,typelem,typbasetype,typtypmod from pg_type where typname = 'mynums';
>  typname |  oid  | typelem | typbasetype | typtypmod
> ---------+-------+---------+-------------+-----------
>  mynums  | 92960 |    1700 |        1231 |    262150
> (1 row)
>
> If we were to wrap another domain around that, we'd have
>
> regression=# create domain mynums2 as mynums;
> CREATE DOMAIN
> regression=# select typname,oid,typelem,typbasetype,typtypmod from pg_type where typname = 'mynums2';
>  typname |  oid  | typelem | typbasetype | typtypmod
> ---------+-------+---------+-------------+-----------
>  mynums2 | 92976 |    1700 |       92960 |        -1
> (1 row)
>
> and at this point it's completely unobvious from inspection of the
> pg_type entry that any typmod needs to be applied when subscripting.
>
> So I'm suspicious that there are a boatload of bugs related to this kind
> of domain declaration, not just the one case in plpgsql.
>
> I think that what we ought to do about it is to stop exposing typelem
> in domains' pg_type rows.  If you want to subscript a domain value, you
> should have to drill down to its base type with getBaseTypeAndTypmod,
> which would also give you the correct typmod to apply.  If we set
> typelem to zero in domain pg_type rows, it shouldn't take too long to
> find any places that are missing this consideration --- the breakage
> will be obvious rather than subtle.

I fear that this is going to degrade the performance of common cases
as a way of debugging rare cases.

> This catalog definitional change obviously shouldn't be back-patched,
> but possibly the ensuing code changes could be, since the typelem change
> is just to expose where things are wrong and wouldn't be a prerequisite
> for corrected code to behave correctly.  Or we could decide that this is
> a corner case we're not going to try to fix in back branches.  It's been
> wrong since day 0, so there's certainly an argument that it's not
> important enough to risk back-patching.
>
> Comments?

It might be reasonable to back-patch whatever we decide on into 9.0,
because it is so new, but I would be reluctant to go back further
unless we have some evidence that it's bothering people.  It seems to
me that this can could have a lot of worms in it, and I fear that
there could be several rounds of fixes, which I would rather not
inflict on users of supposedly-stable branches.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Domains versus arrays versus typmods

От

Tom Lane

Дата:

19 октября 2010 г., 22:17:54

Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Oct 19, 2010 at 6:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think that what we ought to do about it is to stop exposing typelem
>> in domains' pg_type rows. �If you want to subscript a domain value, you
>> should have to drill down to its base type with getBaseTypeAndTypmod,
>> which would also give you the correct typmod to apply. �If we set
>> typelem to zero in domain pg_type rows, it shouldn't take too long to
>> find any places that are missing this consideration --- the breakage
>> will be obvious rather than subtle.

> I fear that this is going to degrade the performance of common cases
> as a way of debugging rare cases.

We've already accepted the cost of doing getBaseTypeAndTypmod() in a
whole lot of performance-critical parsing paths, on the off chance that
the target datatype might be a domain.  It's not apparent to me that
array subscripting is so important as to deserve an exemption from that.
Especially when not doing so doesn't work.

>> Comments?

> It might be reasonable to back-patch whatever we decide on into 9.0,
> because it is so new, but I would be reluctant to go back further
> unless we have some evidence that it's bothering people.  It seems to
> me that this can could have a lot of worms in it, and I fear that
> there could be several rounds of fixes, which I would rather not
> inflict on users of supposedly-stable branches.

Well, we have bug #5717 as evidence that it's bothering people ;-).
But I agree that the case for back-patching is a bit thin, especially
if it might result in any user-visible behavioral changes.

One case that I've realized is a problem is domain constraints at the
array level:

regression=# create domain orderedpair as int[2] check (value[1] < value[2]);
CREATE DOMAIN
regression=# select array[2,1]::orderedpair;    -- expect failure
ERROR:  value for domain orderedpair violates check constraint "orderedpair_check"
regression=# create table op (f1 orderedpair);
CREATE TABLE
regression=# insert into op values (array[1,2]);
INSERT 0 1
regression=# insert into op values (array[2,1]);    -- expect failure
ERROR:  value for domain orderedpair violates check constraint "orderedpair_check"
regression=# update op set f1[2] = 0;        -- expect failure ... oops
UPDATE 1
regression=# select * from op; f1   
-------{1,0}
(1 row)

The reason this fails is that the result of the ArrayRef "f1[2] := 0"
is considered to be of the domain type, so we don't recheck the
constraint.  As this example demonstrates, that assumption is simply
broken.  The correct implementation, I believe, is that the result
ought to be considered to be of the base type (ie, just int[]), which
would cause an implicit re-coercion to the domain type to occur during
the assignment, offering a chance to re-verify the constraint.

Right offhand I don't see a way that that sort of change can safely be
back-patched.  The incorrect assumption about the ArrayRef's result
type is already baked into stored rules in existing databases.
        regards, tom lane

Re: Domains versus arrays versus typmods

От

Robert Haas

Дата:

19 октября 2010 г., 23:06:53

On Tue, Oct 19, 2010 at 9:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Oct 19, 2010 at 6:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I think that what we ought to do about it is to stop exposing typelem
>>> in domains' pg_type rows.  If you want to subscript a domain value, you
>>> should have to drill down to its base type with getBaseTypeAndTypmod,
>>> which would also give you the correct typmod to apply.  If we set
>>> typelem to zero in domain pg_type rows, it shouldn't take too long to
>>> find any places that are missing this consideration --- the breakage
>>> will be obvious rather than subtle.
>
>> I fear that this is going to degrade the performance of common cases
>> as a way of debugging rare cases.
>
> We've already accepted the cost of doing getBaseTypeAndTypmod() in a
> whole lot of performance-critical parsing paths, on the off chance that
> the target datatype might be a domain.  It's not apparent to me that
> array subscripting is so important as to deserve an exemption from that.
> Especially when not doing so doesn't work.

Hmm... so are there no cases where zeroing out the typelem will cost
us an otherwise-unnecessary syscache lookup?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Domains versus arrays versus typmods

От

Tom Lane

Дата:

20 октября 2010 г., 11:03:48

Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Oct 19, 2010 at 9:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> We've already accepted the cost of doing getBaseTypeAndTypmod() in a
>> whole lot of performance-critical parsing paths, on the off chance that
>> the target datatype might be a domain. �It's not apparent to me that
>> array subscripting is so important as to deserve an exemption from that.
>> Especially when not doing so doesn't work.

> Hmm... so are there no cases where zeroing out the typelem will cost
> us an otherwise-unnecessary syscache lookup?

My point is that anyplace that is relying on the surface typelem,
without drilling down to see what the base type is, is wrong.
So yeah, those lookups are (will be) necessary.
        regards, tom lane

Re: Domains versus arrays versus typmods

От

Robert Haas

Дата:

20 октября 2010 г., 11:47:59

On Wed, Oct 20, 2010 at 10:03 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Oct 19, 2010 at 9:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> We've already accepted the cost of doing getBaseTypeAndTypmod() in a
>>> whole lot of performance-critical parsing paths, on the off chance that
>>> the target datatype might be a domain.  It's not apparent to me that
>>> array subscripting is so important as to deserve an exemption from that.
>>> Especially when not doing so doesn't work.
>
>> Hmm... so are there no cases where zeroing out the typelem will cost
>> us an otherwise-unnecessary syscache lookup?
>
> My point is that anyplace that is relying on the surface typelem,
> without drilling down to see what the base type is, is wrong.
> So yeah, those lookups are (will be) necessary.

OK.  In that case, +1 from me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Domains versus arrays versus typmods

От

Richard Huxton

Дата:

21 октября 2010 г., 03:51:21

On 20/10/10 01:47, Robert Haas wrote:
> On Tue, Oct 19, 2010 at 6:14 PM, Tom Lane<tgl@sss.pgh.pa.us>  wrote:
>> Comments?
>
> It might be reasonable to back-patch whatever we decide on into 9.0,
> because it is so new, but I would be reluctant to go back further
> unless we have some evidence that it's bothering people.  It seems to
> me that this can could have a lot of worms in it, and I fear that
> there could be several rounds of fixes, which I would rather not
> inflict on users of supposedly-stable branches.

The work-around I applied when I stumbled across this was just to apply 
an explicit cast before my function's RETURN. That neatly solves my 
particular problem (which I at first thought was a formatting issue 
somewhere in my app).

The real danger with this is the opportunity to end up with occasional 
bad data in tables, quite possibly unnoticed. If I'd come across this in 
an existing system rather than a new app I'm pretty sure it would have 
confused me for a lot longer than it did.
--   Richard Huxton  Archonet Ltd

Re: Domains versus arrays versus typmods

От

Tom Lane

Дата:

21 октября 2010 г., 13:46:52

Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Oct 20, 2010 at 10:03 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> My point is that anyplace that is relying on the surface typelem,
>> without drilling down to see what the base type is, is wrong.
>> So yeah, those lookups are (will be) necessary.

> OK.  In that case, +1 from me.

I've come across another interesting definitional issue, which is what
properties should domains have with respect to matching to polymorphic
arguments.  Currently, the polymorphic matching functions take domains
at face value (ie, without noticing their relationships to their base
types), with one exception: because they use get_element_type() to
decide if a type matches ANYARRAY, domains over arrays will be
considered to match ANYARRAY.  This leads to some weird inconsistencies
and at least one genuine bug.  Observe (this is with 9.0.x HEAD):

regression=# create domain myi as int;
CREATE DOMAIN
regression=# select array[1,2] || 3;?column? 
----------{1,2,3}
(1 row)

regression=# select array[1,2] || 3::myi;
ERROR:  operator does not exist: integer[] || myi

In this case, one might expect myi to be automatically downcast to int
so that it could be matched up with the int array, but that's not
happening.  However:

regression=# create domain myia as int[];
CREATE DOMAIN
regression=# select array[1,2]::myia || 3;?column? 
----------{1,2,3}
(1 row)

So we will downcast myia to int[], or at least one might assume that's
what's happening.  But actually it's worse than that: the result of this
operation is thought to be myia not int[], because myia itself is taken
as matching ANYARRAY, and the operator result is the same ANYARRAY type.
Thus, this case goes off the rails completely:

regression=# create domain myia2 as int[] check(array_length(value,1) = 2);
CREATE DOMAIN
regression=# select array[1,2]::myia2;array 
-------{1,2}
(1 row)

regression=# select array[1,2,3]::myia2;
ERROR:  value for domain myia2 violates check constraint "myia2_check"
regression=# select array[1,2]::myia2 || 3;?column? 
----------{1,2,3}
(1 row)

The result of the || is considered to be myia2, as can be seen for
example this way:

regression=# create view vvv as select array[1,2]::myia2 || 3 as x;
CREATE VIEW
regression=# \d vvv    View "public.vvv"Column | Type  | Modifiers 
--------+-------+-----------x      | myia2 | 

So we have a value that's claimed to belong to the domain, but it
doesn't meet the domain's constraints.

What I am intending to do about this in the short run is to leave the
anyarray-ness tests in the polymorphic-compatibility-checking functions
as-is.  That will mean (with the change in typelem for domains) that a
domain over array doesn't match ANYARRAY unless you explicitly downcast
it.  I argue that this is consistent with the current behavior of not
auto-downcasting domains to match the element type of an array.  We
could go back and change it later, but if we do, we should try to make
both cases provide auto-downcast-when-needed behavior.  I have not dug
into just what code changes would be needed for that.  Auto-downcast
wouldn't be exactly compatible with the current behavior anyway, since
it would result in a different claimed type for the operator result.

Comments?
        regards, tom lane

Re: Domains versus arrays versus typmods

От

"Kevin Grittner"

Дата:

21 октября 2010 г., 14:11:39

Tom Lane <tgl@sss.pgh.pa.us> wrote:
> regression=# select array[1,2] || 3::myi;
> ERROR:  operator does not exist: integer[] || myi
> 
> In this case, one might expect myi to be automatically downcast to
< int so that it could be matched up with the int array, but that's
> not happening.
I guess it should allow that, although for my uses of domains it's
hard to see a reasonable use case for it, so that part doesn't
bother me too much.
> regression=# create domain myia as int[];
> CREATE DOMAIN
> regression=# select array[1,2]::myia || 3;
>  ?column? 
> ----------
>  {1,2,3}
> (1 row)
> 
> So we will downcast myia to int[], or at least one might assume
> that's what's happening.  But actually it's worse than that: the
> result of this operation is thought to be myia not int[], because
> myia itself is taken as matching ANYARRAY, and the operator result
> is the same ANYARRAY type.
That is actually what I would want and expect.  Let's say I have an
array of attorney bar numbers, and I add one more as a literal. 
While an argument could be made that the integer should be cast to
a bar number before being added to the array, we don't require that
for an assignment to a simple variable in the domain, so it would be
surprising to require a cast here, and even more surprising for the
concatenation to result in an array of primitive integers rather
than a array of attorney bar numbers.
> regression=# create domain myia2 as int[]
> check(array_length(value,1) = 2);
> CREATE DOMAIN
> regression=# select array[1,2]::myia2 || 3;
>  ?column? 
> ----------
>  {1,2,3}
> (1 row)
> So we have a value that's claimed to belong to the domain, but it
> doesn't meet the domain's constraints.
Yeah, that's obviously wrong.
-Kevin

Re: Domains versus arrays versus typmods

От

Tom Lane

Дата:

21 октября 2010 г., 14:36:28

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So we will downcast myia to int[], or at least one might assume
>> that's what's happening.  But actually it's worse than that: the
>> result of this operation is thought to be myia not int[], because
>> myia itself is taken as matching ANYARRAY, and the operator result
>> is the same ANYARRAY type.
> That is actually what I would want and expect.  Let's say I have an
> array of attorney bar numbers, and I add one more as a literal. 
> While an argument could be made that the integer should be cast to
> a bar number before being added to the array, we don't require that
> for an assignment to a simple variable in the domain, so it would be
> surprising to require a cast here, and even more surprising for the
> concatenation to result in an array of primitive integers rather
> than a array of attorney bar numbers.

I disagree with that argument: you are confusing an array over a domain
type with a domain over an array type.  In the latter case, the domain
could have additional constraints (such as the length constraint in my
other example), and there's no reason to assume that || or other array
operators would preserve those constraints.

A perhaps comparable example is
create domain verysmallint as int check (value < 10);
select 9::verysmallint + 1;

The result of the addition is int, not verysmallint, which is why you
don't get an error.

From an abstract-data-type point of view, the fact that any of these
operations are even allowed without an explicit downcast is a bit
uncool: it exposes the implementation of the domain type, which one
could argue shouldn't be allowed, at least not without some notational
marker showing you know what you're doing.  But the SQL committee
seems to have decided to ignore that tradition and allow auto-downcasts.
Nonetheless, when a domain type is fed to an operator that works on its
base type, it has to be clearly understood that there *is* an implied
downcast, and whatever special properties the domain may have had will
be lost.  As far as the operator and its result are concerned, the
domain is just its base type.

I'm not against fixing these cases so that auto downcasts happen, I'm
just saying that it's outside the scope of what I'm going to do in
response to the current bug.
        regards, tom lane

Re: Domains versus arrays versus typmods

От

"Kevin Grittner"

Дата:

21 октября 2010 г., 14:51:02

Tom Lane <tgl@sss.pgh.pa.us> wrote:
> you are confusing an array over a domain type with a domain over
> an array type.
Yes I was.  Sorry.  Argument withdrawn.
-Kevin

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Domains versus arrays versus typmods