Thread: Casts

Casts

From: stark
Date:
It seems odd to me that implicit casts are checked for when you call a
function but not when you're implicitly calling a function via a cast. As a
result there are a *lot* of redundant casts in our catalog, essentially n!
casts for a domain with n types in it. So for example there are 138 casts
between the various numeric data types including every possible pairing of
char, int2, int4, int8, float4, float8, and numeric.

Now I don't think it actually costs us anything to have so many casts but it
sure makes adding new fully functional user defined types a pain. Adding a
single new numeric data type requires creating 12 new casts and a second one
would require 14, etc.

What's strange is that you do not have to go to such lengths to get a fully
functional data type in other respects. One implicit cast to numeric and you
can use +, -, log(), exp(), floor(), ceil(), etc. You may want to implement
some of those for performance reasons but for most relying on the implicit
cast is perfectly reasonable.

It seems like what ought to happen is that every data domain should have a
single blessed data type that is the root data type for that domain. Every
data type in that domain should have a single implicit cast to that root data
type. That effectively is how all the data types are set up in fact. They have
all these dozens of assignment casts and a single implicit cast to a type
chosen in some sort of unspoken consensus.

There has been some fear expressed in the past that too many implicit casts
create surprising side effects. I think that's a valid fear but only relevant
if we have two or more such casts for a single data type or have a cast to an
inappropriate type. As long as we have precisely one implicit cast for every
type and it's a cast to a datatype with basically the same semantics it seems
like we should be safe.

So for example if all the numeric data types had an implicit cast to numeric
and an assignment cast from numeric it ought to be possible for the planner to
find a way to handle an explicit cast between any two arbitrary numeric types
using just those. It could use the same logic it uses to find functions:
look first for an exact match, then for any assignment cast from a data type
to which the source type can first be implicitly cast.
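
A minimal sketch of the lookup proposed above, under stated assumptions: the catalog is modelled as a plain dictionary, and `CASTS`, `ROOT` and `find_cast_path` are hypothetical names for illustration, not anything in the PostgreSQL parser. Refusing to choose when more than one intermediate type qualifies is also an assumption of this sketch, not something the message specifies.

```python
# Hypothetical, simplified cast catalog: (source, target) -> cast context.
# Every type has one implicit cast up to the root and one assignment cast
# back down from it, as the proposal describes.
ROOT = "numeric"
TYPES = ["int2", "int4", "int8", "float4", "float8"]

CASTS = {}
for t in TYPES:
    CASTS[(t, ROOT)] = "implicit"
    CASTS[(ROOT, t)] = "assignment"


def find_cast_path(source, target, casts=CASTS):
    """Return the list of types an explicit cast steps through, or None."""
    if source == target:
        return [source]
    # 1. Exact match: a direct cast always wins.
    if (source, target) in casts:
        return [source, target]
    # 2. Otherwise look for an intermediate type reachable by an implicit
    #    cast from the source that also has some cast to the target.
    candidates = sorted(
        mid
        for (src, mid), context in casts.items()
        if src == source and context == "implicit" and (mid, target) in casts
    )
    if len(candidates) == 1:
        return [source, candidates[0], target]
    # No pathway, or more than one: refuse rather than guess (assumption).
    return None
```

With this catalog, `find_cast_path("int2", "float8")` goes through `numeric`, while a type with two implicit casts that both reach the target gets no pathway at all.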

That would let people add a new fully functional numeric type by creating only
two casts instead of 12. And a second one by creating two more instead of 14,
and so on.
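
The arithmetic behind those counts can be sketched as follows, assuming one cast in each direction between the new type and every existing type. On that assumption the figures of 12 and 14 correspond to six and seven existing types, and a complete pairwise set for n types is n*(n-1) casts. The function names are hypothetical.

```python
def casts_for_new_type_pairwise(existing_types):
    """Casts needed to add one type when every pair needs both directions."""
    return 2 * existing_types


def casts_for_new_type_via_root():
    """Casts needed under the proposal: one implicit up, one assignment down."""
    return 2


def total_pairwise_casts(n_types):
    """Size of a complete pairwise cast set for n types."""
    return n_types * (n_types - 1)


for existing in (6, 7):
    print(existing, casts_for_new_type_pairwise(existing), casts_for_new_type_via_root())
```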


--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com



Re: Casts

From: Tom Lane
Date:
stark <stark@enterprisedb.com> writes:
> It seems odd to me that implicit casts are checked for when you call a
> function but not when you're implicitly calling a function via a cast. As a
> result there are a *lot* of redundant casts in our catalog, essentially n!
> casts for a domain with n types in it. So for example there are 138 casts
> between the various numeric data types including every possible pairing of
> char, int2, int4, int8, float4, float8, and numeric.

This is intentional.  If you explicitly cast type foo to type bar there
should not be any question about what function will be invoked.  The
cost is a few more rows in pg_cast ... so what?  Adding rows to pg_cast
is not the most painful part of making a new datatype.

As for "the parser ought to be able to find two-step cast pathways",
no thanks.  The increase in search time and the decrease in
predictability are both undesirable.

> There has been some fear expressed in the past that too many implicit casts
> create surprising side effects.

Not "some fear" ... we have seen people badly burned, time and time
again, by the ill-considered implicit casts that are already in there.
IMHO we need fewer implicit casts, not more.
        regards, tom lane


Re: Casts

From: Tom Lane
Date:
stark <stark@enterprisedb.com> writes:
> I think the ideal combination is having every type have precisely one
> implicit cast "up" the type "tree" and assignment casts down the
> "tree".

No, because for example in the case of the numeric datatypes, that would
result in *every* cross-type operation being done in NUMERIC.  Do you
really want, for example, "bigint var = integer constant" to be
interpreted as "var::numeric = const::numeric"?  (Hint: you'll lose the
benefit of any index on the var.)

Actually it's worse than that, because the top of the numeric datatype
hierarchy is float8 not numeric; this is more or less forced by SQL's
rules about exact vs inexact arithmetic.  So your proposal would really
reduce to doing all cross-type arithmetic in float8 ...  which would
at least be a lot faster than numeric, but it's not gonna fly from an
accuracy point of view.

What we want, and what the existing pg_cast rules give us, is a rule
that an operation between two different numeric types is done in the
"wider" of those two types.  You could probably propose some explicit
representation of this concept that would be more compact than the
present pg_cast table --- but it would also be less flexible.  I don't
really see much to be gained that way.
        regards, tom lane


Re: Casts

From: stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> stark <stark@enterprisedb.com> writes:
>> It seems odd to me that implicit casts are checked for when you call a
>> function but not when you're implicitly calling a function via a cast. As a
>> result there are a *lot* of redundant casts in our catalog, essentially n!
>> casts for a domain with n types in it. So for example there are 138 casts
>> between the various numeric data types including every possible pairing of
>> char, int2, int4, int8, float4, float8, and numeric.
>
> This is intentional.  If you explicitly cast type foo to type bar there
> should not be any question about what function will be invoked.  The
> cost is a few more rows in pg_cast ... so what?  

And hundreds of lines of copy-pasted code with potential bugs.

> As for "the parser ought to be able to find two-step cast pathways",
> no thanks.  The increase in search time and the decrease in
> predictability are both undesirable.
>
>> There has been some fear expressed in the past that too many implicit casts
>> create surprising side effects.
>
> Not "some fear" ... we have seen people badly burned, time and time
> again, by the ill-considered implicit casts that are already in there.
> IMHO we need fewer implicit casts, not more.

I agree we need fewer implicit casts. But I think you take that too far in
dogmatically implying that every implicit cast is suspect.

I think the ideal combination is having every type have precisely one implicit
cast "up" the type "tree" and assignment casts down the "tree". I don't see us
ever needing anything more complex than a flat "tree" of a single base type
for each domain, with every other type directly a child of that one type. So
every numeric data type would have an implicit cast to numeric, every text type
would have an implicit cast to text, etc.

The implicit casts from numeric to other types seem suspect to me. And there
are tons of cases where we have implicit casts in both directions such as bit
to varbit and varbit to bit. That makes no sense to me. That would be where
the surprising effects come from.

As long as every type has precisely one implicit cast up I think it would
become safe to rely on it for many of the operations and not have to define
redundant operations for every data type.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com



Re: Casts

From: Martijn van Oosterhout
Date:
On Wed, Aug 09, 2006 at 12:21:40PM +0100, stark wrote:
> I think the ideal combination is having every type have precisely one implicit
> cast "up" the type "tree" and assignment casts down the "tree". I don't see us
> ever needing anything more complex than a flat "tree" of a single base type
> for each domain and everyone directly a child of that one type. So every
> numeric data type would have an implicit cast to numeric, every text type
> would have an implicit cast to text, etc.

That's basically what we have now, except any numeric type can be
"upcast" implicitly but "downcast" requires assignment. The tree
basically goes like: smallint -> integer -> bigint -> numeric -> real
-> double precision.

You have to be able to cast all the in-between versions implicitly also.
If someone writes an expression with "int2 = int4" you don't really
want to cast them both up to numeric, or worse, double precision.
Currently we can upcast the int2 to int4 implicitly and that allows us
to use an index on the int4 column.
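
A minimal sketch contrasting the two rules, assuming the upcast chain exactly as listed in this message. `CHAIN`, `common_type` and `common_type_single_root` are hypothetical names for illustration and do not model the actual pg_cast lookup.

```python
# Implicit-upcast chain as given in the message: each type can be
# implicitly cast to any type further to the right.
CHAIN = ["smallint", "integer", "bigint", "numeric", "real", "double precision"]
RANK = {name: position for position, name in enumerate(CHAIN)}


def common_type(left, right):
    """Current behaviour as described: use whichever type is further up the chain."""
    return left if RANK[left] >= RANK[right] else right


def common_type_single_root(left, right, root="numeric"):
    """Flat-tree alternative: two different types both go to the root."""
    return left if left == right else root


print(common_type("smallint", "integer"))
print(common_type_single_root("smallint", "integer"))
```

Under the first rule only the smallint side is cast and the integer column is compared as-is. Under the second, both sides would be cast to the root type.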

>
> The implicit casts from numeric to other types seem suspect to me. And there
> are tons of cases where we have implicit casts in both directions such as bit
> to varbit and varbit to bit. That makes no sense to me. That would be where
> the surprising effects come from.

The reason for those is basically there is no real difference between bit
and varbit, or character varying and text. Those types are essentially
the same, so why would you want to make a function that treated them
differently?

There are no such round-trips amongst the numeric types.

> As long as every type has precisely one implicit cast up I think it would
> become safe to rely on it for many of the operations and not have to define
> redundant operations for every data type.

You're not required to provide all the casts, but it's user friendly to
do so. Requiring double casts to go between two essentially compatible
types seems silly...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: Casts

From: Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> You're not required to provide all the casts, but it's user friendly to
> do so. Requiring double casts to go between two essentially compatible
> types seems silly...

I believe what Greg had in mind included the idea that the parser would
automatically find two-step cast pathways if there wasn't a direct path.
This scares me though, as it seems like a recipe for surprising behavior.
        regards, tom lane