Обсуждение: Why DEFAULT text 'now' does not work for TIMESTAMP columns

Поиск
Список
Период
Сортировка

Why DEFAULT text 'now' does not work for TIMESTAMP columns

От
Tom Lane
Дата:
Per Leon's recent gripe in the bugs list, I have confirmed that

create table t1 (f1 int4, f2 timestamp default text 'now');

does *not* work as expected --- the stored constraint expression
has been pre-reduced to a timestamp constant, even though you
get the desired behavior with

create table t1 (f1 int4, f2 datetime default text 'now');

I have tracked down the cause of this, and it's a mess.

First off, there is a pg_proc entry for converting a text value
to datetime (proc text_datetime, OID 1351) but there is no similar
function for timestamp.  Therefore, the parser's can_coerce_type
function returns "true" if asked whether it can coerce text to
datetime, but "false" for text to timestamp.

Second, the actual storage of the constraint expression is being
done through an incredibly klugy series of hacks.  After the grammar
parses the constraint expression, the expression is converted back
to text form (!) instead of being output as a parsetree.  Then,
StoreAttrDefault does a parse_and_plan on that text to generate a
parsetree, which it will use as the executable form of the constraint.
This code makes me ill to look at ... for one thing the reverse-
conversion code is not nearly as smart as it needs to be:

create table quotedefault (f1 int4, f2 text default 'abc\'def');
ERROR:  parser: parse error at or near "def"

because the deparsed text handed to StoreAttrDefault just looks like'abc'def'

But the immediate problem is that StoreAttrDefault tries to coerce the
compiled expression to the target data type.  If it can't get there
through can_coerce_type, it does a forced coercion by reparsing the
constraint text with ":: targettype" added on.  (Still another bug:
since it doesn't put parentheses around the constraint expression text
when it does that, the typecast will actually be parsed as applied to
the last component of the expression, not the whole thing... which could
lead to the wrong type coming out.)  Of course "text 'now' :: timestamp"
will be reduced to a timestamp constant, and at that point we've lost.

I am not sure what should be done to clean this up.  A brute-force
solution would be to make sure that there is a text-to-whatever
conversion function in pg_proc for any type where the type input
function is not necessarily a constant --- but I'm not sure which
types besides timestamp might meet that description.  In any case
I do not care for the code in StoreAttrDefault at all.

I am about to commit parser fixes that ensure a DEFAULT value is
correctly coerced to the column type when it is used (that is,
transformInsertStatement now does a coerce_type rather than just
assuming what is in pg_attrdef is the right type).  So, one possible
approach is to remove the coercion code from StoreAttrDefault
altogether.  That would mean that    field1 datetime 'now'
would start acting the same as    field1 datetime text 'now'
currently does: both of them would be coerced to datetime at runtime,
not when the constraint expression is created.  Given the frequency
with which newbies complain about the current behavior, I think that
that might be a Good Thing.  But it would be a change in behavior,
and I suppose there are scenarios where you'd like to be able to get
the old behavior.

Comments?
        regards, tom lane


Re: [HACKERS] Why DEFAULT text 'now' does not work for TIMESTAMP columns

От
"Ross J. Reedstrom"
Дата:
On Sun, Jul 18, 1999 at 06:19:41PM -0400, Tom Lane wrote:
<excellent problem analysis snipped>
> 
> I am about to commit parser fixes that ensure a DEFAULT value is
> correctly coerced to the column type when it is used (that is,
> transformInsertStatement now does a coerce_type rather than just
> assuming what is in pg_attrdef is the right type).  So, one possible
> approach is to remove the coercion code from StoreAttrDefault
> altogether.  That would mean that
>         field1 datetime 'now'
> would start acting the same as
>         field1 datetime text 'now'
> currently does: both of them would be coerced to datetime at runtime,
> not when the constraint expression is created.  Given the frequency
> with which newbies complain about the current behavior, I think that
> that might be a Good Thing.  But it would be a change in behavior,
> and I suppose there are scenarios where you'd like to be able to get
> the old behavior.
> 
> Comments?

My only comment: it seems to me that after your fix, one could still get the
old behavior via something like:              field1 datetime now()
or perhaps:              field1 datetime 'now'::datetime
correct? And the default behavior will now be what most naive users
expect.  As long as a workaround exists for the cases where someone cares
about the table definition time, I wouldn't worry about staying 'bug'
(or misfeature?) compatible, unless it's an official, committee backed
SQL standard misfeature, of course. ;-) Sounds good to me.

Ross
-- 
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu> 
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St.,  Houston, TX 77005


Re: [HACKERS] Why DEFAULT text 'now' does not work for TIMESTAMP columns

От
Thomas Lockhart
Дата:
> Per Leon's recent gripe in the bugs list, I have confirmed that
> create table t1 (f1 int4, f2 timestamp default text 'now');
> does *not* work as expected --- the stored constraint expression
> has been pre-reduced to a timestamp constant, even though you
> get the desired behavior with
> create table t1 (f1 int4, f2 datetime default text 'now');
> I have tracked down the cause of this, and it's a mess.

Yes. We've had lots of small "improvements" in the code (there is
blood on my hands :) which danced around a fundamental problem: it
would be nice to pre-evaluate functions on constants, but there are a
few functions/constants (e.g. "random" or 'now') which shouldn't be
done this way. Functions and types really should have an "is cachable"
attribute so that they can be pre-evaluated when possible.

> First off, there is a pg_proc entry for converting a text value
> to datetime (proc text_datetime, OID 1351) but there is no similar
> function for timestamp.  Therefore, the parser's can_coerce_type
> function returns "true" if asked whether it can coerce text to
> datetime, but "false" for text to timestamp.

At the moment, timestamp serves the useful purpose of illustrating how
annoying a partially implemented and poorly supported feature can be.
I've been putting off relabeling "datetime" and "timespan" as
"timestamp" and "interval", thinking that it should wait for the major
rev bump to 7.0. But it really shouldn't wait. This would align the
best types in the date/time code with SQL-standard names. The original
timestamp and interval code would be killed. That doesn't fix the
underlying problems handling defaults, but would stop most complaints
about timestamp...

> Second, the actual storage of the constraint expression is being
> done through an incredibly klugy series of hacks.  After the grammar
> parses the constraint expression, the expression is converted back
> to text form (!) instead of being output as a parsetree.

Yeah, well, originally it was just passed through as a string, but the
parser couldn't validate the syntax under those circumstances. So I
had the parser tokenize it, and then reassemble the string. But
apparently I didn't try very hard to reassemble it correctly.

> This code makes me ill to look at ... 

You should know by now that Postgres internals aren't for a weak
stomach ;)

> I am not sure what should be done to clean this up.  A brute-force
> solution would be to make sure that there is a text-to-whatever
> conversion function in pg_proc for any type where the type input
> function is not necessarily a constant --- but I'm not sure which
> types besides timestamp might meet that description.  In any case
> I do not care for the code in StoreAttrDefault at all.
> 
> I am about to commit parser fixes that ensure a DEFAULT value is
> correctly coerced to the column type when it is used (that is,
> transformInsertStatement now does a coerce_type rather than just
> assuming what is in pg_attrdef is the right type).  So, one possible
> approach is to remove the coercion code from StoreAttrDefault
> altogether.  That would mean that
>                 field1 datetime 'now'
> would start acting the same as
>                 field1 datetime text 'now'
> currently does: both of them would be coerced to datetime at runtime,
> not when the constraint expression is created.  Given the frequency
> with which newbies complain about the current behavior, I think that
> that might be a Good Thing.  But it would be a change in behavior,
> and I suppose there are scenarios where you'd like to be able to get
> the old behavior.

Sorry, how does that change behavior for the worse? I can see it
taking a performance hit, but under which circumstances would runtime
evaluation be counter-intuitive or wrong?

And while you are being annoyed by code, how about looking at problems
with trying to use indices on constants and on functions calls? I've
assumed that it could benefit from a judicious application of
coerce_type...
                    - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] Why DEFAULT text 'now' does not work for TIMESTAMP columns

От
Tom Lane
Дата:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> Yes. We've had lots of small "improvements" in the code (there is
> blood on my hands :) which danced around a fundamental problem: it
> would be nice to pre-evaluate functions on constants, but there are a
> few functions/constants (e.g. "random" or 'now') which shouldn't be
> done this way. Functions and types really should have an "is cachable"
> attribute so that they can be pre-evaluated when possible.

Yes.  There is a proiscachable column in pg_proc, but it doesn't look
like it currently contains useful data, nor do I find any code looking
at it.  We need to put trustworthy data into that column and then we
can start using it to tell whether functions are safe to pre-evaluate
on constants.  In the case at hand, timestamp_in() would have to be
marked unsafe.   I'd like to add a generalized constant-subexpression-
collapser to the optimizer, and it would need something like that to
tell it whether to collapse functions whose inputs are constants.
(Once we did that, the parser would no longer need to worry about
pre-evaluating type conversion functions on constants, which is
effectively what it does now in selected cases...)

> I've been putting off relabeling "datetime" and "timespan" as
> "timestamp" and "interval", thinking that it should wait for the major
> rev bump to 7.0. But it really shouldn't wait.

As long as both sets of names are accepted, I think it probably wouldn't
matter if the implementation of one of them changes.  I wouldn't like to
have my tables containing "datetime" suddenly stop working though...

> Yeah, well, originally it was just passed through as a string, but the
> parser couldn't validate the syntax under those circumstances. So I
> had the parser tokenize it, and then reassemble the string. But
> apparently I didn't try very hard to reassemble it correctly.

I was thinking about letting the parser output the same parsetree as
it does for everything else (thereby saving a chunk of grammar code)
and then building a little subroutine that could deparse a parsetree
to text.  It wouldn't be that much bigger than the part of the grammar
that's doing the task ... for that matter, I think Jan may already have
such a thing somewhere in the rules support.

>> So, one possible
>> approach is to remove the coercion code from StoreAttrDefault
>> altogether.  That would mean that
>> field1 datetime 'now'
>> would start acting the same as
>> field1 datetime text 'now'
>> currently does: both of them would be coerced to datetime at runtime,
>> not when the constraint expression is created.  Given the frequency
>> with which newbies complain about the current behavior, I think that
>> that might be a Good Thing.  But it would be a change in behavior,
>> and I suppose there are scenarios where you'd like to be able to get
>> the old behavior.

> Sorry, how does that change behavior for the worse? I can see it
> taking a performance hit, but under which circumstances would runtime
> evaluation be counter-intuitive or wrong?

I'm not sure it would ever be counter-intuitive, but I can just see
someone coming along and saying "Hey, I *wanted* to store the time
of creation of the table as the default!".  There might be more
plausible examples with other non-cacheable functions, but I haven't
thought of any offhand.

In any case, you could get that result by evaluating the function in
a separate command and pasting its result into the CREATE TABLE, so
there's no serious loss of functionality.

> And while you are being annoyed by code, how about looking at problems
> with trying to use indices on constants and on functions calls?

"Indices on constants"?  I'm confused...
        regards, tom lane


Re: [HACKERS] Why DEFAULT text 'now' does not work for TIMESTAMP columns

От
Thomas Lockhart
Дата:
> As long as both sets of names are accepted, I think it probably wouldn't
> matter if the implementation of one of them changes.  I wouldn't like to
> have my tables containing "datetime" suddenly stop working though...

No, we would have the names aliased in the parser, as we do for
int->int4, etc.

> > And while you are being annoyed by code, how about looking at problems
> > with trying to use indices on constants and on functions calls?
> "Indices on constants"?  I'm confused...

I just phrased it poorly. I was referring to the int2_col = int4_val
problem...
                   - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] Why DEFAULT text 'now' does not work for TIMESTAMP columns

От
Tom Lane
Дата:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
>> As long as both sets of names are accepted, I think it probably wouldn't
>> matter if the implementation of one of them changes.  I wouldn't like to
>> have my tables containing "datetime" suddenly stop working though...

> No, we would have the names aliased in the parser, as we do for
> int->int4, etc.

Oh, I see.  And so my next pg_dump output would magically have the
standard names.  That's cool.

>>>> And while you are being annoyed by code, how about looking at problems
>>>> with trying to use indices on constants and on functions calls?
>> "Indices on constants"?  I'm confused...

> I just phrased it poorly. I was referring to the int2_col = int4_val
> problem...

Ah.  I've got that on my to-do list, but I've got no good idea for a
solution yet... see prior traffic...
        regards, tom lane