Обсуждение: Problems on NUMERIC

Поиск
Список
Период
Сортировка

Problems on NUMERIC

От
jwieck@debis.com (Jan Wieck)
Дата:
Hi,

    sometimes it's good not to spend too much efford implementing
    the final solution first. So for the NUMERIC.

    First I wonder why the can_coerce... stuff is  #if'd  out  of
    parse_relation.c?     For     the     NUMERIC     type    the
    numeric(num,typmod) must be called if someone does an

        INSERT INTO ... SELECT * FROM ...

    But it isn't. It is only called when there  are  calculations
    done  on the columns. I also checked that for BPCHAR type and
    it simply throws an ERROR  if  the  target's  length  doesn't
    match.

    This  might be easy to fix, but the other problem I have is a
    bit more difficult.

    When binary operators (add, sub, mul, div)  are  called,  the
    required  precision of the result isn't known. And the coerce
    function numeric(num,typmod) will  only  be  called  for  the
    final result. Now have the following situation:

    CREATE TABLE t1 (id int4, annual_val numeric(20,4));
    CREATE TABLE t2 (id int4, monthly_val numeric(24,8));

    INSERT INTO t2 SELECT id, annual_val / '12' FROM t1;

    A  multiplication  would have a maximum number of digits that
    can appear after the decimal point. It is the sum  of  number
    of  digits  present  in  the  two operators. But not so for a
    division.

    If we want to implement NUMERIC with a  real  high  precision
    (maybe  4000  or  more  digits),  there would currently be no
    other chance than to do  the  division  with  the  full  ever
    possible  precision  and  then  throw away most of the digits
    when the result is assigned  to  the  target  column.  Wasted
    efford and more important MUCH WASTED CPU.

    I can think of something like this:

    On add/subtract the results precision after the decimal point
    is the higher of the two operands.

    On multiply the results precision after the decimal point  is
    the sum of the precisions of the two operands.

    On  divide  the  results precision after the decimal point is
    like for mult or the double of the higher  precision  of  the
    two operands. Any other suggestions?

    On the other hand it is possible to do it as

    INSERT INTO t2 SELECT id, ROUND(annual_val,8) / '12' FROM t1;

    How  do  other  databases  handle  this  problem.  How is the
    precision of a numeric result defined?


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [HACKERS] Problems on NUMERIC

От
"Thomas G. Lockhart"
Дата:
>     First I wonder why the can_coerce... stuff is  #if'd  out  of
>     parse_relation.c?

Oh! That looks like my style of #if FALSE, but I can't recall why it is
that way. Will look at it. Does it work to just substitute an #if TRUE?
Perhaps I had it disabled during debugging, but...

>     How  do  other  databases  handle  this  problem.  How is the
>     precision of a numeric result defined?

I've enclosed some snippets from my SQL92 2nd Draft Standard doc. It
gives you a lot of latitude :)
                    - Tom
 Syntax Rules
 1) If the data type of both operands of a dyadic arithmetic opera- tor is exact numeric, then the data type of the
resultis exact numeric, with precision and scale determined as follows:
 
 a) Let S1 and S2 be the scale of the first and second operands    respectively.
 b) The precision of the result of addition and subtraction is    implementation-defined, and the scale is the maximum
ofS1    and S2.
 
 c) The precision of the result of multiplication is implementation-    defined, and the scale is S1 + S2.
 d) The precision and scale of the result of division is    implementation-defined.

<snip large amounts>
 Whenever an exact or approximate numeric value is assigned to a data item or parameter representing an exact numeric
value,an approximation of its value that preserves leading significant dig- its after rounding or truncating is
representedin the data type of the target. The value is converted to have the precision and scale of the target. The
choiceof whether to truncate or round is implementation-defined.
 
 An approximation obtained by truncation of a numerical value N for an <exact numeric type> T is a value V
representablein T such that N is not closer to zero than the numerical value of V and such that the absolute value of
thedifference between N and the numer- ical value of V is less than the absolute value of the difference between two
successivenumerical values representable in T.
 
 An approximation obtained by rounding of a numerical value N for an <exact numeric type> T is a value V representable
inT such that the absolute value of the difference between N and the nu- merical value of V is not greater than half
theabsolute value of the difference between two successive numerical values repre- sentable in T. If there are more
thanone such values V, then it is implementation-defined which one is taken.
 
 All numerical values between the smallest and the largest value, inclusive, representable in a given exact numeric
typehave an approximation obtained by rounding or truncation for that type; it is implementation-defined which other
numericalvalues have such approximations.
 


Re: [HACKERS] Problems on NUMERIC

От
"Thomas G. Lockhart"
Дата:
> >     First I wonder why the can_coerce... stuff is  #if'd  out  of
> >     parse_relation.c?

It looks like the routine where the code appears cannot return a
modified expression, so I just placed this code in there as a marker.
Should be possible to add some code to get this feature working.
                     - Tom


Re: [HACKERS] Problems on NUMERIC

От
jwieck@debis.com (Jan Wieck)
Дата:
>
> >     First I wonder why the can_coerce... stuff is  #if'd  out  of
> >     parse_relation.c?
>
> Oh! That looks like my style of #if FALSE, but I can't recall why it is
> that way. Will look at it. Does it work to just substitute an #if TRUE?
> Perhaps I had it disabled during debugging, but...

    Hmmm  -  elog(ERROR, "Type %s(%d) can be coerced to...  looks
    like debugging code for me. Maybe you wanted to elog(DEBUG...
    ?

>
> >     How  do  other  databases  handle  this  problem.  How is the
> >     precision of a numeric result defined?
>
> I've enclosed some snippets from my SQL92 2nd Draft Standard doc. It
> gives you a lot of latitude :)
>

    Thanks! That helps alot!

>
>   d) The precision and scale of the result of division is
>      implementation-defined.

    I love those definitions :-)

    So I'll make the display scale of a division

        min( max(S1, S2), SLIMIT)

    and the internal result scale

        min( RMINIMUM, max(R1, R2) + 2, RLIMIT)

    where  S1  and S2 are the display scales of the two operands,
    R1 and R2 are the internal present scales and SLIMIT,  RLIMIT
    is  the  implementation-defined  maximum allowed scale at all
    (what about 4000 for SLIMIT ?).  The RMINIMUM is  8  to  have
    anything  at  least computed internal with 8 digits after the
    decimal point (because the defaults for NUMERIC are precision
    30 scale 6).

    If  then  the result is assigned to another tuples attribute,
    numeric(num,typmod) will be called and do the  rounding  with
    the  scale  defined  in typmod. If numeric_out(num) is called
    for it, it will be output rounded to the above display scale.

    With  'round(att1,  500)  /  att2'  someone  can then get the
    result with 500 digits scale.

    This way it  is  flexible  enough  but  not  to  much  wasted
    computing is done.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [HACKERS] Problems on NUMERIC

От
"Thomas G. Lockhart"
Дата:
>     First I wonder why the can_coerce... stuff is  #if'd  out  of
>     parse_relation.c?     For     the     NUMERIC     type    the
>     numeric(num,typmod) must be called if someone does an
> 
>         INSERT INTO ... SELECT * FROM ...
> 
>     But it isn't. It is only called when there  are  calculations
>     done  on the columns. I also checked that for BPCHAR type and
>     it simply throws an ERROR  if  the  target's  length  doesn't
>     match.

Sorry, I'm having trouble thinking of a case which does not behave
properly with the existing types. I've tried inserting varchar(10)
columns into a varchar(1) column, I've tried inserting int columns into
float columns, etc etc. How are you getting handleTargetColname() /
checkTargetTypes() called where it is rejecting things?

It may be that splitting that attribute field into two pieces for
NUMERIC is opening a can of worms, since there are specific assumptions
about what that field means throughout the code :(

Maybe we should think about how to isolate the type-specific
interpretation of that attribute field into a type-specific handler
routine? Ooh, that sounds like a pain...
                        - Tom


Re: [HACKERS] Problems on NUMERIC

От
jwieck@debis.com (Jan Wieck)
Дата:
>
> >     First I wonder why the can_coerce... stuff is  #if'd  out  of
> >     parse_relation.c?     For     the     NUMERIC     type    the
> >     numeric(num,typmod) must be called if someone does an
> >
> >         INSERT INTO ... SELECT * FROM ...
> >
> >     But it isn't. It is only called when there  are  calculations
> >     done  on the columns. I also checked that for BPCHAR type and
> >     it simply throws an ERROR  if  the  target's  length  doesn't
> >     match.
>
> Sorry, I'm having trouble thinking of a case which does not behave
> properly with the existing types. I've tried inserting varchar(10)
> columns into a varchar(1) column, I've tried inserting int columns into
> float columns, etc etc. How are you getting handleTargetColname() /
> checkTargetTypes() called where it is rejecting things?

    pgsql=> create table t1 (a char(10));
    CREATE
    pgsql=> create table t2 (a char(4));
    CREATE
    pgsql=> insert into t2 select * from t1;
    ERROR:  Length of a is not equal to the length of target column a
    pgsql=>

>
> It may be that splitting that attribute field into two pieces for
> NUMERIC is opening a can of worms, since there are specific assumptions
> about what that field means throughout the code :(

    It  doesn't  produce  any  problems  so  far,  only  that the
    function numeric(num,typmod) isn't called when doing a  plain
    INSERT ... SELECT.  It is only called when comparisions where
    performed in the SELECT clause of the INSERT on  the  numeric
    attributes.  But  I  need that call to force the rounding and
    range check at INSERT time.  Otherwise,  the  values  in  the
    target  table  will  output  later  with  the  scale of their
    original source table, and that's wrong.  Also  it  would  be
    possible  to insert 1000.0 into a numeric(5,2) attribute, and
    that shouldn't be.

    Maybe I have to hook for NUMERIC there in parse_relation too.
    Up  to  now I'm compiling the whole thing as loadable module.
    I'll check it that's possible when moving it to the builtins.

    But  in  general I think if there is a function with the same
    name as a  type,  that  take  this  type  plus  another  int4
    argument,  this  must  be a range checker/padder/truncator or
    the like and it should be called before values  are  assigned
    to attributes.

>
> Maybe we should think about how to isolate the type-specific
> interpretation of that attribute field into a type-specific handler
> routine? Ooh, that sounds like a pain...

    Noooooooooo


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [HACKERS] Problems on NUMERIC

От
"Thomas G. Lockhart"
Дата:
> > How are you getting handleTargetColname() /
> > checkTargetTypes() called where it is rejecting things?

OK, I'm not sure why the behavior is different if I explicitly specify
the columns (which of course I had for testing):

postgres=> insert into t2 select a from t1;
INSERT 0 0
postgres=> insert into t2 select * from t1;
ERROR:  Length of 'a' is not equal to the length of target column 'a'

>     It  doesn't  produce  any  problems  so  far,  only  that the
>     function numeric(num,typmod) isn't called when doing a  plain
>     INSERT ... SELECT.

Hmm. Even when you explicitly specify the columns as I did in my example
above? I should be able to get the wildcard example to work sometime
before v6.5, and I *think* that the explicit cases should do what you
want. As a loadable module, your data type will only match itself for
type coersion, but that's what you want for now. When it is built in,
then you will be able to specify that it is higher or lower in a
heirarchy with, for example, int4 and float8.

>     Maybe I have to hook for NUMERIC there in parse_relation too.
>     Up  to  now I'm compiling the whole thing as loadable module.
>     I'll check it that's possible when moving it to the builtins.

Unless you can't find a test case which does work for you, don't bother
looking at it; I'll pick it up some time soon.

>     But  in  general I think if there is a function with the same
>     name as a  type,  that  take  this  type  plus  another  int4
>     argument,  this  must  be a range checker/padder/truncator or
>     the like and it should be called before values  are  assigned
>     to attributes.

That's how it should work afaik, at least for variable-length types. Not
all types are checked for this conversion function...
                     - Tom