Обсуждение: BUG #6217: to_char() gives incorrect output for very small float values

Поиск
Список
Период
Сортировка

BUG #6217: to_char() gives incorrect output for very small float values

От
"Chris Gernon"
Дата:
The following bug has been logged online:

Bug reference:      6217
Logged by:          Chris Gernon
Email address:      kabigon@gmail.com
PostgreSQL version: 9.1.0
Operating system:   Windows XP
Description:        to_char() gives incorrect output for very small float
values
Details:

The to_char() function gives incorrect output for float values whose decimal
expansion has several digits (more than somewhere around 14-15) after the
decimal point.

To reproduce:

CREATE TABLE t (
id serial,
f double precision,
CONSTRAINT t_pk PRIMARY KEY (id)
);

INSERT INTO t (f) VALUES (0.0000000000000000000000000000000563219288);

----------------------------------------

SELECT to_char(f,
'FM999990.99999999999999999999999999999999999999999999999999') FROM t WHERE
id = 1;

Expected Output:
0.0000000000000000000000000000000563219288

Actual Output:
0.

----------------------------------------


SELECT to_char(f,
'999990.99999999999999999999999999999999999999999999999999') FROM t WHERE id
= 1;

Expected Output:
     0.00000000000000000000000000000005632192880000000000

Actual Output:
      0.00000000000000

----------------------------------------

Re: BUG #6217: to_char() gives incorrect output for very small float values

От
"Kevin Grittner"
Дата:
"Chris Gernon" <kabigon@gmail.com> wrote:

> The to_char() function gives incorrect output for float values
> whose decimal expansion has several digits (more than somewhere
> around 14-15) after the decimal point.

These are approximate data types.  On what basis do you think the
values returned in your examples are wrong?  The demonstrated
results don't surprise me, given that they match to about the limits
of the approximate data type involved.  It also wouldn't surprise me
to see slightly different results on different architectures or
operating systems.  If you want exact values, you should use a type
which supports that, like numeric.

-Kevin

Re: BUG #6217: to_char() gives incorrect output for very small float values

От
Tom Lane
Дата:
"Chris Gernon" <kabigon@gmail.com> writes:
> CREATE TABLE t (
> id serial,
> f double precision,
> CONSTRAINT t_pk PRIMARY KEY (id)
> );

> INSERT INTO t (f) VALUES (0.0000000000000000000000000000000563219288);

> ----------------------------------------

> SELECT to_char(f,
> 'FM999990.99999999999999999999999999999999999999999999999999') FROM t WHERE
> id = 1;

> Expected Output:
> 0.0000000000000000000000000000000563219288

> Actual Output:
> 0.

My immediate reaction to that is that float8 values don't have 57 digits
of precision.  If you are expecting that format string to do something
useful you should be applying it to a numeric column not a double
precision one.

It's possible that we can kluge things to make this particular case work
like you are expecting, but there are always going to be similar-looking
cases that can't work because the precision just isn't there.

(In a quick look at the code, the reason you just get "0." is that it's
rounding off after 15 digits to ensure it doesn't print garbage.  Maybe
it could be a bit smarter for cases where the value is very much smaller
than 1, but it wouldn't be a simple change.)

            regards, tom lane

Re: BUG #6217: to_char() gives incorrect output for very small float values

От
"Kevin Grittner"
Дата:
Christopher Gernon <kabigon@gmail.com> wrote:

> to_char() should be able to convert 5.6e-32 to text just as easily
> as it can convert 5.6e-3. For some reason, it doesn't.

Oh, I see your point now, and I agree with you.

We should probably at least put this on the TODO list, I think.  Any
objections?

-Kevin

Re: BUG #6217: to_char() gives incorrect output for very small float values

От
Tom Lane
Дата:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Christopher Gernon <kabigon@gmail.com> wrote:
>> to_char() should be able to convert 5.6e-32 to text just as easily
>> as it can convert 5.6e-3. For some reason, it doesn't.

> Oh, I see your point now, and I agree with you.

> We should probably at least put this on the TODO list, I think.  Any
> objections?

If we're gonna fix it, we should just fix it, I think.  I was
considering taking a stab at it, but if someone else would like to,
that's fine too.

One other thing I notice in the same area is that the handling of NaNs
and infinities seems a bit incomplete.  There's an explicit special case
for them in the EEEE-format code path, but not otherwise, and I think
that the results you get for other formats will vary depending on what
the local implementation of snprintf does.  What *should* the output be,
if the input is NaN or Inf?

            regards, tom lane

Re: BUG #6217: to_char() gives incorrect output for very small float values

От
Christopher Gernon
Дата:
On Tue, Sep 20, 2011 at 1:39 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> These are approximate data types. =A0On what basis do you think the
> values returned in your examples are wrong? =A0The demonstrated

Because PostgreSQL still has access to all the significant digits:

test1=3D# SELECT f FROM t WHERE id =3D 1;
        f
-----------------
 5.63219288e-032
(1 row)

Since floats are stored with a significand and an exponent, to_char()
should be able to convert 5.6e-32 to text just as easily as it can
convert 5.6e-3. For some reason, it doesn't.

Re: BUG #6217: to_char() gives incorrect output for very small float values

От
"Kevin Grittner"
Дата:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> Christopher Gernon <kabigon@gmail.com> wrote:
>>> to_char() should be able to convert 5.6e-32 to text just as
>>> easily as it can convert 5.6e-3. For some reason, it doesn't.
>
>> Oh, I see your point now, and I agree with you.
>
>> We should probably at least put this on the TODO list, I think.
>> Any objections?
>
> If we're gonna fix it, we should just fix it, I think.  I was
> considering taking a stab at it, but if someone else would like
> to, that's fine too.

I wouldn't mind doing it, but not until after the CF wraps.  On the
other hand, isn't this is one of those compatibility functions?
Perhaps it would best be done by someone who has familiarity with,
and access to, a database with which we're trying to be compatible.

-Kevin

Re: BUG #6217: to_char() gives incorrect output for very small float values

От
Tom Lane
Дата:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If we're gonna fix it, we should just fix it, I think.  I was
>> considering taking a stab at it, but if someone else would like
>> to, that's fine too.

> I wouldn't mind doing it, but not until after the CF wraps.  On the
> other hand, isn't this is one of those compatibility functions?
> Perhaps it would best be done by someone who has familiarity with,
> and access to, a database with which we're trying to be compatible.

Chris already stated that the case gives the answer he expects in
several other DBs, so I don't seem much need for further compatibility
checking on the "don't round off prematurely" angle.  However, it would
be interesting to know what Oracle etc do with NaN and Infinity,
assuming they even support such numbers.

Currently what our code does for the format-with-EEEE case is to output
"#" in all digit positions.  It would be plausible for that to happen
in the non-EEEE cases too, but whether that's actually what happens in
other systems is something I don't know.

            regards, tom lane

Re: BUG #6217: to_char() gives incorrect output for very small float values

От
"Kevin Grittner"
Дата:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> it would be interesting to know what Oracle etc do with NaN and
> Infinity, assuming they even support such numbers.
>
> Currently what our code does for the format-with-EEEE case is to
> output "#" in all digit positions.  It would be plausible for that
> to happen in the non-EEEE cases too, but whether that's actually
> what happens in other systems is something I don't know.

From a quick web search, it looks like '#' filling is the typical
approach for infinity and NaN.

-Kevin

Re: BUG #6217: to_char() gives incorrect output for very small float values

От
Greg Stark
Дата:
On Tue, Sep 20, 2011 at 8:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> However, it would
> be interesting to know what Oracle etc do with NaN and Infinity,
> assuming they even support such numbers.

Note that it looks like NUMBER cannot store either Infinity or NaN.
They can only occur in BINARY_FLOAT and BINARY_DOUBLE. From the docs:

> If a BINARY_FLOAT or BINARY_DOUBLE value is converted to CHAR or NCHAR, and the input is either infinity or NaN (not
anumber), then Oracle always returns the pound signs to replace the value. 

And testing shows:


SQL> select to_char(cast('NAN' as binary_float), 'FM9999.9999') from dual;

TO_CHAR(CA
----------
##########

SQL> select to_char(cast('-Inf' as binary_float), 'FM9999.9999') from dual;

TO_CHAR(CA
----------
##########

SQL> select to_char(cast('+Inf' as binary_float), 'FM9999.9999') from dual;

TO_CHAR(CA
----------
##########


--
greg

Re: BUG #6217: to_char() gives incorrect output for very small float values

От
Tom Lane
Дата:
Greg Stark <stark@mit.edu> writes:
> On Tue, Sep 20, 2011 at 8:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> However, it would
>> be interesting to know what Oracle etc do with NaN and Infinity,
>> assuming they even support such numbers.

> Note that it looks like NUMBER cannot store either Infinity or NaN.
> They can only occur in BINARY_FLOAT and BINARY_DOUBLE. From the docs:

>> If a BINARY_FLOAT or BINARY_DOUBLE value is converted to CHAR or NCHAR, and the input is either infinity or NaN (not
anumber), then Oracle always returns the pound signs to replace the value. 

> And testing shows:

> SQL> select to_char(cast('NAN' as binary_float), 'FM9999.9999') from dual;

> TO_CHAR(CA
> ----------
> ##########

Hmm, interesting.  They replace the whole field with '#', not just the
digit positions?  Because that's not what is happening in our code at
the moment, for the one case where we consider this at all:

regression=# select to_char('nan'::float8, '9999.9999EEEE');
    to_char
----------------
  ####.########
(1 row)

The EEEE path seems rather broken in some other ways as well:

regression=# select to_char('43.5'::float8, '9999.9999EEEE');
   to_char
-------------
  4.3500e+01
(1 row)

Since I did not say FM, why is it suppressing leading spaces here?

I'm starting to think that that code needs a wholesale rewrite
(not but what that's true of just about every part of formatting.c).

            regards, tom lane