Обсуждение: Since '2001-09-09 01:46:40'::timestamp microseconds are lost whenextracting epoch

Поиск
Список
Период
Сортировка

Since '2001-09-09 01:46:40'::timestamp microseconds are lost whenextracting epoch

От
Petr Fedorov
Дата:
Hello,

Steps to reproduce:

select extract(epoch from '2001-09-09 01:46:39.999999'::timestamp)

returns 999999999.999999 as expected

while

select extract(epoch from '2001-09-09 01:46:40.000021'::timestamp)

returns 1000000000.00002 - 1 microsecond is truncated.

Obviously, it is due to the fact that extract epoch returns double
precision which in turn has 15 decimal digits precision.

While there is a pretty simple workaround in C, that returns
microseconds since Unix epoch:

Datum
to_microseconds(PG_FUNCTION_ARGS) {
 Timestamp arg = PG_GETARG_TIMESTAMP(0)+946684800000000;
  PG_RETURN_INT64(arg);
}

I was not able to find the other way of doing that (i.e. without C
function).












Re: Since '2001-09-09 01:46:40'::timestamp microseconds are lost when extracting epoch

От
Tom Lane
Дата:
Petr Fedorov <petr.fedorov@phystech.edu> writes:
> select extract(epoch from '2001-09-09 01:46:40.000021'::timestamp)
> returns 1000000000.00002 - 1 microsecond is truncated.
> Obviously, it is due to the fact that extract epoch returns double
> precision which in turn has 15 decimal digits precision.

I can't get very excited about this.  However, it might be worth
noting that v12 and HEAD print "1000000000.000021" as expected,
thanks to the Ryu float output code.  You can get that from older
branches as well if you set extra_float_digits = 1.

By my arithmetic, IEEE float8 ought to be able to represent
microseconds accurately out to about 285 years either way from the
1970 epoch, so for practical purposes it'll be fine for a long time.

            regards, tom lane



Re: Since '2001-09-09 01:46:40'::timestamp microseconds are lost whenextracting epoch

От
Thomas Munro
Дата:
On Sat, Nov 30, 2019 at 10:28 PM Petr Fedorov <petr.fedorov@phystech.edu> wrote:
> Obviously, it is due to the fact that extract epoch returns double
> precision which in turn has 15 decimal digits precision.

I guess this deviation from the SQL standard ("exact numeric") made
sense when PostgreSQL used double for timestamps, but would break a
lot of queries if we changed it.



Re: Since '2001-09-09 01:46:40'::timestamp microseconds are lost when extracting epoch

От
Tom Lane
Дата:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Sat, Nov 30, 2019 at 10:28 PM Petr Fedorov <petr.fedorov@phystech.edu> wrote:
>> Obviously, it is due to the fact that extract epoch returns double
>> precision which in turn has 15 decimal digits precision.

> I guess this deviation from the SQL standard ("exact numeric") made
> sense when PostgreSQL used double for timestamps, but would break a
> lot of queries if we changed it.

Hmmm ... well, now that you mention it, would it really break things
if we made it return numeric?  There's an implicit cast to float8,
so it seems like queries requiring that type would still work.

There might be a performance-related argument against switching,
perhaps.

            regards, tom lane



Re: Since '2001-09-09 01:46:40'::timestamp microseconds are lost whenextracting epoch

От
Petr Fedorov
Дата:
It appears that extract epoch returns double precision, not float8.  And
the program below seems to be demonstrating that there are enough
'floating-point numbers' as defined by  IEEE-754 to represent
1000000000.000021 precisely enough:

#include <cmath>
#include <iostream>
#include <iomanip>
#include <limits>

int main() {
 double from = 1000000000.000020;
 std::cout << std::setprecision(56) << from  << " (" << std::hexfloat <<
from << ") " << std::endl;
 for(auto i = 0; i < 15; ++i) {
   double to = std::nextafter( from, std::numeric_limits<double>::max());
   std::cout << std::defaultfloat << to << std::hexfloat << " (" << to
<< ") " << std::endl;
   from = to;
 }
}

Outputs:

1000000000.00002002716064453125 (0x1.dcd65000000a8p+29)
1000000000.00002014636993408203125 (0x1.dcd65000000a9p+29)
1000000000.0000202655792236328125 (0x1.dcd65000000aap+29)
1000000000.00002038478851318359375 (0x1.dcd65000000abp+29)
1000000000.000020503997802734375 (0x1.dcd65000000acp+29)
1000000000.00002062320709228515625 (0x1.dcd65000000adp+29)
1000000000.0000207424163818359375 (0x1.dcd65000000aep+29)
1000000000.00002086162567138671875 (0x1.dcd65000000afp+29)
1000000000.0000209808349609375 (0x1.dcd65000000bp+29)
1000000000.00002110004425048828125 (0x1.dcd65000000b1p+29)
1000000000.0000212192535400390625 (0x1.dcd65000000b2p+29)
1000000000.00002133846282958984375 (0x1.dcd65000000b3p+29)
1000000000.000021457672119140625 (0x1.dcd65000000b4p+29)
1000000000.00002157688140869140625 (0x1.dcd65000000b5p+29)
1000000000.0000216960906982421875 (0x1.dcd65000000b6p+29)
1000000000.00002181529998779296875 (0x1.dcd65000000b7p+29)

I'm not an expert in floating point math but hopefully it means that no
type change is required - double precision can handle it. 

And since it works correctly on v12 for this particular date may be all
what is needed it to verify that it works for the other dates too! For
example what was changed in v12 (comparing to 11.6 I use) so extract
epoch works correctly?


02.12.2019 01:59, Tom Lane пишет:
> Thomas Munro <thomas.munro@gmail.com> writes:
>> On Sat, Nov 30, 2019 at 10:28 PM Petr Fedorov <petr.fedorov@phystech.edu> wrote:
>>> Obviously, it is due to the fact that extract epoch returns double
>>> precision which in turn has 15 decimal digits precision.
>> I guess this deviation from the SQL standard ("exact numeric") made
>> sense when PostgreSQL used double for timestamps, but would break a
>> lot of queries if we changed it.
> Hmmm ... well, now that you mention it, would it really break things
> if we made it return numeric?  There's an implicit cast to float8,
> so it seems like queries requiring that type would still work.
>
> There might be a performance-related argument against switching,
> perhaps.
>
>             regards, tom lane




Re: Since '2001-09-09 01:46:40'::timestamp microseconds are lost whenextracting epoch

От
Thomas Munro
Дата:
On Tue, Dec 3, 2019 at 12:08 AM Petr Fedorov <petr.fedorov@phystech.edu> wrote:
> It appears that extract epoch returns double precision, not float8.  And
> the program below seems to be demonstrating that there are enough
> 'floating-point numbers' as defined by  IEEE-754 to represent
> 1000000000.000021 precisely enough:

Double precision and float8 are different names for the same type in PostgreSQL.

> I'm not an expert in floating point math but hopefully it means that no
> type change is required - double precision can handle it.

Me neither, but the SQL standard requires us to use an exact numeric
type, so it's wrong on that level by definition.

It's also wrong because binary floating point numbers can't represent
0.000001 (one microsecond represented as seconds) exactly, and that's
our unit of counting for timestamps.  You can get pretty far by
thinking of the decimal number you see on the screen as the true
number and the double as a fuzzy internal storage or transport that
does the job just fine due to the round trip conversion guarantee
provided by DBL_DIG, but the double is still going to have the wrong
value in some cases.  As soon as you start doing any arithmetic or
comparisons with the double directly, interesting things can start to
happen to make the error visible and break things; for example
0.1::float8 + 0.2::float8 = 0.3::float8 is false.

> And since it works correctly on v12 for this particular date may be all
> what is needed it to verify that it works for the other dates too! For
> example what was changed in v12 (comparing to 11.6 I use) so extract
> epoch works correctly?

PostgreSQL 12 adopted a different algorithm[1] for converting float8
to text that can affect how many digits are shown, as Tom explained.
The manual has some notes about it[2].

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=02ddd499322ab6f2f0d58692955dc9633c2150fc
[2] https://www.postgresql.org/docs/12/datatype-numeric.html#DATATYPE-FLOAT