Re: [HACKERS] Patch: Avoid precision error in to_timestamp().

Поиск
Список
Период
Сортировка
От Erik Nordström
Тема Re: [HACKERS] Patch: Avoid precision error in to_timestamp().
Дата
Msg-id CAHuQZDQspkizCX9eiMaxV6hV66c_cvXE_hCVMVU2fCb2ZiWBuA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Patch: Avoid precision error in to_timestamp().  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Hi Tom, and others,

Thanks for the insightful feedback. You are right, the patch does suffer from overflow (and other possible issues when I think about it). Using rint(), as you suggest, helps in my original example case, but there are still cases when the output is not what you would expect. For instance, converting the Unix time 14864803242.312311 gives back the timestamp "2441-01-17 09:00:42.312312+01", even if using rint() (see below).

I guess precision-related errors are unavoidable when working with floating point numbers (especially large ones). But I am wondering if it is not desirable to avoid (or reduce) errors at least in the case when the input value can be accurately represented to the microsecond by a float (i.e., when rounded to microsecond, as in printf)? For example, splitting up the float into second and microsecond integers might help reduce errors, especially with large numbers. Something like this:

    ts_seconds = (int64)seconds;
    ts_microseconds = (int64)rint((seconds - ts_seconds) * USECS_PER_SEC);
    result = (ts_seconds - epoch_diff_seconds) * USECS_PER_SEC + ts_microseconds;

Note that stripping of the full seconds before scaling the microsecond fractional part will keep it more accurate, and it should solve the problem for the case above, including overflow.

Or, maybe I am overthinking this and the only proper solution is to provide a to_timestamp() that takes a microsecond integer? This certainly wouldn't hurt in any case an makes sense since the timestamp is itself an integer internally.

Anyway, I am attaching an updated version of the path. Below are some example outputs (including min and max allowed input) from original Postgres, with your rint() fix, and the included patch:

Original postgres:

test=# select to_timestamp(9224318015999);
          to_timestamp
---------------------------------
 294277-01-01 00:59:58.999552+01
(1 row)

test=# select to_timestamp(-210866803200);
          to_timestamp
---------------------------------
 4714-11-24 01:12:12+01:12:12 BC
(1 row)

test=# select to_timestamp(1486480324.236538);
         to_timestamp
-------------------------------
 2017-02-07 16:12:04.236537+01
(1 row)

test=# select to_timestamp(14864803242.312311);
         to_timestamp
-------------------------------
 2441-01-17 09:00:42.312312+01
(1 row)


With rint():

test=# select to_timestamp(9224318015999);
          to_timestamp
---------------------------------
 294277-01-01 00:59:58.999552+01
(1 row)

test=# select to_timestamp(-210866803200);
          to_timestamp
---------------------------------
 4714-11-24 01:12:12+01:12:12 BC
(1 row)

test=# select to_timestamp(1486480324.236538);
         to_timestamp
-------------------------------
 2017-02-07 16:12:04.236538+01
(1 row)

test=# select to_timestamp(14864803242.312311);
         to_timestamp
-------------------------------
 2441-01-17 09:00:42.312312+01
(1 row)


Included patch:

test=# select to_timestamp(9224318015999);
       to_timestamp
--------------------------
 294277-01-01 00:59:59+01
(1 row)

test=# select to_timestamp(-210866803200);
          to_timestamp
---------------------------------
 4714-11-24 01:12:12+01:12:12 BC
(1 row)

test=# select to_timestamp(1486480324.236538);
         to_timestamp
-------------------------------
 2017-02-07 16:12:04.236538+01
(1 row)

test=# select to_timestamp(14864803242.312311);
         to_timestamp
-------------------------------
 2441-01-17 09:00:42.312311+01
(1 row)


--Erik


On Wed, Feb 8, 2017 at 9:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
> I wonder if we could make things better just by using rint() rather than
> a naive cast-to-integer.  The cast will truncate not round, and I think
> that might be what's mostly biting you.  Does this help for you?

> #ifdef HAVE_INT64_TIMESTAMP
> -             result = seconds * USECS_PER_SEC;
> +             result = rint(seconds * USECS_PER_SEC);
> #else

I poked around looking for possible similar issues elsewhere, and found
that a substantial majority of the datetime-related code already uses
rint() when trying to go from float to int representations.  As far as
I can find, this function and make_interval() are the only ones that
fail to do so.  So I'm now thinking that this is a clear oversight,
and both those places need to be patched to use rint().

                        regards, tom lane

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ashutosh Sharma
Дата:
Сообщение: Re: [HACKERS] pageinspect: Hash index support
Следующее
От: David Fetter
Дата:
Сообщение: Re: [HACKERS] One-shot expanded output in psql using \gx