From
Vitaly Burovoy
Date:
Hackers,

I've just found a little bug: extracting "epoch" from the last 30
years before Postgres' "+Infinity" leads to an integer overflow:

postgres=# SELECT x::timestamptz, extract(epoch FROM x::timestamptz)
postgres-# FROM
postgres-# (VALUES
postgres(#   ('294247-01-10 04:00:54.775805'),
postgres(#   ('294247-01-10 04:00:55.775806'),
postgres(#   ('294277-01-09 04:00:54.775806'), -- the last value before 'Inf'
postgres(#   ('294277-01-09 04:00:54.775807')  -- we've discussed, it should be fixed
postgres(# ) as t(x);
                x                |     date_part
---------------------------------+-------------------
 294247-01-10 04:00:54.775805+00 |  9223372036854.78
 294247-01-10 04:00:55.775806+00 | -9223372036853.78
 294277-01-09 04:00:54.775806+00 | -9222425352054.78
 infinity                        |          Infinity
(4 rows)

With the attached patch it becomes positive:
                x                |    date_part
---------------------------------+------------------
 294247-01-10 04:00:54.775805+00 | 9223372036854.78
 294247-01-10 04:00:55.775806+00 | 9223372036855.78
 294277-01-09 04:00:54.775806+00 | 9224318721654.78
 infinity                        |         Infinity
(4 rows)
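
To make the mechanism visible outside the server, here is a minimal
standalone C sketch (the epoch constant is hand-computed from the
2000-01-01 origin; nothing below is copied from the PostgreSQL source)
of how the int64 subtraction wraps negative:

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    /* PostgreSQL stores timestamps as int64 microseconds since
     * 2000-01-01; 1970-01-01 on that scale is negative
     * (hand-computed: -10957 days * 86400 s * 1e6 usec). */
    int64_t epoch_ts = INT64_C(-946684800) * 1000000;

    /* a timestamp in the last ~30 years before the int64 limit */
    int64_t ts = INT64_MAX - 1000;

    /* extracting "epoch" effectively computes ts - epoch_ts; near the
     * upper limit that exceeds INT64_MAX and the machine wraps it
     * (done via uint64_t here because signed overflow is UB in C) */
    int64_t wrapped = (int64_t) ((uint64_t) ts - (uint64_t) epoch_ts);
    printf("int64 subtraction:  %" PRId64 "\n", wrapped);    /* negative */

    /* doing the subtraction in double, as the patch does in spirit,
     * keeps the sign (float8 has already lost usec precision here) */
    printf("double subtraction: %.2f\n",
           ((double) ts - (double) epoch_ts) / 1000000.0);   /* positive */
    return 0;
}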

--
Best regards,
Vitaly Burovoy

Attachments

Re: Integer overflow in timestamp_part()

From
Tom Lane
Date:
[ Please use a useful Subject: line in your posts. ]

Vitaly Burovoy <vitaly.burovoy@gmail.com> writes:
> I've just found a little bug: extracting "epoch" from the last 30
> years before Postgres' "+Infinity" leads to an integer overflow:

Hmm.  I do not like the proposed patch much: it looks like it's
throwing away precision too soon, although given that the result of
SetEpochTimestamp can be cast to float exactly, maybe it doesn't matter.

More importantly, I seriously doubt that this is the only issue
for timestamps very close to the INT64_MAX boundary.  An example is
that we're not rejecting values that would correspond to DT_NOBEGIN
or DT_NOEND:

regression=# set timezone = 'PST8PDT';
SET
regression=# select '294277-01-08 20:00:54.775806-08'::timestamptz;
           timestamptz
---------------------------------
 294277-01-08 20:00:54.775806-08
(1 row)

regression=# select '294277-01-08 20:00:54.775807-08'::timestamptz;
 timestamptz
-------------
 infinity
(1 row)

regression=# select '294277-01-08 20:00:54.775808-08'::timestamptz;
 timestamptz
-------------
 -infinity
(1 row)

regression=# select '294277-01-08 20:00:54.775809-08'::timestamptz;
ERROR:  timestamp out of range

Worse yet, that last error is coming from timestamptz_out, not
timestamptz_in; we're accepting a value we cannot store properly.
The converted value has actually overflowed to be equal to
INT64_MIN+1, and then timestamptz_out barfs because it's before
Julian day 0.  Other operations would incorrectly interpret it
as a date in the very far past.  timestamptz_in doesn't throw an
error until several hours later than this; it looks like the
problem is that tm2timestamp() worries about overflow in initially
calculating the converted value, but not about overflow in the
dt2local() rotation, and in any case it doesn't worry about not
producing DT_NOEND.
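
For illustration, an overflow check of the kind the rotation is missing
can be written like this (a standalone sketch; the helper name and the
values are mine, not PostgreSQL's):

#include <stdbool.h>
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

/* add two int64 values, reporting instead of wrapping on overflow */
static bool add_s64_checked(int64_t a, int64_t b, int64_t *sum)
{
    if ((b > 0 && a > INT64_MAX - b) || (b < 0 && a < INT64_MIN - b))
        return false;           /* would overflow */
    *sum = a + b;
    return true;
}

int main(void)
{
    int64_t ts = INT64_MAX - 1000;             /* just under the limit */
    int64_t tz = INT64_C(8) * 3600 * 1000000;  /* -08 input rotated to UTC */
    int64_t utc;

    if (!add_s64_checked(ts, tz, &utc))
        puts("timestamp out of range");        /* what _in should report */
    else
        printf("%" PRId64 "\n", utc);
    return 0;
}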

I'm inclined to think that a good solution would be to create an
artificial restriction to not accept years beyond, say, 100000 AD.
That would leave us with a lot of daylight to not have to worry
about corner-case overflows in timestamp arithmetic.  I'm not sure
though where we'd need to enforce such a restriction; certainly in
timestamp[tz]_in, but where else?
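
To sketch what such a restriction might look like (the names and the
cutoff constant below are hypothetical placeholders, not committed code):

#include <stdbool.h>
#include <stdint.h>

typedef int64_t Timestamp;   /* usec since 2000-01-01, as in PostgreSQL */

#define USECS_PER_DAY    INT64_C(86400000000)

/* Julian day 0 (4714-11-24 BC) relative to 2000-01-01 */
#define MIN_TIMESTAMP    (INT64_C(-2451545) * USECS_PER_DAY)

/* roughly 100000 AD: ~98000 years of ~365.2425 days, hand-computed;
 * a placeholder, not real Julian-day arithmetic */
#define END_TIMESTAMP    (INT64_C(35793765) * USECS_PER_DAY)

/* reject anything outside the artificial range, leaving plenty of
 * headroom below INT64_MAX for later arithmetic */
static inline bool
timestamp_in_range(Timestamp t)
{
    return t >= MIN_TIMESTAMP && t < END_TIMESTAMP;
}

Anything that can construct a timestamp internally (casts,
timestamp[tz]_recv, interval arithmetic) would need to pass through the
same check, which is exactly the "where else?" question.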
        regards, tom lane



Re: Integer overflow in timestamp_part()

From
Jim Nasby
Date:
On 2/2/16 6:39 PM, Tom Lane wrote:
> I'm inclined to think that a good solution would be to create an
> artificial restriction to not accept years beyond, say, 100000 AD.
> That would leave us with a lot of daylight to not have to worry
> about corner-case overflows in timestamp arithmetic.  I'm not sure
> though where we'd need to enforce such a restriction; certainly in
> timestamp[tz]_in, but where else?

Probably some of the casts (I'd think at least timestamp->timestamptz). 
Maybe timestamp[tz]_recv. Most of the time*pl* functions. :/
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: Integer overflow in timestamp_part()

From
Vitaly Burovoy
Date:
On 2/2/16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> [ Please use a useful Subject: line in your posts. ]

I'm so sorry, it was the first time I had forgotten to look at the
"Subject" field before I pressed the "Send" button.

> Vitaly Burovoy <vitaly.burovoy@gmail.com> writes:
>> I've just found a little bug: extracting "epoch" from the last 30
>> years before Postgres' "+Infinity" leads to an integer overflow:
>
> Hmm.  I do not like the proposed patch much: it looks like it's
> throwing away precision too soon, although given that the result of
> SetEpochTimestamp can be cast to float exactly, maybe it doesn't matter.
>
> More importantly, I seriously doubt that this is the only issue
> for timestamps very close to the INT64_MAX boundary.  An example is
> that we're not rejecting values that would correspond to DT_NOBEGIN
> or DT_NOEND:
>
> regression=# set timezone = 'PST8PDT';
> SET
> regression=# select '294277-01-08 20:00:54.775806-08'::timestamptz;
>            timestamptz
> ---------------------------------
>  294277-01-08 20:00:54.775806-08
> (1 row)
>
> regression=# select '294277-01-08 20:00:54.775807-08'::timestamptz;
>  timestamptz
> -------------
>  infinity
> (1 row)
>
> regression=# select '294277-01-08 20:00:54.775808-08'::timestamptz;
>  timestamptz
> -------------
>  -infinity
> (1 row)
>
> regression=# select '294277-01-08 20:00:54.775809-08'::timestamptz;
> ERROR:  timestamp out of range
>
> Worse yet, that last error is coming from timestamptz_out, not
> timestamptz_in; we're accepting a value we cannot store properly.
> The converted value has actually overflowed to be equal to
> INT64_MIN+1, and then timestamptz_out barfs because it's before
> Julian day 0.  Other operations would incorrectly interpret it
> as a date in the very far past.  timestamptz_in doesn't throw an
> error until several hours later than this; it looks like the
> problem is that tm2timestamp() worries about overflow in initially
> calculating the converted value, but not about overflow in the
> dt2local() rotation, and in any case it doesn't worry about not
> producing DT_NOEND.

It is clear why it happens, and I was already planning to insert checks
there, following the thread[1].

> I'm inclined to think that a good solution would be to create an
> artificial restriction to not accept years beyond, say, 100000 AD.

Well... We can limit it to the boundaries described on the
documentation page[2]: [4713 BC, 294276 AD].
That lets us be sure we will not break timestamps that are stored (for
whatever reason) up to the documented limits (values meant as infinity,
but not stored exactly as 'infinity'::timestamptz).

Currently the boundaries for timestamp[tz] are [4714-11-24+00 BC,
294277-01-09 04:00:54.775806+00]. One month of slack at the lower
boundary and nine days at the upper one should be enough to represent a
value as int64 before applying the time zone offset (+-15 hours) and to
check the boundaries without overflow.
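
In numbers (a trivial standalone check of that slack; constants
hand-computed):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* slack between documented and internal bounds, in microseconds */
    int64_t upper_slack = INT64_C(9)  * 24 * 3600 * 1000000;  /* ~9 days  */
    int64_t lower_slack = INT64_C(30) * 24 * 3600 * 1000000;  /* ~1 month */
    int64_t tz_swing    = INT64_C(15) * 3600 * 1000000;       /* +-15 h   */

    /* 777600000000 and 2592000000000 vs 54000000000: both slacks dwarf
     * the largest possible time zone rotation, so a value parsed at the
     * documented limits can be rotated in int64 without wrapping */
    printf("upper %" PRId64 ", lower %" PRId64 ", tz %" PRId64 "\n",
           upper_slack, lower_slack, tz_swing);
    return 0;
}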

> That would leave us with a lot of daylight to not have to worry
> about corner-case overflows in timestamp arithmetic.

Great! That was going to be my next question, because I was struggling
to find a good corner case for an internal version of the
"to_timestamp" function (my not-yet-published WIP patch), which accepts
+-Infinity::float8 as input to be symmetric with the current
"extract('epoch' ...)"[3].

The exact value for the currently allowed maximum ("294277-01-09
04:00:54.775806+00") should be 9224318721654.775806, but it cannot be
represented as float8. My experiments show there is a "big" gap of
about two milliseconds (~0.002, not 0.000001) between adjacent
representable values:
         src          |     representation
----------------------+-------------------------
 9224318721654.774414 | 9224318721654.7734375
 9224318721654.774415 | 9224318721654.775390625
 9224318721654.776367 | 9224318721654.775390625
 9224318721654.776368 | 9224318721654.77734375
 9224318721654.778320 | 9224318721654.77734375
 9224318721654.778321 | 9224318721654.779296875
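That gap is exactly the float8 step at this magnitude: the value lies
between 2^43 and 2^44, so one ulp is 2^(43-52) = 2^-9 s ~= 0.00195 s.
A standalone check:

#include <math.h>
#include <stdio.h>

int main(void)
{
    /* the exact boundary value rounds to the nearest double... */
    double x  = 9224318721654.775806;
    /* ...and the next representable double is one ulp away */
    double up = nextafter(x, INFINITY);

    printf("stored: %.9f\n", x);        /* 9224318721654.775390625 */
    printf("next:   %.9f\n", up);       /* 9224318721654.777343750 */
    printf("ulp:    %.9f\n", up - x);   /* 0.001953125 = 2^-9      */
    return 0;
}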

So if it were possible to set the upper limit exactly at a year
boundary, it would save a lot of nerves.

> I'm not sure
> though where we'd need to enforce such a restriction; certainly in
> timestamp[tz]_in, but where else?
>
>             regards, tom lane

What should we do with dates: [4713 BC, 5874897 AD]? Limit them to the
timestamp boundaries, or leave them as is and forbid the conversion
when they don't fit into timestamps?

There is also trouble with intervals. Currently it is impossible to do
the following (even without the overflow on extraction):

postgres=# select extract (epoch from '294277-01-09 04:00:54.775806+00'::timestamptz);
    date_part
------------------
 9224318721654.78
(1 row)

postgres=# select to_timestamp(9224318721654.775390625); -- because ..654.78 is rounded up; but we know the exact value
          to_timestamp
---------------------------------
 294277-01-09 04:00:54.77539+00
(1 row)

... but instead of that result, you get the error:

postgres=# select to_timestamp(9224318721654.775390625);
ERROR:  interval out of range
CONTEXT:  SQL function "to_timestamp" statement 1

... at the operation "$1 * '1 second'::pg_catalog.interval" in the
pg_catalog.to_timestamp function:

postgres=# select 9224318721654.775390625 * '1 second'::pg_catalog.interval;
ERROR:  interval out of range

... even with a valid 64-bit signed integer:

postgres=# select 9223372036000 * '1 second'::pg_catalog.interval;
ERROR:  interval out of range

postgres=# select 7730941132799.99951 * '1 second'::interval as max_interval_sec, to_timestamp(7730941132799.99951);
    max_interval_sec     |          to_timestamp
-------------------------+---------------------------------
 2147483647:59:59.998976 | 246953-10-09 07:59:59.999023+00
(1 row)

despite the fact that the documentation[2] defines the range of
intervals as [-178000000, 178000000] years (which doesn't even fit into
dates) with a resolution of "1 microsecond / 14 digits"... I know the
cause ((7730941132799+1)/3600 = 2^31), but it looks like the
documentation lacks a Note about it.
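
Spelled out (a trivial standalone check; reading the hours component as
a 32-bit quantity is my interpretation of that cause):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* the largest printable interval in the example above showed
     * 2147483647 hours, i.e. INT32_MAX; one more second pushes the
     * hour count to 2^31, which no longer fits a signed 32-bit field */
    int64_t max_secs = INT64_C(7730941132799);

    printf("(max+1)/3600 = %" PRId64 "\n", (max_secs + 1) / 3600); /* 2147483648 */
    printf("2^31         = %" PRId64 "\n", INT64_C(1) << 31);      /* 2147483648 */
    printf("INT32_MAX    = %d\n", INT32_MAX);                      /* 2147483647 */
    return 0;
}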

===
[1] http://www.postgresql.org/message-id/flat/CAKOSWNmLjs_2EyS4z_bDkp=RfU8s2M2P8joZVt18h+wfszWz7A@mail.gmail.com
[2] http://www.postgresql.org/docs/devel/static/datatype-datetime.html
[3] http://git.postgresql.org/pg/commitdiff/647d87c56ab6da70adb753c08d7cdf7ee905ea8a

-- 
Best regards,
Vitaly Burovoy



Re:

From
Bruce Momjian
Date:
On Sat, Jan 30, 2016 at 11:06:10PM -0800, Vitaly Burovoy wrote:
> Hackers,
> 
> I've just found a little bug: extracting "epoch" from the last 30
> years before Postgres' "+Infinity" leads to an integer overflow:
> 
> postgres=# SELECT x::timestamptz, extract(epoch FROM x::timestamptz)
> postgres-# FROM
> postgres-# (VALUES
> postgres(#   ('294247-01-10 04:00:54.775805'),
> postgres(#   ('294247-01-10 04:00:55.775806'),
> postgres(#   ('294277-01-09 04:00:54.775806'), -- the last value before 'Inf'
> postgres(#   ('294277-01-09 04:00:54.775807')  -- we've discussed, it should be fixed
> postgres(# ) as t(x);
>                 x                |     date_part
> ---------------------------------+-------------------
>  294247-01-10 04:00:54.775805+00 |  9223372036854.78
>  294247-01-10 04:00:55.775806+00 | -9223372036853.78
>  294277-01-09 04:00:54.775806+00 | -9222425352054.78
>  infinity                        |          Infinity
> (4 rows)
> 
> With the attached patch it becomes positive:
>                 x                |    date_part
> ---------------------------------+------------------
>  294247-01-10 04:00:54.775805+00 | 9223372036854.78
>  294247-01-10 04:00:55.775806+00 | 9223372036855.78
>  294277-01-09 04:00:54.775806+00 | 9224318721654.78
>  infinity                        |         Infinity
> (4 rows)

FYI, in 9.6 this will return an error:

test=> SELECT x::timestamptz, extract(epoch FROM x::timestamptz)
test-> FROM
test-> (VALUES
test(>   ('294247-01-10 04:00:54.775805'),
test(>   ('294247-01-10 04:00:55.775806'),
test(>   ('294277-01-09 04:00:54.775806'), -- the last value before 'Inf'
test(>   ('294277-01-09 04:00:54.775807')  -- we've discussed, it should be fixed
test(> ) as t(x);
ERROR:  timestamp out of range: "294277-01-09 04:00:54.775806"

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +