INTERVAL overflow detection is terribly broken

Поиск
Список
Период
Сортировка
От Rok Kralj
Тема INTERVAL overflow detection is terribly broken
Дата
Msg-id CAMWF=HS++N9-NKsh-o5QSymvp0Np-VB0GdWwAt4uie1h8ZdNTQ@mail.gmail.com
обсуждение исходный текст
Ответы Re: INTERVAL overflow detection is terribly broken  (Rok Kralj <rok.kralj@gmail.com>)
Re: INTERVAL overflow detection is terribly broken  (Bruce Momjian <bruce@momjian.us>)
Список pgsql-hackers
Hi, after studying ITERVAL and having a long chat with RhoidumToad and StuckMojo on #postgresql, I am presenting you 3 bugs regarding INTERVAL.

As far as I understand, the Interval struct (binary internal representation) consists of:

int32 months
int32 days
int64 microseconds

1. OUTPUT ERRORS: Since 2^63 microseconds equals 2,562,047,788 > 2^31 hours, the overflow in pg_tm when displaying the value causes overflow. The value of Interval struct is actually correct, error happens only on displaying it.

SELECT INTERVAL '2147483647 hours' + INTERVAL '5 hours'
"-2147483644:00:00"

Even wireder:

SELECT INTERVAL '2147483647 hours' + '1 hour'
"--2147483648:00:00"

notice the double minus? Don't ask how I came across this two bugs.

2. OPERATION ERRORS: When summing two intervals, the user is not notified when overflow occurs:

SELECT INT '2147483647' + INT '1'
ERROR: integer out of range

SELECT INTERVAL '2147483647 days' + INTERVAL '1 day'
"-2147483648 days"

This should be analogous.

3. PARSER / INPUT ERRORS:

This is perhaps the hardest one to explain, since this is an architectural flaw. You are checking the overflows when parsing string -> pg_tm struct. However, at this point, the parser doesn't know, that weeks and days are going to get combined, or years are going to get converted to months, for example.

Unawarness of underlying Interval struct causes two types of suberrors:

a) False positive

SELECT INTERVAL '2147483648 microseconds'
ERROR:  interval field value out of range: "2147483648 microseconds"

This is not right. Microseconds are internally stored as 64 bit signed integer. The catch is: this amount of microseconds is representable in Interval data structure, this shouldn't be an error.

b) False negative

SELECT INTERVAL '1000000000 years'
"-73741824 years"

We don't catch errors like this, because parser only checks for overflow in pg_tm. If overflow laters happens in Interval, we don't seem to care.

4. POSSIBLE SOLUTIONS:

a) Move the overflow checking just after the conversion of pg_tm -> Interval is made. This way, you can accurately predict if the result is really not store-able.

b) Because of 1), you have to declare tm_hour as int64, if you want to use that for the output. But, why not use Interval struct for printing directly, without intermediate pg_tm?

5. Offtopic: Correct the documentation, INTERVAL data size is 16 bytes, not 12.

Rok Kralj

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: A better way than tweaking NTUP_PER_BUCKET
Следующее
От: Stephen Frost
Дата:
Сообщение: Re: A better way than tweaking NTUP_PER_BUCKET