Re: Fix for edge case in date_bin() function

Поиск
Список
Период
Сортировка
От Moaaz Assali
Тема Re: Fix for edge case in date_bin() function
Дата
Msg-id CALkF+nsG9E39AQ-CJti9=GnD-S8NjGzQ=twKbVxuS+sygdU-2Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Fix for edge case in date_bin() function  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Fix for edge case in date_bin() function  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Yeah you are right about the integer overflow.

Consider this query: select date_bin('15 minutes'::interval, timestamp '294276-12-30 10:24:00', timestamp '4000-12-20 23:00:00 BC');

It will return 294276-12-30 10:31:49.551616 when it should be 294276-12-30 10:15:00, which happens because source timestamp is close to INT64_MAX and origin timestamp is negative, causing an integer overflow.
So, the subsequent calculations are wrong.

I fixed the integer overflow and original bug and added test cases in my 3 patches in this reply below:

v2-0001: 
- Fixed both the original bug discussed and the integer overflow issues (and used your suggestion to store the modulo result).
- Any timestamp used will output the correct date_bin() result.
- I have used INT64 -> UINT64 mapping in order to ensure no integer overflows are possible.
- The only additional cost is 3 subtractions/addition to do the INT64 -> UINT64 mapping for timestamp, origin and final result.
- It is probably possible to tackle the integer overflow issue without casting but, from my attempts, it seemed much more convoluted and complex.
- Implemented the fix for both timestamp_bin() and timestamptz_bin().

v2-0002: 
- Added multiple test cases in timestamp.sql for integer overflow by testing with timestamps around INT64_MIN and INT64_MAX.
- Added test case for the original bug where source timestamp is a valid binned date already and does not require the additional stride interval subtraction.

v2-0003:
- Exactly the same as v2-0002 but for timestamptz.sql


Also, I would like to create a new patch on the 2024-03 commitfest, but since I just created my account yesterday I get this error:
"The site you are trying to log in to (commitfest.postgresql.org) requires a cool-off period between account creation and logging in. You have not passed the cool off period yet."

How long is the cool off period, so that I can create a new patch in the commitfest before submissions close after tomorrow.
Alternatively, is it possible for someone to open a new patch on my behalf linking this email thread, so it can be added to the 2024-03 commitfest?


Best regards,
Moaaz Assali


On Tue, Feb 27, 2024 at 11:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
> Hmm, yeah.  The "stride_usecs > 1" test seems like it's a partial
> attempt to account for this that is probably redundant given the
> additional condition.  Also, can we avoid computing tm_diff %
> stride_usecs twice?  Maybe the compiler is smart enough to remove the
> common subexpression, but it's a mighty expensive computation if not.

I think we could do it like this:

    tm_diff = timestamp - origin;
    tm_modulo = tm_diff % stride_usecs;
    tm_delta = tm_diff - tm_modulo;
    /* We want to round towards -infinity when tm_diff is negative */
    if (tm_modulo < 0)
        tm_delta -= stride_usecs;

Excluding tm_modulo == 0 from the correction is what fixes the
problem.

> I'm also itching a bit over whether there are integer-overflow
> hazards here.  Maybe the range of timestamp is constrained enough
> that there aren't, but I didn't look hard.

Hmm, it's not.  For instance this triggers the overflow check in
timestamp_mi:

regression=# select '294000-01-01'::timestamp - '4714-11-24 00:00:00 BC';
ERROR:  interval out of range
regression=# \errverbose
ERROR:  22008: interval out of range
LOCATION:  timestamp_mi, timestamp.c:2832

So we ought to guard the subtraction that produces tm_diff similarly.
I suspect it's also possible to overflow int64 while computing
stride_usecs.

                        regards, tom lane
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Relation bulk write facility
Следующее
От: Vladimir Sitnikov
Дата:
Сообщение: Re: When extended query protocol ends?