Re: truncating timestamps on arbitrary intervals

Поиск
Список
Период
Сортировка
От John Naylor
Тема Re: truncating timestamps on arbitrary intervals
Дата
Msg-id CACPNZCuxeG1mP-6=qGJcMpHa8SmHaTXK-zXC8w19HWnu1N-zqA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: truncating timestamps on arbitrary intervals  (Isaac Morland <isaac.morland@gmail.com>)
Список pgsql-hackers
On Fri, Mar 13, 2020 at 7:48 PM Isaac Morland <isaac.morland@gmail.com> wrote:
>
> On Fri, 13 Mar 2020 at 03:13, John Naylor <john.naylor@2ndquadrant.com> wrote:
>
>> - align weeks to start on Sunday
>> select date_trunc_interval('7 days'::interval, TIMESTAMP '2020-02-11
>> 01:01:01.0', TIMESTAMP '1900-01-02');
>>  date_trunc_interval
>> ---------------------
>>  2020-02-09 00:00:00
>> (1 row)
>
>
> I'm confused by this. If my calendars are correct, both 1900-01-02 and 2020-02-11 are Tuesdays. So if the date being
adjustedand the origin are both Tuesday, shouldn't the day part be left alone when truncating to 7 days?
 

Thanks for taking a look! The non-intuitive behavior you found is
because the patch shifts the timestamp before converting to the
internal pg_tm type. The pg_tm type stores day of the month, which is
used for the calculation. It's not counting the days since the origin.
Then the result is shifted back.

To get more logical behavior, perhaps the optional parameter is better
as an offset instead of an origin. Alternatively (or additionally),
the function could do the math on int64 timestamps directly.

> Also, I'd like to confirm that the default starting point for 7 day periods (weeks) is Monday, per ISO.

That's currently the behavior in the existing date_trunc function,
when passed the string 'week'. Given that keyword, it calculates the
week of the year.

When using the proposed function with arbitrary intervals, it uses day
of the month, as found in the pg_tm struct. It doesn't treat 7 days
differently then 5 or 10 without user input (origin or offset), since
there is nothing special about 7 day intervals as such internally. To
show the difference between date_trunc, and date_trunc_interval as
implemented in v5 with no origin:

select date_trunc('week', d), count(*) from generate_series(
'2020-02-01'::timestamp, '2020-03-31', '1 day') d group by 1 order by
1;
     date_trunc      | count
---------------------+-------
 2020-01-27 00:00:00 |     2
 2020-02-03 00:00:00 |     7
 2020-02-10 00:00:00 |     7
 2020-02-17 00:00:00 |     7
 2020-02-24 00:00:00 |     7
 2020-03-02 00:00:00 |     7
 2020-03-09 00:00:00 |     7
 2020-03-16 00:00:00 |     7
 2020-03-23 00:00:00 |     7
 2020-03-30 00:00:00 |     2
(10 rows)

select date_trunc_interval('7 days'::interval, d), count(*) from
generate_series( '2020-02-01'::timestamp, '2020-03-31', '1 day') d
group by 1 order by 1;
 date_trunc_interval | count
---------------------+-------
 2020-02-01 00:00:00 |     7
 2020-02-08 00:00:00 |     7
 2020-02-15 00:00:00 |     7
 2020-02-22 00:00:00 |     7
 2020-02-29 00:00:00 |     1
 2020-03-01 00:00:00 |     7
 2020-03-08 00:00:00 |     7
 2020-03-15 00:00:00 |     7
 2020-03-22 00:00:00 |     7
 2020-03-29 00:00:00 |     3
(10 rows)

Resetting the day every month is counterintuitive if not broken, and
as I mentioned it might make more sense to use the int64 timestamp
directly, at least for intervals less than one month. I'll go look
into doing that.

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: James Coleman
Дата:
Сообщение: Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Следующее
От: Dilip Kumar
Дата:
Сообщение: Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager