Re: date_trunc function in interval version

Поиск
Список
Период
Сортировка
От Przemysław Sztoch
Тема Re: date_trunc function in interval version
Дата
Msg-id ada834a3-3ad5-4153-80eb-531c5b5c913b@sztoch.pl
обсуждение исходный текст
Ответ на Re: date_trunc function in interval version  (Kirk Wolak <wolakk@gmail.com>)
Ответы Re: date_trunc function in interval version
Список pgsql-hackers


On 8/20/2025 9:52 PM, Kirk Wolak wrote:
On Mon, May 20, 2024 at 11:58 AM Przemysław Sztoch <przemyslaw@sztoch.pl> wrote:
Yasir wrote on 19.05.2024 00:03:
I would also like to thank Robert for presenting the matter in detail.

My function date_trunc ( interval, timestamp, ...) is similar to original function date_trunc ( text, timestamp ...) .

My extension only gives more granularity.
We don't have a jump from hour to day. We can use 6h and 12h. It's the same with minutes.
We can round to 30 minutes, 20 minutes, 15 minutes, etc.
...
 
Please, use it with timestamptz for '2 hours' or '3 hours' interval.

SET timezone TO 'Europe/Warsaw';
SELECT ts,
       date_bin('1 hour'::interval, ts, '0001-01-01 00:00:00') AS one_hour_bin,
       date_bin('2 hour'::interval, ts, '0001-01-01 00:00:00') AS two_hours_bin,
       date_bin('3 hour'::interval, ts, '0001-01-01 00:00:00') AS three_hours_bin
   FROM generate_series('2022-03-26 21:00:00+00'::timestamptz,
                        '2022-03-27 07:00:00+00'::timestamptz,
                        '30 min'::interval,
                        'Europe/Warsaw') AS ts;

           ts           |      one_hour_bin      |     two_hours_bin      |    three_hours_bin
------------------------+------------------------+------------------------+------------------------
 2022-03-26 22:00:00+01 | 2022-03-26 21:36:00+01 | 2022-03-26 21:36:00+01 | 2022-03-26 20:36:00+01
 2022-03-26 22:30:00+01 | 2022-03-26 21:36:00+01 | 2022-03-26 21:36:00+01 | 2022-03-26 20:36:00+01
 2022-03-26 23:00:00+01 | 2022-03-26 22:36:00+01 | 2022-03-26 21:36:00+01 | 2022-03-26 20:36:00+01
 2022-03-26 23:30:00+01 | 2022-03-26 22:36:00+01 | 2022-03-26 21:36:00+01 | 2022-03-26 20:36:00+01
 2022-03-27 00:00:00+01 | 2022-03-26 23:36:00+01 | 2022-03-26 23:36:00+01 | 2022-03-26 23:36:00+01
 2022-03-27 00:30:00+01 | 2022-03-26 23:36:00+01 | 2022-03-26 23:36:00+01 | 2022-03-26 23:36:00+01
 2022-03-27 01:00:00+01 | 2022-03-27 00:36:00+01 | 2022-03-26 23:36:00+01 | 2022-03-26 23:36:00+01
 2022-03-27 01:30:00+01 | 2022-03-27 00:36:00+01 | 2022-03-26 23:36:00+01 | 2022-03-26 23:36:00+01
 2022-03-27 03:00:00+02 | 2022-03-27 01:36:00+01 | 2022-03-27 01:36:00+01 | 2022-03-26 23:36:00+01
 2022-03-27 03:30:00+02 | 2022-03-27 01:36:00+01 | 2022-03-27 01:36:00+01 | 2022-03-26 23:36:00+01
 2022-03-27 04:00:00+02 | 2022-03-27 03:36:00+02 | 2022-03-27 01:36:00+01 | 2022-03-27 03:36:00+02
 2022-03-27 04:30:00+02 | 2022-03-27 03:36:00+02 | 2022-03-27 01:36:00+01 | 2022-03-27 03:36:00+02
 2022-03-27 05:00:00+02 | 2022-03-27 04:36:00+02 | 2022-03-27 04:36:00+02 | 2022-03-27 03:36:00+02
 2022-03-27 05:30:00+02 | 2022-03-27 04:36:00+02 | 2022-03-27 04:36:00+02 | 2022-03-27 03:36:00+02
 2022-03-27 06:00:00+02 | 2022-03-27 05:36:00+02 | 2022-03-27 04:36:00+02 | 2022-03-27 03:36:00+02
 2022-03-27 06:30:00+02 | 2022-03-27 05:36:00+02 | 2022-03-27 04:36:00+02 | 2022-03-27 03:36:00+02
 2022-03-27 07:00:00+02 | 2022-03-27 06:36:00+02 | 2022-03-27 06:36:00+02 | 2022-03-27 06:36:00+02
 2022-03-27 07:30:00+02 | 2022-03-27 06:36:00+02 | 2022-03-27 06:36:00+02 | 2022-03-27 06:36:00+02
 2022-03-27 08:00:00+02 | 2022-03-27 07:36:00+02 | 2022-03-27 06:36:00+02 | 2022-03-27 06:36:00+02
 2022-03-27 08:30:00+02 | 2022-03-27 07:36:00+02 | 2022-03-27 06:36:00+02 | 2022-03-27 06:36:00+02
 2022-03-27 09:00:00+02 | 2022-03-27 08:36:00+02 | 2022-03-27 08:36:00+02 | 2022-03-27 06:36:00+02
(21 rows)

We have 36 minutes offset (historical time change).

If we use origin from current year, we have wrong value after DST too:
SET timezone TO 'Europe/Warsaw';
SELECT ts,
       date_bin('1 hour'::interval, ts, '0001-01-01 00:00:00') AS one_hour_bin,
       date_bin('2 hour'::interval, ts, '0001-01-01 00:00:00') AS two_hours_bin,
       date_bin('3 hour'::interval, ts, '0001-01-01 00:00:00') AS three_hours_bin
   FROM generate_series('2022-03-26 21:00:00+00'::timestamptz,
                        '2022-03-27 07:00:00+00'::timestamptz,
                        '30 min'::interval,
                        'Europe/Warsaw') AS ts;^C
postgres=# \e
           ts           |      one_hour_bin      |     two_hours_bin      |    three_hours_bin
------------------------+------------------------+------------------------+------------------------
 2022-03-26 22:00:00+01 | 2022-03-26 22:00:00+01 | 2022-03-26 22:00:00+01 | 2022-03-26 21:00:00+01
 2022-03-26 22:30:00+01 | 2022-03-26 22:00:00+01 | 2022-03-26 22:00:00+01 | 2022-03-26 21:00:00+01
 2022-03-26 23:00:00+01 | 2022-03-26 23:00:00+01 | 2022-03-26 22:00:00+01 | 2022-03-26 21:00:00+01
 2022-03-26 23:30:00+01 | 2022-03-26 23:00:00+01 | 2022-03-26 22:00:00+01 | 2022-03-26 21:00:00+01
 2022-03-27 00:00:00+01 | 2022-03-27 00:00:00+01 | 2022-03-27 00:00:00+01 | 2022-03-27 00:00:00+01
 2022-03-27 00:30:00+01 | 2022-03-27 00:00:00+01 | 2022-03-27 00:00:00+01 | 2022-03-27 00:00:00+01
 2022-03-27 01:00:00+01 | 2022-03-27 01:00:00+01 | 2022-03-27 00:00:00+01 | 2022-03-27 00:00:00+01
 2022-03-27 01:30:00+01 | 2022-03-27 01:00:00+01 | 2022-03-27 00:00:00+01 | 2022-03-27 00:00:00+01
 2022-03-27 03:00:00+02 | 2022-03-27 03:00:00+02 | 2022-03-27 03:00:00+02 | 2022-03-27 00:00:00+01
 2022-03-27 03:30:00+02 | 2022-03-27 03:00:00+02 | 2022-03-27 03:00:00+02 | 2022-03-27 00:00:00+01
 2022-03-27 04:00:00+02 | 2022-03-27 04:00:00+02 | 2022-03-27 03:00:00+02 | 2022-03-27 04:00:00+02
 2022-03-27 04:30:00+02 | 2022-03-27 04:00:00+02 | 2022-03-27 03:00:00+02 | 2022-03-27 04:00:00+02
 2022-03-27 05:00:00+02 | 2022-03-27 05:00:00+02 | 2022-03-27 05:00:00+02 | 2022-03-27 04:00:00+02
 2022-03-27 05:30:00+02 | 2022-03-27 05:00:00+02 | 2022-03-27 05:00:00+02 | 2022-03-27 04:00:00+02
 2022-03-27 06:00:00+02 | 2022-03-27 06:00:00+02 | 2022-03-27 05:00:00+02 | 2022-03-27 04:00:00+02
 2022-03-27 06:30:00+02 | 2022-03-27 06:00:00+02 | 2022-03-27 05:00:00+02 | 2022-03-27 04:00:00+02
 2022-03-27 07:00:00+02 | 2022-03-27 07:00:00+02 | 2022-03-27 07:00:00+02 | 2022-03-27 07:00:00+02
 2022-03-27 07:30:00+02 | 2022-03-27 07:00:00+02 | 2022-03-27 07:00:00+02 | 2022-03-27 07:00:00+02
 2022-03-27 08:00:00+02 | 2022-03-27 08:00:00+02 | 2022-03-27 07:00:00+02 | 2022-03-27 07:00:00+02
 2022-03-27 08:30:00+02 | 2022-03-27 08:00:00+02 | 2022-03-27 07:00:00+02 | 2022-03-27 07:00:00+02
 2022-03-27 09:00:00+02 | 2022-03-27 09:00:00+02 | 2022-03-27 09:00:00+02 | 2022-03-27 07:00:00+02
(21 rows)

--
Przemysław Sztoch | Mobile +48 509 99 00 66

Forgive me, I saw this in the CF and wanted to review it because this looked useful.
I cannot tell from your output what the differences would be vs. your proposed date_trunc().
I was actually expecting columns:  RowNum(), ts, date_bin, new date_trunc() 
Where you explained the differences (maybe using the row number).

It appears your issue is the 36 Minutes.  And it does beg the question "Where is that coming from".

Finally, I assume that: even if you could fix it by using "AT UTC" to do the grouping, and then change it to 'Europe/Warsaw'...
That 36 minutes probably creeps back in.
It "Feels" like the wrong answer, considering the inputs.

Finally... NOBODY Chimed in after you provided this evidence.  Was this accepted as proof, or was MORE expected?

Kirk

1. date_bin works good if you do not have changed time zone (for example from summer to winter time).

date_bin simply adds constant interval - if you want to round your time to 3 hours, 6 hours or 12 hours then you have problem if you want to pass time zone changing point, because some times you want to add interval without lack hour or add interval with extra hour.

original date_trunc works very good with DST problem, but has limited granularity, you can't round timestamp to 5 min, 10 min, 30 min, 3 hours, 6 hours etc.

My data_trunc version with interval as period argument is able to correctly overcome the time points at which the change from summer to winter time and vice versa occurred for custom periods.

Additionally, it does not require specifying the origin time, which can sometimes be very difficult to determine. You can't simple use '0001-01-01 00:00:00' because it is problematic for some timezones.

Then it must be defined differently for each time zone. This complicates queries when you work with different zones.

--
Przemysław Sztoch | Mobile +48 509 99 00 66


В списке pgsql-hackers по дате отправления: