Re: Strategies for converting UTC data to local windows for arbitrary resolutions and timezones

Поиск
Список
Период
Сортировка
От Alban Hertroys
Тема Re: Strategies for converting UTC data to local windows for arbitrary resolutions and timezones
Дата
Msg-id EB0DA992-5D41-470B-A521-41706A6E3EBA@gmail.com
обсуждение исходный текст
Ответ на Re: Strategies for converting UTC data to local windows for arbitrary resolutions and timezones  (Lincoln Swaine-Moore <lswainemoore@gmail.com>)
Ответы Re: Strategies for converting UTC data to local windows for arbitrary resolutions and timezones
Список pgsql-general
> On 4 Oct 2023, at 17:58, Lincoln Swaine-Moore <lswainemoore@gmail.com> wrote:
>
> > SELECT
> >     sub.gs AS ts_in_utc
> >     ,sub.gs AT TIME ZONE 'America/New_York' AS ts_in_local_tz
> >     ,date_bin('1 days', sub.gs AT TIME ZONE 'America/New_York',
> > '2023-01-01')
> > FROM (SELECT generate_series('2023-11-03 00:00:00Z'::timestamptz,
> > '2023-11-07 00:00:00Z'::timestamptz, '5 minutes'::interval) AS gs) AS sub
> > WHERE
> >     sub.gs >= '2023-11-05 00:00:00 America/New_York'::timestamptz AND
> > sub.gs < '2023-11-06 00:00:00 America/New_York'::timestamptz
>
> I believe this query will be funky around DST borders, because `sub.gs AT TIME ZONE 'America/New_York'` will be
localizedin a way that erases the difference between hours with different offsets, which are genuinely different. For
instance,I ran this and there are two rows within it that look like:  
>
> ` 2023-11-05 05:00:00+00 | 2023-11-05 01:00:00 | 2023-11-05 00:00:00`
> and
> ` 2023-11-05 06:00:00+00 | 2023-11-05 01:00:00 | 2023-11-05 00:00:00`
>
> I think that the non-unique second column will pose an issue for the date binning at a resolution finer than 1 day.

What I do in such cases is to add an extra column with the UTC timestamp to serve as a linear scale to the local
timestamps.That also helps with ordering buckets in reports and such during DST changes (especially the ones where an
hourrepeats). 

Filtering in the queries occurs on the UTC scale, with the local timestamps calculated back to UTC, so that it doesn’t
matterwhether the local time has 23, 24, 25 or 24.5 or 23.5 or whatever number of hours on a date-range - it all maps
backbecause UTC always has 24 hours. 

Something that I also do is to create calendar tables and views for the buckets, with 2 timestamps per bucket: the
startof the bucket and the start of the next bucket. That gives you a range to put actual timestamps between (not
BETWEENbetween, because that’s inclusive). You can store and index that, as opposed to generated results using
generate_series- basically I materialise those. 

For hours and quarter hours I found it to be fairly convenient to base a view on a join between a date calendar and an
(quarterof an) hour per UTC day table, but materialising that with some indexes may perform better (at the cost of disk
space).I do materialise that currently, but our database server doesn’t have a lot of memory so I’m often not hitting
thecache and performance suffers a bit (infrastructure is about to change for the better though). 

Regards,

Alban Hertroys
--
There is always an exception to always.







В списке pgsql-general по дате отправления:

Предыдущее
От: Steve Crawford
Дата:
Сообщение: Re: Strategies for converting UTC data to local windows for arbitrary resolutions and timezones
Следующее
От: Lincoln Swaine-Moore
Дата:
Сообщение: Re: Strategies for converting UTC data to local windows for arbitrary resolutions and timezones