Обсуждение: Inconsistent results in timestamp/interval comparison

Поиск
Список
Период
Сортировка

Inconsistent results in timestamp/interval comparison

От
albrecht.dress@posteo.de
Дата:
Hi all,

I run the “official” deb package postgresql-16 v. 16.2-1.pgdg120+2 on a
Debian Bookworm system, and observed a confusing behavior in a
calculation with time stamps and intervals.

To reproduce, consider the following trivial example:

<snip>
create table testtab (t1 timestamp without time zone);
insert into testtab values ('2022-02-27 11:46:33'), ('2022-03-11
23:39:17'), ('2022-03-21 17:49:02');
test=# select now(), t1, (now() - t1) >= '2 years'::interval, now() >=
(t1 + '2 years'::interval) from testtab;
               now              |         t1          | ?column? |
?column?
-------------------------------+---------------------+----------+----------
  2024-03-04 12:59:39.796969+01 | 2022-02-27 11:46:33 | t        | t
  2024-03-04 12:59:39.796969+01 | 2022-03-11 23:39:17 | t        | f
  2024-03-04 12:59:39.796969+01 | 2022-03-21 17:49:02 | f        | f
(3 Zeilen)
</snip>

According to the documentation, Table 9.31, IMHO both comparisons should
produce the same results, as

timestamp - timestamp → interval
timestamp + interval → timestamp

i.e.

(now() - t1) >= '2 years'::interval    # add t1 on both sides of the
comparison
now() >= (t1 + '2 years'::interval)

As only the second example is wrong for the 1st comparison method, this
might indicate some rounding and/or insufficient precision issue.

Or did I miss something here?

Thanks in advance,
Albrecht.




Re: Inconsistent results in timestamp/interval comparison

От
Francisco Olarte
Дата:
On Mon, 4 Mar 2024 at 13:10, <albrecht.dress@posteo.de> wrote:
> According to the documentation, Table 9.31, IMHO both comparisons should
> produce the same results, as

> timestamp - timestamp → interval
> timestamp + interval → timestamp
Your problem may be due to interval comparison.

Intervals are composed of months, days and seconds, as not every month
has 30 days and not every day has 86400 seconds, so to compare them
you have to normalize them somehow, which can lead to bizarre results.

=> select '2 years'::interval > '1 year 362 days'::interval;
 ?column?
----------
 f
(1 row)

=> select '2 years'::interval > '1 year 359 days'::interval;
 ?column?
----------
 t
(1 row)

=> select '2 years'::interval > '1 year 360 days'::interval;
 ?column?
----------
 f
(1 row)

=> select '2 years'::interval = '1 year 360 days'::interval;
 ?column?
----------
 t
(1 row)

If you want to do point in time arithmetic, you will be better of by
extracting epoch from your timestamps and substracting that. Intervals
are more for calendar arithmetic on the type "set me a date two
months, three days and four hours from the last".

Francisco Olarte.



Re: Inconsistent results in timestamp/interval comparison

От
Alban Hertroys
Дата:

On Mon, 4 Mar 2024 at 13:46, Francisco Olarte <folarte@peoplecall.com> wrote:
On Mon, 4 Mar 2024 at 13:10, <albrecht.dress@posteo.de> wrote:
> According to the documentation, Table 9.31, IMHO both comparisons should
> produce the same results, as

> timestamp - timestamp → interval
> timestamp + interval → timestamp
Your problem may be due to interval comparison.

Intervals are composed of months, days and seconds, as not every month
has 30 days and not every day has 86400 seconds, so to compare them
you have to normalize them somehow, which can lead to bizarre results.

=> select '2 years'::interval > '1 year 362 days'::interval;
 ?column?
----------
 f
(1 row)

=> select '2 years'::interval > '1 year 359 days'::interval;
 ?column?
----------
 t
(1 row)

=> select '2 years'::interval > '1 year 360 days'::interval;
 ?column?
----------
 f
(1 row)

=> select '2 years'::interval = '1 year 360 days'::interval;
 ?column?
----------
 t
(1 row)

If you want to do point in time arithmetic, you will be better of by
extracting epoch from your timestamps and substracting that. Intervals
are more for calendar arithmetic on the type "set me a date two
months, three days and four hours from the last".

Francisco Olarte.

To elaborate, justify_interval(t) shows how the length of the interval ends up when there is no timestamp to base the end of the interval on:

=> with testtab(t1) as (
        select cast(v as timestamp with time zone)
        from (values ('2022-02-27 11:46:33'), ('2022-03-11 23:39:17'), ('2022-03-21 17:49:02')) x(v)
)
select now(), t1, now() - t1 "now()-t1", justify_interval(now() -t1)        
from testtab;
             now              |           t1           |        now()-t1         |           justify_interval            
------------------------------+------------------------+-------------------------+---------------------------------------
 2024-03-04 13:00:31.00386+00 | 2022-02-27 11:46:33+00 | 736 days 01:13:58.00386 | 2 years 16 days 01:13:58.00386
 2024-03-04 13:00:31.00386+00 | 2022-03-11 23:39:17+00 | 723 days 13:21:14.00386 | 2 years 3 days 13:21:14.00386
 2024-03-04 13:00:31.00386+00 | 2022-03-21 17:49:02+00 | 713 days 19:11:29.00386 | 1 year 11 mons 23 days 19:11:29.00386
(3 rows)

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.

Re: Inconsistent results in timestamp/interval comparison

От
albrecht.dress@posteo.de
Дата:
Am 04.03.2024 13:45 schrieb Francisco Olarte:
> Intervals are composed of months, days and seconds, as not every month
> has 30 days and not every day has 86400 seconds, so to compare them
> you have to normalize them somehow, which can lead to bizarre results.

Ah, I see, thanks for the explanation.  I had the (apparently wrong)
impression that Postgres _internally_ always uses numerical values (i.e.
the equivalent of EXTRACT(EPOCH …)) for such calculations.  My bad…

However, a clarification in the docs might be helpful!

> If you want to do point in time arithmetic, you will be better of by
> extracting epoch from your timestamps and substracting that.

I can confirm that using the query

select now(), t1, extract(epoch from now() - t1) >= extract (epoch from
'2 years'::interval), now() >= (t1 + '2 years'::interval) from testtab;

produces consistent results.

Thanks a lot for your help,
Albrecht.



Re: Inconsistent results in timestamp/interval comparison

От
Francisco Olarte
Дата:
On Mon, 4 Mar 2024 at 14:06, <albrecht.dress@posteo.de> wrote:
> Am 04.03.2024 13:45 schrieb Francisco Olarte:
> > Intervals are composed of months, days and seconds, as not every month
> > has 30 days and not every day has 86400 seconds, so to compare them
> > you have to normalize them somehow, which can lead to bizarre results.
> Ah, I see, thanks for the explanation.  I had the (apparently wrong)
> impression that Postgres _internally_ always uses numerical values (i.e.
> the equivalent of EXTRACT(EPOCH …)) for such calculations.  My bad…
> However, a clarification in the docs might be helpful!

Ah, the elusive timestamp/interval clarification. Intervals being
three numbers, and timestamp(tz) being just a fancy one with fancy
text conversions and arithmetic rules take a lot of this list
bandwidth.

I work with telephony and similar things with use a lot of durations,
which I just store in numeric/real/integer columns, but due to history
and some reporting convenience I store points in time as timestamps.
Adding is easy, just do ts_col + duration_col * '1 second'::interval,
but I'm really looking forward to convert ts[tz] to numbers too, so I
can substract them. Substracting is a bit more hairy, but extract
epoch on the result normally solves it, as they are justified
intervals. But I just made some memory refreshing queries and they
still manage to surprise me, and I've been using postgres from before
sql.

Francisco Olarte.