Обсуждение: BUG #15551: Date/Time comparison not correct when the comparison isinside join clause and involves "+" or "-"

Поиск
Список
Период
Сортировка

BUG #15551: Date/Time comparison not correct when the comparison isinside join clause and involves "+" or "-"

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      15551
Logged by:          Lei He
Email address:      whereverlei@gmail.com
PostgreSQL version: 11.1
Operating system:   macOS Mojave 10.14
Description:

dil=# select version();
                                                     version
                                     
------------------------------------------------------------------------------------------------------------------
 PostgreSQL 11.1 on x86_64-apple-darwin18.2.0, compiled by Apple LLVM
version 10.0.0 (clang-1000.11.45.5), 64-bit
(1 row)



I want to compute growth of something quarterly. Here is the simplified
data:

with code_time as (
  select code, "time" from (values
    ('000001', date '2016-12-31'),
    ('000001', date '2017-03-31'),
    ('000001', date '2017-06-30'),
    ('000001', date '2017-09-30'),
    ('000001', date '2017-12-31'),
    ('000001', date '2018-03-31'),
    ('000001', date '2018-06-30'),
    ('000001', date '2018-09-30'),
    ('000001', date '2018-12-31'),
    ('000002', date '2016-12-31'),
    ('000002', date '2017-03-31'),
    ('000002', date '2017-06-30'),
    ('000002', date '2017-09-30'),
    ('000002', date '2017-12-31'),
    ('000002', date '2018-03-31'),
    ('000002', date '2018-06-30'),
    ('000002', date '2018-09-30'),
    ('000002', date '2018-12-31')
  ) t("code", "time")
)

The result I need is like this:
  code  | cur_quarter | pre_quarter 
--------+-------------+-------------
 000001 | 2017-03-31  | 2016-12-31
 000001 | 2017-06-30  | 2017-03-31
 000001 | 2017-09-30  | 2017-06-30
 000001 | 2017-12-31  | 2017-09-30
 000001 | 2018-03-31  | 2017-12-31
 000001 | 2018-06-30  | 2018-03-31
 000001 | 2018-09-30  | 2018-06-30
 000001 | 2018-12-31  | 2018-09-30
 000002 | 2017-03-31  | 2016-12-31
 000002 | 2017-06-30  | 2017-03-31
 000002 | 2017-09-30  | 2017-06-30
 000002 | 2017-12-31  | 2017-09-30
 000002 | 2018-03-31  | 2017-12-31
 000002 | 2018-06-30  | 2018-03-31
 000002 | 2018-09-30  | 2018-06-30
 000002 | 2018-12-31  | 2018-09-30

If I use query
select ct1.code, ct1."time" cur_quarter, ct2."time" pre_quarter from
code_time ct1
join (select code, "time" from code_time) ct2 on ct1.code = ct2.code and
ct1."time" = ct2."time" + interval '3 months'
order by ct1.code, ct1."time";

I got:
  code  | cur_quarter | pre_quarter 
--------+-------------+-------------
 000001 | 2017-03-31  | 2016-12-31
 000001 | 2017-06-30  | 2017-03-31
 000001 | 2017-09-30  | 2017-06-30
 000001 | 2018-03-31  | 2017-12-31
 000001 | 2018-06-30  | 2018-03-31
 000001 | 2018-09-30  | 2018-06-30
 000002 | 2017-03-31  | 2016-12-31
 000002 | 2017-06-30  | 2017-03-31
 000002 | 2017-09-30  | 2017-06-30
 000002 | 2018-03-31  | 2017-12-31
 000002 | 2018-06-30  | 2018-03-31
 000002 | 2018-09-30  | 2018-06-30

Note: the fourth quarter (*-12-31) of all years are missing.

If I use query
select ct1.code, ct1."time" cur_quarter, ct2."time" pre_quarter from
code_time ct1
join (select code, "time" from code_time) ct2 on ct1.code = ct2.code and
ct2."time" = ct1."time" - interval '3 months'
order by ct1.code, ct1."time";

I got:
  code  | cur_quarter | pre_quarter 
--------+-------------+-------------
 000001 | 2017-03-31  | 2016-12-31
 000001 | 2017-09-30  | 2017-06-30
 000001 | 2017-12-31  | 2017-09-30
 000001 | 2018-03-31  | 2017-12-31
 000001 | 2018-09-30  | 2018-06-30
 000001 | 2018-12-31  | 2018-09-30
 000002 | 2017-03-31  | 2016-12-31
 000002 | 2017-09-30  | 2017-06-30
 000002 | 2017-12-31  | 2017-09-30
 000002 | 2018-03-31  | 2017-12-31
 000002 | 2018-09-30  | 2018-06-30
 000002 | 2018-12-31  | 2018-09-30

Note: the second quarter (*-06-30) of all years are missing.

Only if I use query
select ct1.code, ct1."time" cur_quarter, ct2."time" pre_quarter from
code_time ct1
join (select code, "time" from code_time) ct2 on ct1.code = ct2.code and (
ct1."time" = ct2."time" + interval '3 months' or ct2."time" = ct1."time" -
interval '3 months' )
order by ct1.code, ct1."time";

can I get the expected result:
  code  | cur_quarter | pre_quarter 
--------+-------------+-------------
 000001 | 2017-03-31  | 2016-12-31
 000001 | 2017-06-30  | 2017-03-31
 000001 | 2017-09-30  | 2017-06-30
 000001 | 2017-12-31  | 2017-09-30
 000001 | 2018-03-31  | 2017-12-31
 000001 | 2018-06-30  | 2018-03-31
 000001 | 2018-09-30  | 2018-06-30
 000001 | 2018-12-31  | 2018-09-30
 000002 | 2017-03-31  | 2016-12-31
 000002 | 2017-06-30  | 2017-03-31
 000002 | 2017-09-30  | 2017-06-30
 000002 | 2017-12-31  | 2017-09-30
 000002 | 2018-03-31  | 2017-12-31
 000002 | 2018-06-30  | 2018-03-31
 000002 | 2018-09-30  | 2018-06-30
 000002 | 2018-12-31  | 2018-09-30


The only difference among these query lies in the join clause involving date
/ time comparison having date / time operators being used:
1st: ct1."time" = ct2."time" + interval '3 months'
2nd: ct2."time" = ct1."time" - interval '3 months'
3rd: ct1."time" = ct2."time" + interval '3 months' or ct2."time" =
ct1."time" - interval '3 months'

It seems to me all the 3 conditions are logically same and should have the
same result.


>>>>> "PG" == PG Bug reporting form <noreply@postgresql.org> writes:

 PG> Note: the fourth quarter (*-12-31) of all years are missing.

So the problem here is that you're trying to identify quarters by the
_last_ date, not by the _first_ date.

If you add 3 months to Sept 30th, you get Dec 30th, NOT Dec 31st.
It so happens that adding 3 months to Mar 31st gives Jun 30th, and
adding 3 months to Jun 30th gives Sep 30th, and adding 3 months to
Dec 31st gives Mar 31st, so it's only the one case that fails here.

But if you identify the quarter by its _first_ date, you have no
problem.

date_trunc('quarter', somedate::timestamp) can help with this, but
remember to cast the date to timestamp (without time zone) in the call,
otherwise you'll get incorrect results due to timezone issues.

 PG> The only difference among these query lies in the join clause
 PG> involving date / time comparison having date / time operators being
 PG> used:

 PG> 1st: ct1."time" = ct2."time" + interval '3 months'
 PG> 2nd: ct2."time" = ct1."time" - interval '3 months'
 PG> 3rd: ct1."time" = ct2."time" + interval '3 months' or ct2."time" =
 PG> ct1."time" - interval '3 months'

 PG> It seems to me all the 3 conditions are logically same and should
 PG> have the same result.

Well, in the presence of months of variable lengths, they clearly are
not equivalent and cannot be. So no bug here.

-- 
Andrew (irc:RhodiumToad)