Обсуждение: Interval arithmetic should emit interval in canonical format
It's hard to argue that the current behaviour is wrong, but it's worth a try. First I'd appreciate the "official" reasons why Postgres prefers to keep interval values in non-canonical form, like '1 day -23:37:00' instead of '00:23:00'. I understand it has something to do with a year/month/day not being exactly 365-days/30-days/24-hours, and/or operations involving interval and 'timestamp with time zone'. But since it's not explicitly spelled out in docs or in code (at least I didn't find it in the obvious places), seeking explanation here. I understand that the answers may obviate any change in behaviour I am requesting below. The interval arithmetic operations may also yield non-canonical values, and IMHO the 'interval op interval' or 'interval op scalar' expressions should yield an interval in canonical form. For eg. postgres=# select '6 days 00:16:00'::interval - '5 days 23:53:00'::interval as result; result -----------------1 day -23:37:00 postgres=# select '6 days 00:16:00'::interval + '5 days 23:53:00'::interval as result; result ------------------11 days 24:09:00 I cannot think of a use case where the above results are any better than emitting '00:23:00' and '12 days 00:09:00', respectively. We may not be able to turn every interval datum into canonical form, but at least the intervals produced as a result of interval operators can be converted to canonical form to reduce surprises for users. I may even go as far as proposing rounding up 24-hours into a day, but not round up days into months or months into years. I was surprised by the presence of non-canonical form of interval in a sorted-by-interval result set. The intervals were computed within the query, using 'timestamp without time zone' values in a table. # select ... result -------- ... 00:23:00 00:23:00 1 day -23:37:00 00:23:00 00:22:00 ... The ordering above demonstrates that Postgres _does_ consider '1 day -23:37:00' == '00:23:00', then it seems pointless to confuse the user by showing two different representations of the same datum. This also increases the code complexity required in applications/ORMs to parse interval data's text representation. Best regards, -- Gurjeet Singh http://gurjeet.singh.im/ EDB : www.EnterpriseDB.com : The Enterprise PostgreSQL Company
Gurjeet Singh <gurjeet@singh.im> writes: > The interval arithmetic operations may also yield non-canonical > values, and IMHO the 'interval op interval' or 'interval op scalar' > expressions should yield an interval in canonical form. You're mistaken. > postgres=# select '6 days 00:16:00'::interval - '5 days > 23:53:00'::interval as result; > result > ----------------- > 1 day -23:37:00 > postgres=# select '6 days 00:16:00'::interval + '5 days > 23:53:00'::interval as result; > result > ------------------ > 11 days 24:09:00 > I cannot think of a use case where the above results are any better > than emitting '00:23:00' and '12 days 00:09:00', respectively. If that's what you want, use the justify_hours function. But that's discarding information, so we're not going to force users to only be able to get that form. The reason why Postgres distinguishes '1 day' from '24 hours' is pretty much the same as the reason it distinguishes '1 month' from '30 days': adding those expressions to datetime values can produce different results. For example, since 2014-03-09 was a daylight-savings transition day in my zone (US/Eastern), regression=# select '2014-03-08 00:00'::timestamptz + '2 days'::interval; ?column? ------------------------2014-03-10 00:00:00-04 (1 row) regression=# select '2014-03-08 00:00'::timestamptz + '48 hours'::interval; ?column? ------------------------2014-03-10 01:00:00-04 (1 row) As for months vs. days: regression=# select '2014-07-01'::date + '1 month'::interval; ?column? ---------------------2014-08-01 00:00:00 (1 row) regression=# select '2014-07-01'::date + '30 days'::interval; ?column? ---------------------2014-07-31 00:00:00 (1 row) > The ordering above demonstrates that Postgres _does_ consider '1 day > -23:37:00' == '00:23:00', then it seems pointless to confuse the user > by showing two different representations of the same datum. Intervals are really three separate scalar values internally (months, days, seconds). There isn't any way to handle that fully in a linear sort order, so the comparison operators fall back to assuming 1 day is equal to 24 hours (and 1 month is equal to 30 days). But that doesn't make them the same for all purposes. regards, tom lane