Patch for ISO-8601-Interval Input and output.

Поиск
Список
Период
Сортировка
От Ron Mayer
Тема Patch for ISO-8601-Interval Input and output.
Дата
Msg-id 48E4A30C.3000503@cheapcomplexdevices.com
обсуждение исходный текст
Ответ на Patch for SQL-Standard Interval output and decoupling DateStyle from IntervalStyle  (Ron Mayer <rm_pg@cheapcomplexdevices.com>)
Ответы Re: Patch for ISO-8601-Interval Input and output.  ("Brendan Jurd" <direvus@gmail.com>)
Список pgsql-hackers
Ron Mayer wrote:
> Tom Lane wrote:
>> In fact, given that we are now
>> somewhat SQL-compliant on interval input, a GUC that selected
>> PG traditional, SQL-standard, or ISO 8601 interval output format seems
>> like it could be a good idea.

This patch (that works on top of the IntervalStyle patch I
posted earlier today) adds support for ISO8601 standard[0]
"Time Interval" "Durations" of the "format with designators"
(section 4.4.4.2.1).   The other ISO 8601 types of intervals
deal with start and end points, so this one seemed most relevant.

It builds on a patch I had earlier submitted back in 2003[1],
where people noted that we wanted sql-standard intervals
first; but I see that ISO 8601 intervals did make it to the
todo list.

I updated the docs, but I still need to add regression tests,
so it's still a WIP, but I think the code's ready (I've been
using an older version of this patch internally since '03) so
I'd appreciate feedback.

[0] http://isotc.iso.org/livelink/livelink/4021199/ISO_8601_2004_E.zip?func=doc.Fetch&nodeid=4021199
[1] http://archives.postgresql.org/pgsql-patches/2003-09/msg00121.php
[2] http://wiki.postgresql.org/wiki/Todo


*** a/doc/src/sgml/datatype.sgml
--- b/doc/src/sgml/datatype.sgml
***************
*** 1975,1980 **** January 8 04:05:06 1999 PST
--- 1975,1996 ----
       </para>

       <para>
+       Alternatively, <type>interval</type> values can be written as
+       ISO 8601 time intervals, using the "Format with time-unit designators",
+       or PnYnMnDTnHnMnS.   This format always starts with the character
+       <literal>'P'</>, followed by a string of values followed by single
+       character time-unit designators.  A <literal>'T'</> separates the
+       date and time parts of the interval.
+       In this format, <literal>'n'</> gets replaced by a number, and
+        <literal>Y</> represents years,
+        <literal>M</> (in the date part) months,
+        <literal>D</> months,
+        <literal>H</> hours,
+        <literal>M</> (in the time part) minutes,
+        and <literal>S</> seconds.
+      </para>
+
+      <para>
        Internally <type>interval</> values are stored as months, days,
        and seconds. This is done because the number of days in a month
        varies, and a day can have 23 or 25 hours if a daylight savings
***************
*** 2224,2231 **** January 8 04:05:06 1999 PST

      <para>
       The output format of the interval types can be set to one of the four
!      styles <literal>sql_standard</>,
!      <literal>postgres</>, or <literal>postgres_verbose</>.The default
       is the <literal>postgres</> format.
       <xref
       linkend="interval-style-output-table"> shows examples of each
--- 2240,2247 ----

      <para>
       The output format of the interval types can be set to one of the four
!      styles <literal>sql_standard</>, <literal>iso_8601</>,
!      <literal>postgres</>, or <literal>postgres_verbose</>.  The default
       is the <literal>postgres</> format.
       <xref
       linkend="interval-style-output-table"> shows examples of each
***************
*** 2244,2249 **** January 8 04:05:06 1999 PST
--- 2260,2281 ----
      </para>

      <para>
+      The <literal>iso_8601</> style will output ISO 8601
+      time intervals using the "format with time-unit designators"
+      This format always starts with the character
+       <literal>'P'</>, followed by a string of values followed by single
+       character time-unit designators.  A <literal>'T'</> separates the
+       date and time parts of the interval.
+       In this format, <literal>'n'</> gets replaced by a number, and
+        <literal>Y</> represents years,
+        <literal>M</> (in the date part) months,
+        <literal>D</> months,
+        <literal>H</> hours,
+        <literal>M</> (in the time part) minutes,
+        and <literal>S</> seconds.
+     </para>
+
+     <para>
       The <literal>postgres</> style will output intervals that match
       the style PostgreSQL 8.3 outputed when the <xref linkend="guc-datestyle">
       parameter was set to <literal>ISO</>.
***************
*** 2274,2283 **** January 8 04:05:06 1999 PST
            <entry>-1-2 +3 -4:05:06</entry>
           </row>
           <row>
            <entry>postgres</entry>
            <entry>1 year 2 mons</entry>
            <entry>3 days 04:05:06</entry>
!           <entry> -1 years -2 mons +3 days -04:05:06</entry>
           </row>
           <row>
            <entry>postgres_verbose</entry>
--- 2306,2321 ----
            <entry>-1-2 +3 -4:05:06</entry>
           </row>
           <row>
+           <entry>iso_8601</entry>
+           <entry>P1Y2M</entry>
+           <entry>P3DT4H5M6</entry>
+           <entry>P-1Y-2M3DT-4H-5M-6</entry>
+          </row>
+          <row>
            <entry>postgres</entry>
            <entry>1 year 2 mons</entry>
            <entry>3 days 04:05:06</entry>
!           <entry>-1 year -2 mons 3 days -04:05:06</entry>
           </row>
           <row>
            <entry>postgres_verbose</entry>
***************
*** 2293,2299 **** January 8 04:05:06 1999 PST
       Note that <literal>sql_standard</> style will only produce strictly
       standards-conforming string sliterals when given a strictly SQL-standard interval
       value - meaning that it needs to be a pure year-month or datetime
!      interval and not mix positive and negative components.
       </para>

     </sect2>
--- 2331,2339 ----
       Note that <literal>sql_standard</> style will only produce strictly
       standards-conforming string sliterals when given a strictly SQL-standard interval
       value - meaning that it needs to be a pure year-month or datetime
!      interval and not mix positive and negative components, and
!      <literal>iso_8601</> style will only product strictly ISO-8601 standard
!      strings when given values that can be represented as ISO-8601 intervals.
       </para>

     </sect2>
*** a/src/backend/commands/variable.c
--- b/src/backend/commands/variable.c
***************
*** 248,253 **** assign_intervalstyle(const char *value, bool doit, GucSource source)
--- 248,257 ----
      {
          newIntervalStyle = INTSTYLE_SQL_STANDARD;
      }
+     else if (pg_strcasecmp(value, "iso_8601") == 0)
+     {
+         newIntervalStyle = INTSTYLE_ISO_8601;
+     }
      else
      {
          ereport(GUC_complaint_elevel(source),
*** a/src/backend/utils/adt/datetime.c
--- b/src/backend/utils/adt/datetime.c
***************
*** 2723,2728 **** DecodeSpecial(int field, char *lowtoken, int *val)
--- 2723,2906 ----
  }


+ /*
+  * A small helper function to avoid cut&paste code in DecodeIso8601Interval
+  */
+ static void adjust_fval(double fval,struct pg_tm * tm, fsec_t *fsec, int scale)
+ {
+     int    sec;
+     if (fval == 0) return;
+     fval       *= scale;
+     sec            = fval;
+     tm->tm_sec += sec;
+ #ifdef HAVE_INT64_TIMESTAMP
+     *fsec       += ((fval - sec) * 1000000);
+ #else
+     *fsec       += (fval - sec);
+ #endif
+ }
+
+
+ /* DecodeISO8601Interval()
+  *
+  *  Check if it's a ISO 8601 Section 5.5.4.2 "Representation of
+  *  time-interval by duration only."
+  *  Basic extended format:  PnYnMnDTnHnMnS
+  *                          PnW
+  *  For more info.
+  *  http://www.astroclark.freeserve.co.uk/iso8601/index.html
+  *  ftp://ftp.qsl.net/pub/g1smd/154N362_.PDF
+  *
+  *  Examples:  P1D  for 1 day
+  *             PT1H for 1 hour
+  *             P2Y6M7DT1H30M for 2 years, 6 months, 7 days 1 hour 30 min
+  *
+  *  The first field is exactly "p" or "pt" it may be of this type.
+  *
+  *  Returns DTERR_BAD_FORMAT if the field is not of this type.
+  *
+  *  It pretty strictly checks the spec, with the two exceptions
+  *  that a week field ('W') may coexist with other units, and that
+  *  this function allows decimals in fields other than the least
+  *  significant units.
+  */
+ int
+ DecodeISO8601Interval(char *str, struct pg_tm * tm, fsec_t *fsec)
+ {
+     char    unit;
+     int        fmask = 0,
+             tmask;
+     int        val;
+     double    fval;
+     int        datepart = true;
+     int        negative = false;
+
+     tm->tm_year = 0;
+     tm->tm_mon = 0;
+     tm->tm_mday = 0;
+     tm->tm_hour = 0;
+     tm->tm_min = 0;
+     tm->tm_sec = 0;
+     *fsec = 0;
+
+     /*
+      * An ISO 8601 "time-interval by duration only" must start
+      * with a 'P'.  If it contains a date-part, 'p' will be the
+      * only character in the field.  If it contains no date part
+      * it will contain exactly to characters 'PT' indicating a
+      * time part.
+      * Anything else does not match an ISO 8601 basic interval
+      * and will be treated like a traditional postgresql interval.
+      */
+     if (!(str[0] == 'P'))
+     {
+         return DTERR_BAD_FORMAT;
+     }
+     str++;
+
+     /*
+      * Process pairs of fields at a time.
+      */
+     while (*str)
+     {
+         if (*str == 'T')
+         {
+             datepart = false;
+             str++;
+             continue;
+         }
+         if (! (isdigit(*str) || *str == '-' || *str=='.') )
+             return DTERR_BAD_FORMAT;
+
+         negative = (*str == '-');
+         fval     = strtod(str, &str);
+         val         = fval;
+         fval     = fval - val;
+
+         unit = *str;
+         str++;
+
+         if (datepart)
+         {
+             switch (unit) /* Y M D W */
+             {
+                 case 'D':
+                     tm->tm_mday += val;
+                     adjust_fval(fval,tm,fsec, 86400);
+                     tmask = ((fmask & DTK_M(DAY)) ? 0 : DTK_M(DAY));
+                     break;
+
+                 case 'W':
+                     tm->tm_mday += val * 7;
+                     adjust_fval(fval,tm,fsec,7 * 86400);
+                     tmask = ((fmask & DTK_M(DAY)) ? 0 : DTK_M(DAY));
+                     break;
+
+                 case 'M':
+                     tm->tm_mon += val;
+                     adjust_fval(fval,tm,fsec,30 * 86400);
+                     tmask = DTK_M(MONTH);
+                     break;
+
+                 case 'Y':
+                     /*
+                      * Why can fractional months produce seconds,
+                      * but fractional years can't?  Well the older
+                      * interval code below has the same property
+                      * so this one follows the other one too.
+                      */
+                     tm->tm_year += val;
+                     if (fval != 0)
+                         tm->tm_mon += (fval * 12);
+                     tmask = ((fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR));
+                     break;
+
+                 default:
+                     return DTERR_BAD_FORMAT;  /* not a vald ISO8601 date unit prefix */
+             }
+         }
+         else
+         {
+             switch (unit) /* H M S */
+             {
+                 case 'S':
+                     tm->tm_sec += val;
+                     adjust_fval(fval,tm,fsec,1);
+                     tmask = DTK_M(SECOND);
+                     break;
+                 case 'M':
+                     tm->tm_min += val;
+                     adjust_fval(fval,tm,fsec,60);
+                     tmask = DTK_M(MINUTE);
+                     break;
+                 case 'H':
+                     tm->tm_hour += val;
+                     adjust_fval(fval,tm,fsec,3600);
+                     tmask = DTK_M(HOUR);
+                     break;
+                 default:
+                     return DTERR_BAD_FORMAT;  /* not a vald ISO8601 time unit prefix */
+             }
+         }
+         fmask |= tmask;
+     }
+
+     if (*fsec != 0)
+     {
+         int            sec;
+ #ifdef HAVE_INT64_TIMESTAMP
+         sec = (*fsec / INT64CONST(1000000));
+         *fsec -= (sec * INT64CONST(1000000));
+ #else
+         TMODULO(*fsec, sec, 1e0);
+ #endif
+         tm->tm_sec += sec;
+     }
+     return (fmask != 0) ? 0 : -1;
+ }
+
+
+
  /* DecodeInterval()
   * Interpret previously parsed fields for general time interval.
   * Returns 0 if successful, DTERR code if bogus input detected.
***************
*** 2732,2737 **** DecodeSpecial(int field, char *lowtoken, int *val)
--- 2910,2919 ----
   *
   * Allow ISO-style time span, with implicit units on number of days
   *    preceding an hh:mm:ss field. - thomas 1998-04-30
+  *
+  * Allow ISO-8601 style "Representation of time-interval by duration only"
+  *  of the format 'PnYnMnDTnHnMnS' and 'PnW' - ron mayer 2003-08-30
+  *  also deprecate the not-really-iso-like style from 1998-04-30
   */
  int
  DecodeInterval(char **field, int *ftype, int nf, int range,
***************
*** 3605,3613 **** EncodeDateTime(struct pg_tm * tm, fsec_t fsec, int *tzp, char **tzn, int style,
      return TRUE;
  }

! /*
!  * small helper funciton to avoid copy&paste of this ifdef below
   */
  static void
  AppendFsec(char * cp,fsec_t fsec)
  {
--- 3787,3803 ----
      return TRUE;
  }

! /*
!  * Small helper functions to avoid cut&paste in EncodeInterval below
   */
+
+ static char *
+ AddISO8601IntervalPart(char * cp, int value, char units)
+ {
+     if (value==0) return cp;
+     return cp + sprintf(cp,"%d%c",value,units);
+ }
+
  static void
  AppendFsec(char * cp,fsec_t fsec)
  {
***************
*** 3647,3652 **** EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
--- 3837,3846 ----
      bool        is_nonzero = FALSE;
      char       *cp = str;

+     /*
+      * These are used so much below, copy them to local variables to make
+      * the code shorter and more readable.
+      */
      int year  = tm->tm_year;
      int mon   = tm->tm_mon;
      int mday  = tm->tm_mday;
***************
*** 3730,3735 **** EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
--- 3924,3957 ----
              }
              break;
          }
+         case INTSTYLE_ISO_8601:
+             if ((year == 0) && (mon == 0) && (mday == 0) &&
+                 (hour == 0) && (min == 0) && (sec  == 0) &&
+                 (fsec        == 0))
+             {
+                 sprintf(cp,"PT0S");
+                 break;
+             }
+             cp += sprintf(cp,"P");
+             cp = AddISO8601IntervalPart(cp,year,'Y');
+             cp = AddISO8601IntervalPart(cp,mon ,'M');
+             cp = AddISO8601IntervalPart(cp,mday,'D');
+             if ((hour != 0) || (min != 0) ||
+                 (sec  != 0) || (fsec       != 0))
+             {
+                 cp += sprintf(cp,"T");
+             }
+             cp = AddISO8601IntervalPart(cp,hour,'H');
+             cp = AddISO8601IntervalPart(cp,min ,'M');
+             if ((sec != 0) || (fsec != 0))
+             {
+                 cp += sprintf(cp, "%d", sec);
+                 AppendFsec(cp,fsec);
+                 TrimTrailingZeros(cp);
+                 cp += strlen(cp);
+                 cp += sprintf(cp,"S");
+             }
+             break;

          /* compatible with postgresql 8.3 when DateStyle = 'iso' */
          case INTSTYLE_POSTGRES:
*** a/src/backend/utils/adt/timestamp.c
--- b/src/backend/utils/adt/timestamp.c
***************
*** 627,632 **** interval_in(PG_FUNCTION_ARGS)
--- 627,635 ----
                            ftype, MAXDATEFIELDS, &nf);
      if (dterr == 0)
          dterr = DecodeInterval(field, ftype, nf, range, &dtype, tm, &fsec);
+     if (dterr == DTERR_BAD_FORMAT)
+         dterr = DecodeISO8601Interval(str, tm, &fsec);
+
      if (dterr != 0)
      {
          if (dterr == DTERR_FIELD_OVERFLOW)
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
***************
*** 1956,1963 **** psql_completion(char *text, int start, int end)
          else if (pg_strcasecmp(prev2_wd, "IntervalStyle") == 0)
          {
              static const char *const my_list[] =
!             {"postgres","postgres_verbose", "sql_standard", NULL};
!
              COMPLETE_WITH_LIST(my_list);
          }
          else if (pg_strcasecmp(prev2_wd, "GEQO") == 0)
--- 1956,1962 ----
          else if (pg_strcasecmp(prev2_wd, "IntervalStyle") == 0)
          {
              static const char *const my_list[] =
!             {"postgres","postgres_verbose", "sql_standard", "iso_8601", NULL};
              COMPLETE_WITH_LIST(my_list);
          }
          else if (pg_strcasecmp(prev2_wd, "GEQO") == 0)
*** a/src/include/miscadmin.h
--- b/src/include/miscadmin.h
***************
*** 197,206 **** extern int    DateOrder;
--- 197,208 ----
   *   INTSTYLE_POSTGRES             Like Postgres8.3 when DateStyle = 'iso'
   *   INTSTYLE_POSTGRES_VERBOSE     Like Postgres8.3 when DateStyle = 'sql'
   *   INTSTYLE_SQL_STANDARD         SQL standard interval literals
+  *   INTSTYLE_ISO_8601             Specifies ISO-8601-basic formatted intervals
   */
  #define INTSTYLE_POSTGRES             0
  #define INTSTYLE_POSTGRES_VERBOSE     1
  #define INTSTYLE_SQL_STANDARD         2
+ #define INTSTYLE_ISO_8601             3

  extern int    IntervalStyle;

*** a/src/include/utils/datetime.h
--- b/src/include/utils/datetime.h
***************
*** 292,297 **** extern int DecodeTimeOnly(char **field, int *ftype,
--- 292,300 ----
  extern int DecodeInterval(char **field, int *ftype,
                 int nf, int range, int *dtype,
                 struct pg_tm * tm, fsec_t *fsec);
+ extern int DecodeISO8601Interval(char *str,
+                struct pg_tm * tm, fsec_t *fsec);
+
  extern void DateTimeParseError(int dterr, const char *str,
                     const char *datatype);


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Heikki Linnakangas
Дата:
Сообщение: Re: trivial patch for pg_freespacemap documentation
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: [COMMITTERS] pgsql: Allow pg_regress to be run outside the build tree.