Обсуждение: [DOCS] to_char(): 'FM' also suppresses *trailing* zeroes

Поиск
Список
Период
Сортировка

[DOCS] to_char(): 'FM' also suppresses *trailing* zeroes

От
Erwin Brandstetter
Дата:
In Table 9-27. "Template Pattern Modifiers for Numeric Formatting" it says:

FM | prefix fill mode (suppress leading zeroes and padding blanks) | FM9999

https://www.postgresql.org/docs/current/static/functions-formatting.html#FUNCTIONS-FORMATTING-NUMERICMOD-TABLE

In fact, 'FM' also suppresses *trailing* zeroes after the comma. To fix, this might be changed to:

   suppress insignificant zeroes and padding blanks

Or:

   suppress padding zeroes and blanks

Regards
Erwin Brandstetter

Re: [DOCS] to_char(): 'FM' also suppresses *trailing* zeroes

От
Tom Lane
Дата:
Erwin Brandstetter <brsaweda@gmail.com> writes:
> In Table 9-27. "Template Pattern Modifiers for Numeric Formatting" it says:
> FM | prefix fill mode (suppress leading zeroes and padding blanks) | FM9999

> In fact, 'FM' also suppresses *trailing* zeroes after the comma. To fix,
> this might be changed to:
>    suppress insignificant zeroes and padding blanks

Not necessarily.  A bit of experimentation says that it also matters
whether you use "0" or "9" as the format character:

regression=# select to_char(0.1, '0.9999');
 to_char
---------
  0.1000
(1 row)

regression=# select to_char(0.1, 'FM0.9999');
 to_char
---------
 0.1
(1 row)

regression=# select to_char(0.1, '0.9900');
 to_char
---------
  0.1000
(1 row)

regression=# select to_char(0.1, 'FM0.9900');
 to_char
---------
 0.1000
(1 row)

regression=# select to_char(0.1, 'FM00.99009');
 to_char
---------
 00.1000
(1 row)

It's also worth noting the existing examples

regression=# select to_char(-0.1, '99.99');
 to_char
---------
   -.10
(1 row)

regression=# select to_char(-0.1, 'FM99.99');
 to_char
---------
 -.1
(1 row)


So it appears to me that the bit you point out is flat out backwards;
what FM actually suppresses is trailing zeroes not leading zeroes.

I'm tempted to propose that in table 9-26, we need to write

    9    digit position (can be dropped if insignificant)

    0    digit position (cannot be dropped, even if insignificant)

and then in 9-27 say

    FM    fill mode: suppress trailing zeroes and padding spaces

Also, in between those two tables, I see

      * 9 results in a value with the same number of digits as there are
        9s. If a digit is not available it outputs a space.

This seems outright wrong per the above examples, and in any case is not
very useful since it doesn't explain the difference from "0".  Perhaps
rewrite as

    * 0 specifies a digit position that will always be printed,
      even if it contains a leading/trailing zero.  9 also specifies
      a digit position, but if it is a leading zero then it will be
      replaced by a space, while if it is a trailing zero and fill mode
      is specified then it will be deleted.

(I wonder how closely that agrees with Oracle's behavior ...)

            regards, tom lane


Re: [DOCS] to_char(): 'FM' also suppresses *trailing* zeroes

От
Erwin Brandstetter
Дата:
On 28 August 2017 at 22:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Erwin Brandstetter <brsaweda@gmail.com> writes:
> In Table 9-27. "Template Pattern Modifiers for Numeric Formatting" it says:
> FM | prefix fill mode (suppress leading zeroes and padding blanks) | FM9999

> In fact, 'FM' also suppresses *trailing* zeroes after the comma. To fix,
> this might be changed to:
>    suppress insignificant zeroes and padding blanks

Not necessarily.  A bit of experimentation says that it also matters
whether you use "0" or "9" as the format character:

So it appears to me that the bit you point out is flat out backwards;
what FM actually suppresses is trailing zeroes not leading zeroes.

I'm tempted to propose that in table 9-26, we need to write

        9       digit position (can be dropped if insignificant)

        0       digit position (cannot be dropped, even if insignificant)

and then in 9-27 say

        FM      fill mode: suppress trailing zeroes and padding spaces

Also, in between those two tables, I see

      * 9 results in a value with the same number of digits as there are
        9s. If a digit is not available it outputs a space.

This seems outright wrong per the above examples, and in any case is not
very useful since it doesn't explain the difference from "0".  Perhaps
rewrite as

        * 0 specifies a digit position that will always be printed,
          even if it contains a leading/trailing zero.  9 also specifies
          a digit position, but if it is a leading zero then it will be
          replaced by a space, while if it is a trailing zero and fill mode
          is specified then it will be deleted.


Leading zeros are removed completely for 9 with FM. And without FM they are actually replaced with spaces before the sign. Maybe:

  * 0 specifies a digit position that will always be printed, even as insignificant zero.
    9 also specifies a digit position, but leading zeros are replaced with spaces before the optional sign. And if fill mode is specified (FM modifier) insignificant leading and trailing zeros are removed completely.

I created a more comprehensive test matrix for to_char(numeric, text) to aid in a quick review of the FM modifier:

CREATE TEMP TABLE t(t_id int, template text);
INSERT INTO t(t_id, template)
VALUES
     ( 1, '00.00')
   , ( 2, '09.90')
   , ( 3, '90.09')
   , ( 4, '99.99')
   , ( 5, '090.090')
   , ( 6, '909.909')
   , (11, 'FM00.00')
   , (12, 'FM09.90')
   , (13, 'FM90.09')
   , (14, 'FM99.99')
   , (15, 'FM090.090')
   , (16, 'FM909.909');

CREATE TEMP TABLE v (v_id int, val numeric);
INSERT INTO v
VALUES
     (1,   -0.1  )
   , (2,    0.12 )
   , (3,   -0.123)
   , (4,   -1.1  )
   , (5,   12.12 )
   , (6, -123.123);

SELECT val, template, to_char(val, template)
FROM   v, t
ORDER  BY v_id, t_id;

To test online:
 
I think the root of the confusion is that the FM does for date/time formatting what the manual says (table 9-25):

FM | prefix fill mode (suppress leading zeroes and padding blanks) | FMMonth

test=# SELECT to_char(timestamp '0910-09-03 01:00:03.04', template)
test-# FROM  (
test(#    VALUES
test(#      ('YYYY-MM-DD HH24:MI:SS.US')
test(#    , ('FMYYYY-MM-DD HH24:MI:SS.US')
test(#    , ('FMYYYY-MM-DD FMHH24:MI:SS.US')
test(#    , ('FMYYYY-FMMM-FMDD FMHH24:FMMI:FMSS.FMUS')
test(#    ) t(template);
          to_char
----------------------------
 0910-09-03 01:00:03.040000
 910-09-03 01:00:03.040000
 910-09-03 1:00:03.040000
 910-9-3 1:0:3.040000


Note how trailing zeros are not suppressed for microseconds. (It would seem to make more sense to suppress those, though.)

The same explanation is given for numeric formatting (table 9-27):

FM prefix | fill mode (suppress leading zeroes and padding blanks) | FM9999

But it does not apply there - like you pointed out. The exact behavior seems to be:

For the pattern character 0 leading and trailing zeroes are always printed, no matter what.
For the pattern character 9 ...
    without FM modifier
        leading zeros are replaced with padding blanks (before the sign if it's there).
        trailing zeros after the decimal point are printed.
    with FM modifier
        leading and trailing zeros are removed (unless overruled by a 0).
There is additional blank padding for the sign if not printed - also removed with FM.

Regards
Erwin

Re: [DOCS] to_char(): 'FM' also suppresses *trailing* zeroes

От
Erwin Brandstetter
Дата:
On 3 September 2017 at 19:52, Erwin Brandstetter <brsaweda@gmail.com> wrote:
I think the root of the confusion is that the FM does for date/time formatting what the manual says (table 9-25):

FM | prefix fill mode (suppress leading zeroes and padding blanks) | FMMonth

test=# SELECT to_char(timestamp '0910-09-03 01:00:03.04', template)
test-# FROM  (
test(#    VALUES
test(#      ('YYYY-MM-DD HH24:MI:SS.US')
test(#    , ('FMYYYY-MM-DD HH24:MI:SS.US')
test(#    , ('FMYYYY-MM-DD FMHH24:MI:SS.US')
test(#    , ('FMYYYY-FMMM-FMDD FMHH24:FMMI:FMSS.FMUS')
test(#    ) t(template);
          to_char
----------------------------
 0910-09-03 01:00:03.040000
 910-09-03 01:00:03.040000
 910-09-03 1:00:03.040000
 910-9-3 1:0:3.040000


Note how trailing zeros are not suppressed for microseconds. (It would seem to make more sense to suppress those, though.)

The same explanation is given for numeric formatting (table 9-27):

FM prefix | fill mode (suppress leading zeroes and padding blanks) | FM9999

But it does not apply there - like you pointed out. The exact behavior seems to be:

For the pattern character 0 leading and trailing zeroes are always printed, no matter what.
For the pattern character 9 ...
    without FM modifier
        leading zeros are replaced with padding blanks (before the sign if it's there).
        trailing zeros after the decimal point are printed.
    with FM modifier
        leading and trailing zeros are removed (unless overruled by a 0).
There is additional blank padding for the sign if not printed - also removed with FM.


This leads to a general problem of this manual page.

There is detailed explanation for "Template Pattern Modifiers for Date/Time Formatting".
But when it gets to "Template Pattern Modifiers for Numeric Formatting", the only explanation is this preceding statement:

> Certain modifiers can be applied to any template pattern to alter its behavior. For example, FM9999 is the 9999 pattern with the FM modifier. Table 9-27 shows the modifier patterns for numeric formatting.

Which indicates that above explanation for date/time modifiers would apply in the same (or analog) way to numeric modifiers. But that's not the case (at least) for FM. One more notable, undeclared difference, under table-25 it says:

FM suppresses leading zeroes and trailing blanks that would otherwise be added to make the output of a pattern be fixed-width. In PostgreSQL, FM modifies only the next specification, while in Oracle FM affects all subsequent specifications, and repeated FM modifiers toggle fill mode on and off.

That seems accurate for date/time - see above demo for SELECT to_char(timestamp '0910-09-03 01:00:03.04', template)
(It's just not clear how "the next specification" is defined exactly.) But numeric formatting is completely different. The FM modifier can be added anywhere and any number of times, the result is always the same. And it's also not (just) about "leading zeroes and trailing blanks" like we worked out above:

test=# SELECT *, to_char(numeric '1.1', template)
test-# FROM  (
test(#    VALUES
test(#      ('9,999.999')
test(#    , ('FM9,999.999')
test(#    , ('FM9,999.FM999')
test(#    , ('FM9,FM9FM9FM9.FM9FM9FM9')
test(#    , ('9,999.FM999')  -- !!!
test(#    , ('9,999.999FM')  -- !!!
test(#    , ('9,999.99FM9')  -- !!!
test(#    ) t(template);
        template         |  to_char
-------------------------+------------
 9,999.999               |      1.100
 FM9,999.999             | 1.1
 FM9,999.FM999           | 1.1
 FM9,FM9FM9FM9.FM9FM9FM9 | 1.1
 9,999.FM999             | 1.1
 9,999.999FM             | 1.1
 9,999.99FM9             | 1.1


Looks like a bug, a documentation bug or a combination of both.
If it's indeed the intended behavior (?) there should be separate explanation under table 9-27.

Regards
Erwin