Re: Bug in to_timestamp().

Поиск
Список
Период
Сортировка
От David G. Johnston
Тема Re: Bug in to_timestamp().
Дата
Msg-id CAKFQuwbyB0w2_g9y5FGebBMUU66v-v_BqRyV29tSwG2rbWJb2g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Bug in to_timestamp().  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Mon, Jun 20, 2016 at 8:19 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Jun 13, 2016 at 12:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Jun 13, 2016 at 12:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> amul sul <sul_amul@yahoo.co.in> writes:
>>> It's look like bug in to_timestamp() function when format string has more whitespaces compare to input string, see below:
>>
>> No, I think this is a case of "input doesn't match the format string".
>>
>> As a rule of thumb, using to_timestamp() for input that could be parsed
>> just fine by the standard timestamp input function is not a particularly
>> good idea.  to_timestamp() is meant to deal with input that is in a
>> well-defined format that happens to not be parsable by timestamp_in.
>> This example doesn't meet either of those preconditions.
>
> I think a space in the format string should skip a whitespace
> character in the input string, but not a non-whitespace character.
> It's my understanding that these functions exist in no small part for
> compatibility with Oracle, and Oracle declines to skip the digit '1'
> on the basis of an extra space in the format string, which IMHO is the
> behavior any reasonable user would expect.

So Amul and I are of one opinion and Tom is of another.  Anyone else
have an opinion?


​At least Tom's position has the benefit of being consistent with current behavior.  The current implementation doesn't actually care what literal value you specify - any non-special character consumes a single token from the input, period.

SELECT TO_TIMESTAMP('2016-06-13 15:43:36', 'YYYY/MM/DD--HH24:MI:SS');
SELECT TO_TIMESTAMP('2016-06-13 15:43:36', 'YYYY/MM/DD-HH24:MI:SS');

Both the above exhibit the same behavior as if you used a space instead of the hyphen as the group separator.

The documentation should be updated to make this particular dynamic more clear.

I don't see changing the general behavior of these "date formatting" functions a worthwhile endeavor.​  Adding a less-liberal "parse_timestamp" function I could get behind.

IOW, the user seems to be happy with the fact that the "/" in the date matches his "-" but them complains that the space matches the number "1".  You don't get to have it both ways.

[re-reads the third usage note]

Or maybe you do.  We already define space as a being a greedy operator (when not used in conjunction with FX).  A greedy space-space sequence makes little sense on its face and if we are going to be helpful here we should treat it as a single greedy space matcher.

Note that "returns an error because to_timestamp expects one space only" is wrong - it errors because only a single space is captured and then the attempt to parse '    JUN' using "MON" fails.  The following query doesn't fail though it exhibits the same space discrepancy (it just gives the same "wrong" result).

SELECT TO_TIMESTAMP('2016-06-13 15:43:36', 'FXYYYY/MM/DD  HH24:MI:SS');

Given that we already partially special-case the space expression I'd be inclined to consider Robert's and Amul's position on the matter.  I think I'd redefine our treatment of space to be "zero or more" instead of "one or more" and require that it only match a literal space in the input.

Having considered that, I'm not convinced its worth a compatibility break.  I'd much rather deprecate these <to_*> versions and write slightly-less-liberal versions named <parse_*>.

In any case I'd called the present wording a bug. Instead:

A single space consumes a single token of input and then, unless the FX modifier is present, consumes zero or more subsequent literal spaces.  Thus, using two spaces in a row without the FX modifier, while allowed, is unlikely to give you a satisfactory result.  The first space will consume all available consecutive spaces so that the second space will be guaranteed to consume a non-space token from the input.

David J.


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Bug in to_timestamp().
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: parallel.c is not marked as test covered