Re: to_date() and to_timestamp() with negative years

Поиск
Список
Период
Сортировка
От Adrian Klaver
Тема Re: to_date() and to_timestamp() with negative years
Дата
Msg-id d8b6393e-750a-8b92-2cd9-6004808dc188@aklaver.com
обсуждение исходный текст
Ответ на Re: to_date() and to_timestamp() with negative years  (Bryn Llewellyn <bryn@yugabyte.com>)
Список pgsql-general
On 11/4/21 10:29 AM, Bryn Llewellyn wrote:
>> /adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com> wrote:/
>>
>> Not sure how this can be handled in a totally predictable way given 
>> the unpredictable ways in which datetime strings are formatted?
>>
>> The only thing I can say it is it points out that when working with 
>> datetimes settling on a standard format is your best defense against 
>> unpredictable results.
> 
> Thank you very much, again, for your help with my seemingly endless 
> nitpicking questions on this matter, Adrian. Here's the most favorable 
> conclusion that I can draw:
> 

> 3. The rules are hard to understand and the PG doc gives insufficient 
> detail to allow the outcomes in corner cases like you just showed us to 
> be predicted confidently. Some users seek to understand the rules by 
> reading PG's source code.

I would say that is because datetimes in string formats are often hard 
to understand as a result of folks inventing their own formats.

As an example a commit message of mine from some years ago:

"
Fix issue with date parsing of Javascript dates coming from browser
on Windows. This occurred in both Firefox and Chrome. The issue being
that the date had a timezone of Pacific Standard Time instead of PST 
like it does from a Linux machine. dateutils choked on that timezone and 
therefore the date was not parsed. The fix was to add fuzzy=True to the 
dateutils.parse(). This allows dateutils to skip over anything it does 
not understand and parse the rest. This works as the date does include 
the correct tz offset. For the record the date format of concern is-Tue 
Dec 08 2015 00:00:00 GMT-0800 (Pacific Standard Time)
"

Can't remember what version of Windows this was. The dates ended up in a 
Postgres database via Python code in a Django app. This fix is one of 
the reasons I really like the Python dateutils library. The solution being:

from dateutil.parser import parse

parse('Tue Dec 08 2015 00:00:00 GMT-0800 (Pacific Standard Time)', 
fuzzy=True)
datetime.datetime(2015, 12, 8, 0, 0, tzinfo=tzoffset(None, 28800))

To see what it is doing:

parse('Tue Dec 08 2015 00:00:00 GMT-0800 (Pacific Standard Time)', 
fuzzy_with_tokens=True)

(datetime.datetime(2015, 12, 8, 0, 0, tzinfo=tzoffset(None, 28800)),
  (' ', ' ', ' ', ' (Pacific Standard Time)'))

where everything after the datetime are tokens it ignored.

I include this mainly as illustration that data clean up maybe more 
practical before it ever hits the database and in a library that is 
specialized for the task at hand.

> 
> 4. Certainly, the rules set a significant parsing challenge. You hint 
> that they might even prevent a complete practical solution to be 
> implemented.

Yes that would depend on someone coding an AI that can fully understand 
people and what there intentions where from a string value. Given the 
answers I get when asking people directly what they intended, I'm not 
holding my breath.

> 
> 5. None of this matters when the app designer has the freedom to define 
> how date-time values will be provided, as text values, by user 
> interfaces or external systems. In these cases, the complexity can be 
> controlled by edict and correct solutions can be painlessly implemented 
> and tested. Not a day goes by that I don't have to enter a date value at 
> a UI. And in every case, a UI gadget constrains my input and makes its 
> individual fields available to the programmer without the need for 
> parsing—so there's reason to be optimistic. The programmer can easily 
> build the text representation of the date-time value to conform to the 
> simple rules that the overall application design specified.

Yep, constraining the imagination of the end user solves a lot of problems.

> 
> 6. In some cases text data that's to be converted arrives in a format 
> that cannot be influenced. And it might contain locutions like we've 
> been discussing ("zero" meaning "1 BC", unfavorable paradigms for 
> separating tokens, and the like). In these cases, the diligent 
> programmer might, just, be able to use the full arsenal of available 
> tools to implement a scheme that faultlessly parses the input. But the 
> likelihood of bugs seems to be pretty big.

 From my experience that pretty much defines all aspects of programming.
> 
> I'll say "case closed, now" — from my side, at least.


-- 
Adrian Klaver
adrian.klaver@aklaver.com



В списке pgsql-general по дате отправления:

Предыдущее
От: Christophe Pettus
Дата:
Сообщение: Re: Streaming replication versus Logical replication
Следующее
От: Michel Pelletier
Дата:
Сообщение: An self-contained example "expanded" C data type