Thread: Allow to_date() and to_timestamp() to accept localized month names

Allow to_date() and to_timestamp() to accept localized month names

From
Mattia
Date:
Hi,
attached is a patch which adds support for localized month names to the
to_date() and to_timestamp() functions.

The patch is fairly simple, but I want to discuss the approach and
implementation:

Using the TM modifier as in to_char() was already discussed some years
ago: 10710.1202170898@sss.pgh.pa.us [1]

I thought about reusing from_char_seq_search(), but localized month
names are capitalized differently depending on the language's grammar,
so I used pg_strncasecmp to do the match.

Regression tests with the TM modifier are difficult, since the locale
used for the test must be installed on the system running it.
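
The locale dependency can be illustrated with a minimal C sketch. It assumes the standard setlocale() and strftime() calls; localized_month_name is a hypothetical helper written for this example and is not part of the patch or of PostgreSQL. A test built this way could only check a French month name on a machine where a locale such as fr_FR is actually installed.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper (not from the patch): write the full name of
 * month (0..11) in the given locale into buf.
 * Returns 1 on success, 0 if the locale is unavailable or the
 * arguments are unusable.
 */
int
localized_month_name(const char *locale, int month, char *buf, size_t buflen)
{
    struct tm   tm;
    char        saved[128];
    const char *cur;
    size_t      len;

    if (month < 0 || month > 11 || buflen == 0)
        return 0;

    /* Remember the current LC_TIME setting so it can be restored. */
    cur = setlocale(LC_TIME, NULL);
    if (cur == NULL || strlen(cur) >= sizeof(saved))
        return 0;
    strcpy(saved, cur);

    /* setlocale() returns NULL when the request cannot be honored. */
    if (setlocale(LC_TIME, locale) == NULL)
        return 0;

    memset(&tm, 0, sizeof(tm));
    tm.tm_mon = month;
    tm.tm_mday = 1;
    tm.tm_year = 116;           /* years since 1900, i.e. 2016 */

    /* %B is the locale's full month name. */
    len = strftime(buf, buflen, "%B", &tm);

    setlocale(LC_TIME, saved);
    return len > 0;
}
```

With the always-available "C" locale, month 0 yields "January". Whether a request for "fr_FR" succeeds depends entirely on what the host has installed, which is the portability problem for the regression tests.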

Usage example:
postgres=# set lc_time to 'fr_FR';
SET
postgres=# select to_date('22 janvier 2016', 'DD TMMonth YYYY');
  to_date
------------
 2016-01-22
(1 row)

[1] https://www.postgresql.org/message-id/10710.1202170898%40sss.pgh.pa.us

Thanks
Mattia

Re: Allow to_date() and to_timestamp() to accept localized month names

From
Tom Lane
Date:
Mattia <mattia@p2pforum.it> writes:
> attached is a patch which adds support to localized month names in
> to_date() and to_timestamp() functions.

Seems like a fine goal.

> I thought about reusing from_char_seq_search() but localized month
> names use different capitalization according to the language grammar,
> so I used pg_strncasecmp to do the match.

pg_str(n)casecmp is really only meant to handle comparisons of ASCII
strings; it will definitely not succeed in case-folding multibyte
characters.  That's not a big problem for to_date's existing usages
but I'm afraid it will be for non-English month names.  I think you'll
need another solution there.  You might have to resort to what citext
does, namely apply the full lower() transformation, at least whenever
the data string actually contains MB characters.
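
To illustrate the concern, here is a minimal sketch of an ASCII-only case-insensitive comparison. ascii_strncasecmp is a hypothetical, simplified stand-in written for this example; it is not the actual pg_strncasecmp source. It folds only the bytes 'A'..'Z', so the bytes of a multibyte character are compared unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Fold a single byte to lower case, ASCII letters only. */
static unsigned char
ascii_lower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char) (c + ('a' - 'A')) : c;
}

/*
 * Hypothetical ASCII-only comparison (an assumption for illustration).
 * Compares at most n bytes; returns 0 when they match after folding
 * ASCII letters, nonzero otherwise.
 */
int
ascii_strncasecmp(const char *a, const char *b, size_t n)
{
    assert(a != NULL && b != NULL);

    while (n-- > 0)
    {
        unsigned char ca = ascii_lower((unsigned char) *a++);
        unsigned char cb = ascii_lower((unsigned char) *b++);

        if (ca != cb)
            return (int) ca - (int) cb;
        if (ca == '\0')
            break;              /* both strings ended */
    }
    return 0;
}
```

Under these assumptions, comparing "JANVIER" with "janvier" returns 0, but comparing UTF-8 "FÉVRIER" with "février" returns nonzero: É is encoded as the bytes C3 89 and é as C3 A9, and byte-wise ASCII folding never maps one to the other.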

> Regression tests with TM modifier are difficult since one should have
> the locale used for the test installed on his system.

I suspect you'll have to give up on putting much about this into the
standard regression tests.  We've used not-run-by-default test scripts
in some similar cases (e.g., collate.linux.utf8.sql), but personally I think
those are 99% a waste of time, precisely because they never actually
get run by anyone but the author.
        regards, tom lane