[HACKERS] Re: CREATE COLLATION does not sanitize ICU's BCP 47 language tags.Should it?

Поиск
Список
Период
Сортировка
От Noah Misch
Тема [HACKERS] Re: CREATE COLLATION does not sanitize ICU's BCP 47 language tags.Should it?
Дата
Msg-id 20170930070112.GA357926@rfd.leadboat.com
обсуждение исходный текст
Ответ на [HACKERS] Re: CREATE COLLATION does not sanitize ICU's BCP 47 language tags.Should it?  (Noah Misch <noah@leadboat.com>)
Ответы Re: [HACKERS] Re: CREATE COLLATION does not sanitize ICU's BCP 47language tags. Should it?  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Список pgsql-hackers
On Mon, Sep 25, 2017 at 08:26:21AM +0000, Noah Misch wrote:
> On Tue, Sep 19, 2017 at 07:01:47PM -0700, Peter Geoghegan wrote:
> > On Tue, Sep 19, 2017 at 5:52 PM, Peter Eisentraut
> > <peter.eisentraut@2ndquadrant.com> wrote:
> > > On 9/18/17 18:46, Peter Geoghegan wrote:
> > >> As I pointed out a couple of times already [1], we don't currently
> > >> sanitize ICU's BCP 47 language tags within CREATE COLLATION.
> > >
> > > There is no requirement that the locale strings for ICU need to be BCP
> > > 47.  ICU locale names like 'de@collation=phonebook' are also acceptable.
> > 
> > Right. But, we only document that BCP 47 is supported by Postgres.
> > Maybe we can use get_icu_language_tag()/uloc_toLanguageTag() to ensure
> > that we end up with BCP 47, even when the user happens to specify the
> > legacy syntax. Should we be "canonicalizing" legacy ICU locale strings
> > as BCP 47, too?
> > 
> > > The reason they are not validated is that, as you know, ICU accepts any
> > > locale string as valid.  You appear to have found a way to do some
> > > validation, but I would like to see that code.
> > 
> > As I mentioned, I'm simply calling
> > get_icu_language_tag()/uloc_toLanguageTag() to do that sanitization.
> > The code to get the extra sanitization is completely trivial.
> > 
> > I didn't post the patch that generates the errors in my opening e-mail
> > because I'm not confident it's correct just yet. And, I think that I
> > see a bigger problem: we pass a string that is almost certainly a BCP
> > 47 string to ucol_open() from within pg_newlocale_from_collation(). We
> > do so despite the fact that ucol_open() apparently doesn't accept BCP
> > 47 syntax locale strings until ICU 54 [1]. Seems entirely possible
> > that this accounts for the problems you saw on ICU 4.2, back when we
> > were still creating keyword variants (I guess that the keyword
> > variants seem very "BCP 47-ish" to me).
> > 
> > I really think we need to add some kind of debug mode that makes ICU
> > optionally spit out a locale display name at key points. We need this
> > to gain confidence that the behavior that ICU provides actually
> > matches what is expected across ICU different versions for different
> > locale formats. We talked about this as a user-facing feature before,
> > which can wait till v11; I just want this to debug problems like this
> > one.
> > 
> > [1] https://ssl.icu-project.org/apiref/icu4c/ucol_8h.html#a4721e4c0a519bb0139a874e191223590
> 
> [Action required within three days.  This is a generic notification.]
> 
> The above-described topic is currently a PostgreSQL 10 open item.  Peter,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> v10 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within three calendar days of
> this message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping v10.  Consequently, I will appreciate your efforts
> toward speedy resolution.  Thanks.
> 
> [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

This PostgreSQL 10 open item is past due for your status update.  On the worst
week to be violating open item policies.  Kindly send a status update within
24 hours, and include a date for your subsequent status update.  Refer to the
policy on open item ownership:
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: [HACKERS] extension build issue with PostgreSQL 10 on Centos6
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: [HACKERS] 64-bit queryId?