Thread: More Norwegian trouble


More Norwegian trouble

From
Heikki Linnakangas
Date:
There was a long thread on the trouble that the Windows "Norwegian 
(Bokmål)" locale name causes, because the locale name is not pure ASCII. 
That was fixed by mapping "Norwegian (Bokmål)" to a pure-ASCII alias of 
it, "norwegian-bokmal". 
(http://www.postgresql.org/message-id/20140915230427.2486.29437@wrigleys.postgresql.org)

I just upgraded my Windows toolchain and, as a leftover from developing 
that patch, I still had my locale set to Norwegian (Bokmål). To my 
surprise, after a rebuild, initdb failed:

FATAL:  new collation (Norwegian_Norway.1252) is incompatible with the collation of the template database (norwegian-bokmal_Norway.1252)
HINT:  Use the same collation as in the template database, or use template0 as template.
STATEMENT:  CREATE DATABASE template0 IS_TEMPLATE = true ALLOW_CONNECTIONS = false;

It works when I pass a locale to initdb explicitly; it only breaks if 
the system locale is set to "Norwegian (Bokmål)" and I let initdb 
use the default.

At first I suspected Noah's commit 
6fdba8ceb071a3512d5685f1cd4f971ab4d562d1, but reverting that made no 
difference.

So unfortunately, that patch that I committed earlier did not completely 
fix this issue. It looks like setlocale() is quite brain-dead on what 
the canonical spelling of that locale is:

setlocale(LC_COLLATE, NULL) -> "Norwegian (Bokmål)_Norway"

but:

setlocale(LC_COLLATE, "norwegian-bokmal_Norway") -> "Norwegian_Norway"
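
For anyone who wants to check their own toolchain, a minimal sketch of a probe follows. It only uses the standard setlocale() call; the helper names probe_collate() and report_collate() are made up for this example and are not part of PostgreSQL.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/*
 * Ask the C runtime to set LC_COLLATE to `requested` (NULL means "query
 * the current setting") and copy the name it reports back into `out`.
 * Returns 1 on success, 0 if the runtime rejected the name.
 */
int
probe_collate(const char *requested, char *out, size_t outlen)
{
    const char *result;

    assert(out != NULL && outlen > 0);

    result = setlocale(LC_COLLATE, requested);
    if (result == NULL)
    {
        out[0] = '\0';
        return 0;
    }

    /* The returned string may be overwritten by later calls, so copy it. */
    strncpy(out, result, outlen - 1);
    out[outlen - 1] = '\0';
    return 1;
}

/* Print one "requested -> reported" line in the style used in this thread. */
void
report_collate(const char *requested)
{
    char        buf[256];
    const char *shown = requested ? requested : "NULL";

    if (probe_collate(requested, buf, sizeof(buf)))
        printf("setlocale(LC_COLLATE, %s) -> \"%s\"\n", shown, buf);
    else
        printf("setlocale(LC_COLLATE, %s) -> rejected\n", shown);
}
```

Calling report_collate("norwegian-bokmal_Norway") and report_collate("Norwegian_Norway") from a small main() on the affected machine should show whether the runtime rewrites the name. Note that each successful call also changes the process's LC_COLLATE setting.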

Apparently the behavior changed when I upgraded the toolchain. IIRC, I 
used to use "Microsoft Windows SDK 7.1", with "Microsoft Visual C++ 
Compilers 2010 Standard Edition" that came with it. I'm now using 
"Microsoft Visual Studio Community Edition 2013 Update 4", with 
"Microsoft Visual C++ Compilers 2010 SP Standard". I don't know what 
part of the upgrade broke this. Could also have been something else; I 
don't keep track of my build environment that carefully.

Now, what should we do about this? I'd like to know if others are seeing 
this, with whatever compiler versions you are using. In particular, I 
wonder if the builds included in the EnterpriseDB installers are 
experiencing this.

Perhaps the nicest fix would be to change the mapping code to map the 
problematic locale name to "Norwegian_Norway" instead of 
"norwegian-bokmal". That's assuming that it is in fact the same locale, 
and that it's accepted on all supported Windows versions. Another option 
is to also map "Norwegian_Norway" to "norwegian-bokmal_Norway", even 
though "Norwegian_Norway" doesn't contain any non-ASCII characters and 
wouldn't be a problem as such. That seems like a safer option.
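
Both options hinge on whether a locale name is pure ASCII. As a rough illustration of that test (a sketch only; locale_name_is_ascii() is a hypothetical helper, not an existing function):

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if every byte of `name` is 7-bit ASCII, 0 otherwise. */
int
locale_name_is_ascii(const char *name)
{
    const unsigned char *p;

    assert(name != NULL);

    for (p = (const unsigned char *) name; *p != '\0'; p++)
    {
        if (*p > 127)
            return 0;
    }
    return 1;
}
```

By this test "Norwegian_Norway" and "norwegian-bokmal_Norway" are both harmless, while "Norwegian (Bokmål)_Norway" is not, whichever encoding the å arrives in.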

It would be good to do something about this before the next minor 
release, as the original mapping commit has not been released yet.

- Heikki



Re: More Norwegian trouble

From
Noah Misch
Date:
On Thu, Jan 08, 2015 at 04:37:37PM +0200, Heikki Linnakangas wrote:
> setlocale(LC_COLLATE, NULL) -> "Norwegian (Bokmål)_Norway"
> 
> but:
> 
> setlocale(LC_COLLATE, "norwegian-bokmal_Norway") -> "Norwegian_Norway")

> Apparently the behavior changed when I upgraded the toolchain. IIRC, I used
> to use "Microsoft Windows SDK 7.1", with "Microsoft Visual C++ Compilers
> 2010 Standard Edition" that came with it. I'm now using "Microsoft Visual
> Studio Community Edition 2013 Update 4", with "Microsoft Visual C++
> Compilers 2010 SP Standard". I don't know what part of the upgrade broke
> this. Could also have been something else; I don't keep track of my build
> environment that carefully.

MSVCR110 (Visual Studio 2012) locale handling departed significantly from that
of its predecessors; see comments at IsoLocaleName().

> Now, what should we do about this? I'd like to know if others are seeing
> this, with whatever compiler versions you are using.

VS2012 x64 behaves roughly as you describe:

setlocale(LC_COLLATE, NULL)                        -> "Norwegian (Bokmål)_Norway"
setlocale(LC_COLLATE, "norwegian-bokmal_Norway")   -> "Norwegian_Norway.1252"
setlocale(LC_COLLATE, "Norwegian_Norway")          -> "Norwegian_Norway.1252"
setlocale(LC_COLLATE, "Norwegian (Bokmål)_Norway") -> "Norwegian (Bokmål)_Norway"

I see the traditional behavior with 64-bit MinGW-w64 (MSVCRT):

setlocale(LC_COLLATE, NULL)                      -> "Norwegian (Bokmål)_Norway"
setlocale(LC_COLLATE, "norwegian-bokmal_Norway") -> "Norwegian (Bokmål)_Norway"
setlocale(LC_COLLATE, "Norwegian_Norway")        -> "Norwegian (Bokmål)_Norway"

> In particular, I wonder
> if the builds included in the EnterpriseDB installers are experiencing this.

I strongly suspect those builds use VS2012 for some of the newer branches, so
they will be affected.

> Perhaps the nicest fix would be to change the mapping code to map the
> problematic locale name to "Norwegian_Norway" instead of "norwegian-bokmal".
> That's assuming that it is in fact the same locale, and that it's accepted
> on all supported Windows versions.

I bet it is always accepted and always refers to the same locale.  IIRC,
interpretation of these names falls entirely within the CRT.  Windows system
libraries have no concept of these naming schemes.

> It would be good to do something about this before the next minor release,
> as the original mapping commit has not been released yet.

+1



Re: More Norwegian trouble

From
Heikki Linnakangas
Date:
On 01/16/2015 09:13 AM, Noah Misch wrote:
> On Thu, Jan 08, 2015 at 04:37:37PM +0200, Heikki Linnakangas wrote:
>> setlocale(LC_COLLATE, NULL) -> "Norwegian (Bokmål)_Norway"
>>
>> but:
>>
>> setlocale(LC_COLLATE, "norwegian-bokmal_Norway") -> "Norwegian_Norway")
>
>> Apparently the behavior changed when I upgraded the toolchain. IIRC, I used
>> to use "Microsoft Windows SDK 7.1", with "Microsoft Visual C++ Compilers
>> 2010 Standard Edition" that came with it. I'm now using "Microsoft Visual
>> Studio Community Edition 2013 Update 4", with "Microsoft Visual C++
>> Compilers 2010 SP Standard". I don't know what part of the upgrade broke
>> this. Could also have been something else; I don't keep track of my build
>> environment that carefully.
>
> MSVCR110 (Visual Studio 2012) locale handling departed significantly from that
> of its predecessors; see comments at IsoLocaleName().
>
>> Now, what should we do about this? I'd like to know if others are seeing
>> this, with whatever compiler versions you are using.
>
> VS2012 x64 behaves roughly as you describe:
>
> setlocale(LC_COLLATE, NULL)                        -> "Norwegian (Bokmål)_Norway"
> setlocale(LC_COLLATE, "norwegian-bokmal_Norway")   -> "Norwegian_Norway.1252"
> setlocale(LC_COLLATE, "Norwegian_Norway")          -> "Norwegian_Norway.1252"
> setlocale(LC_COLLATE, "Norwegian (Bokmål)_Norway") -> "Norwegian (Bokmål)_Norway"
>
> I see the traditional behavior with 64-bit MinGW-w64 (MSVCRT):
>
> setlocale(LC_COLLATE, NULL)                      -> "Norwegian (Bokmål)_Norway"
> setlocale(LC_COLLATE, "norwegian-bokmal_Norway") -> "Norwegian (Bokmål)_Norway"
> setlocale(LC_COLLATE, "Norwegian_Norway")        -> "Norwegian (Bokmål)_Norway"
>
>> In particular, I wonder
>> if the builds included in the EnterpriseDB installers are experiencing this.
>
> I strongly suspect those builds use VS2012 for some of the newer branches, so
> they will be affected.
>
>> Perhaps the nicest fix would be to change the mapping code to map the
>> problematic locale name to "Norwegian_Norway" instead of "norwegian-bokmal".
>> That's assuming that it is in fact the same locale, and that it's accepted
>> on all supported Windows versions.
>
> I bet it is always accepted and always refers to the same locale.  IIRC,
> interpretation of these names falls entirely within the CRT.  Windows system
> libraries have no concept of these naming schemes.

OK, thanks for checking. I've committed a fix that way, mapping 
"Norwegian (Bokmål)_Norway" to "Norwegian_Norway". The 
"norwegian-bokmal" alias isn't used for anything anymore.
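
To illustrate the idea (this is a simplified sketch, not the committed code; map_locale_name() and its matching rule are assumptions made for the example): the name is matched on its ASCII parts only, so the å does not have to appear in the source file and its encoding does not matter.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy `name` into `out`, rewriting a leading "Norwegian (Bokm?l)" to
 * "Norwegian". Whatever bytes sit between "Bokm" and "l)" are skipped,
 * so the match works for a one-byte or a multi-byte "å".
 * Returns 1 if the result fit into `out`, 0 if it was truncated.
 */
int
map_locale_name(const char *name, char *out, size_t outlen)
{
    static const char start[] = "Norwegian (Bokm";
    static const char end[] = "l)";
    static const char replacement[] = "Norwegian";
    const char *rest = NULL;
    int         n;

    assert(name != NULL && out != NULL && outlen > 0);

    if (strncmp(name, start, sizeof(start) - 1) == 0)
        rest = strstr(name + sizeof(start) - 1, end);

    if (rest != NULL)
    {
        /* Keep everything after "l)", e.g. "_Norway" or "_Norway.1252". */
        rest += sizeof(end) - 1;
        n = snprintf(out, outlen, "%s%s", replacement, rest);
    }
    else
    {
        /* Not the problematic name: pass it through unchanged. */
        n = snprintf(out, outlen, "%s", name);
    }

    return n >= 0 && (size_t) n < outlen;
}
```

With this sketch, "Norwegian (Bokmål)_Norway" comes out as "Norwegian_Norway", and any other name is left alone.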

- Heikki