Thread: setlocale() on Windows is broken

setlocale() on Windows is broken

From:
Heikki Linnakangas
Date:
While looking through old emails, I bumped into this:

http://archives.postgresql.org/message-id/25219.1303306707@sss.pgh.pa.us

To recap, setlocale() on Windows is broken for locale names that contain 
dots or apostrophes in the country name. That includes "Hong Kong 
S.A.R.", "Macau S.A.R.", "U.A.E." and "People's Republic of China".

In April, I put in a hack to initdb to map those problematic names to 
aliases that don't contain dots:

People's Republic of China -> China
Hong Kong S.A.R. -> HKG
U.A.E. -> ARE
Macau S.A.R. -> ZHM

However, Hiroshi pointed out in the thread linked above that that 
doesn't completely solve the problem. If you set locale to "HKG", for 
example, setlocale(LC_ALL, NULL) still returns the full name, "Hong Kong 
S.A.R.", and if you feed that back to setlocale() it fails. In 
particular, check_locale() uses "saved = setlocale(LC_XXX, NULL)" to get 
the current value, and tries to restore it later with "setlocale(LC_XXX, 
saved)".


At first, I thought I should revert my hack in initdb, since it's not 
fully solving the problem anyway. But reverting wouldn't really help: you 
run into the same issue if you set locale to one of those aliases manually. 
And that's exactly what users will have to do if we don't map those 
locales automatically.

Microsoft should fix their bug. I don't have much faith in that 
happening, however. So, I think we should move the mapping from initdb 
to somewhere in src/port, so that the mapping is done every time 
setlocale() is called. That would fix the problem with check_locale(): 
even though "setlocale(LC_XXX, NULL)" returns a value that won't work, 
the setlocale() call to restore it would map it to an alias that does 
work again.
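To make the idea concrete, here is a minimal sketch of such a wrapper. It is not the committed patch: the names map_locale_name() and mapped_setlocale() and the 256-byte buffer are assumptions made up for illustration, and only the four name/alias pairs listed above are taken from this thread. It replaces the first problematic country name it finds, so a compound LC_ALL string with several categories would need more work.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Problematic country names and their aliases, as listed in this thread. */
struct locale_alias
{
    const char *country;
    const char *alias;
};

static const struct locale_alias locale_aliases[] = {
    {"People's Republic of China", "China"},
    {"Hong Kong S.A.R.", "HKG"},
    {"U.A.E.", "ARE"},
    {"Macau S.A.R.", "ZHM"},
};

/*
 * Copy "name" into "buf", replacing the first problematic country name
 * with its alias. Returns buf, or NULL if the result does not fit.
 * (Hypothetical helper, not a PostgreSQL function.)
 */
const char *
map_locale_name(const char *name, char *buf, size_t buflen)
{
    size_t      i;
    int         n;

    assert(name != NULL && buf != NULL);

    for (i = 0; i < sizeof(locale_aliases) / sizeof(locale_aliases[0]); i++)
    {
        const char *country = locale_aliases[i].country;
        const char *hit = strstr(name, country);

        if (hit != NULL)
        {
            /* prefix before the match + alias + remainder after the match */
            n = snprintf(buf, buflen, "%.*s%s%s",
                         (int) (hit - name), name,
                         locale_aliases[i].alias,
                         hit + strlen(country));
            return (n >= 0 && (size_t) n < buflen) ? buf : NULL;
        }
    }

    /* Nothing to map: pass the name through unchanged. */
    n = snprintf(buf, buflen, "%s", name);
    return (n >= 0 && (size_t) n < buflen) ? buf : NULL;
}

/*
 * Hypothetical wrapper: map the name on every call, then hand it to the
 * C library's setlocale().
 */
char *
mapped_setlocale(int category, const char *locale)
{
    char        buf[256];       /* assumed to be large enough */
    const char *mapped;

    if (locale == NULL)
        return setlocale(category, NULL);   /* query only, nothing to map */

    mapped = map_locale_name(locale, buf, sizeof(buf));
    if (mapped == NULL)
        return NULL;
    return setlocale(category, mapped);
}
```

With this in place, a name such as "Chinese_Hong Kong S.A.R..950" returned by setlocale(LC_XXX, NULL) would be turned into "Chinese_HKG.950" before it is fed back to the C library.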

In addition to that, I think we should check the return value of 
setlocale() in check_locale(), and throw a warning if restoring the old 
locale fails. The session's locale will still be screwed, but at least 
you'll know if it happens.
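The save/check/restore pattern with the extra return-value check could look roughly like the sketch below. It is a standalone approximation, not the real check_locale(): the name check_locale_sketch() is made up, and it uses malloc() and a message on stderr where the backend would use its own memory and error-reporting routines.

```c
#include <assert.h>
#include <locale.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return true if "locale" is accepted for "category". The current setting
 * is saved first and restored afterwards; a warning is printed if the
 * restore fails. (Hypothetical standalone version.)
 */
bool
check_locale_sketch(int category, const char *locale)
{
    const char *current;
    char       *saved;
    bool        ok;

    assert(locale != NULL);

    current = setlocale(category, NULL);
    if (current == NULL)
        return false;

    /* Copy it: later setlocale() calls may overwrite the returned string. */
    saved = malloc(strlen(current) + 1);
    if (saved == NULL)
        return false;
    strcpy(saved, current);

    /* Try the candidate locale. */
    ok = (setlocale(category, locale) != NULL);

    /* Restore the old setting, and complain if that does not work. */
    if (setlocale(category, saved) == NULL)
        fprintf(stderr, "WARNING: failed to restore old locale \"%s\"\n",
                saved);

    free(saved);
    return ok;
}
```

On Windows the restore step is where the bug bites: "saved" can hold a name like "Hong Kong S.A.R." that setlocale() itself returned but will not accept back.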

I'll go write a patch for that.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: setlocale() on Windows is broken

From:
Heikki Linnakangas
Date:
On 31.08.2011 16:05, Heikki Linnakangas wrote:
> While looking through old emails, I bumped into this:
>
> http://archives.postgresql.org/message-id/25219.1303306707@sss.pgh.pa.us
>
> To recap, setlocale() on Windows is broken for locale names that contain
> dots or apostrophes in the country name. That includes "Hong Kong
> S.A.R.", "Macau S.A.R.", and "U.A.E." and "People's Republic of China".
>
> In April, I put in a hack to initdb to map those problematic names to
> aliases that don't contain dots:
>
> People's Republic of China -> China
> Hong Kong S.A.R. -> HKG
> U.A.E. -> ARE
> Macau S.A.R. -> ZHM
>
> However, Hiroshi pointed out in the thread linked above that that
> doesn't completely solve the problem. If you set locale to "HKG", for
> example, setlocale(LC_ALL, NULL) still returns the full name, "Hong Kong
> S.A.R.", and if you feed that back to setlocale() it fails. In
> particular, check_locale() uses "saved = setlocale(LC_XXX, NULL)" to get
> the current value, and tries to restore it later with "setlocale(LC_XXX,
> saved)".
>
>
> At first, I thought I should revert my hack in initdb, since it's not
> fully solving the problem anyway. But it doesn't really help - you run
> into the same issue if you set locale to one of those aliases manually.
> And that's exactly what users will have to do if we don't map those
> locales automatically.
>
> Microsoft should fix their bug. I don't have much faith in that
> happening, however. So, I think we should move the mapping from initdb
> to somewhere in src/port, so that the mapping is done every time
> setlocale() is called. That would fix the problem with check_locale():
> even though "setlocale(LC_XXX, NULL)" returns a value that won't work,
> the setlocale() call to restore it would map it to an alias that does
> work again.
>
> In addition to that, I think we should check the return value of
> setlocale() in check_locale(), and throw a warning if restoring the old
> locale fails. The session's locale will still be screwed, but at least
> you'll know if it happens.

I've committed a patch along those lines.

It turned out to be pretty difficult to reproduce user-visible buggy 
behavior caused by this bug, so for the sake of the archives, here's a 
recipe for it:

1. Set system locale to "Chinese_Hong Kong S.A.R..950"

2. initdb -D data --locale="Arabic_ARE"

3. Launch psql.

  CREATE TABLE foo (a text);
  INSERT INTO foo VALUES ('a'), ('A');

  -- Verify that the order is 'a', 'A'
  SELECT * FROM foo ORDER BY a;

  -- This fails, as it should
  CREATE DATABASE postgres WITH LC_COLLATE='C' TEMPLATE=template0;

  -- This also fails, as it should
  CREATE DATABASE postgres WITH LC_COLLATE='C' TEMPLATE=template0;

  -- The order returned by this is now wrong: 'A', 'a'
  SELECT * FROM foo ORDER BY a;

It's a bizarre-looking sequence, but that does it.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com