Thread: Differences in Unicode handling on Mac vs Linux?

Differences in Unicode handling on Mac vs Linux?

From
Matt Daw
Date:
Howdy, I loaded a client's DB on my Mac to debug an unrelated bug, but
I'm blocked because my Mac is rejecting SQL that works on our Linux
production servers. Here's a simple case:

# select * from shots where sg_poznÁmka is NULL;
ERROR:  column "sg_pozn�mka" does not exist
LINE 1: select * from shots where sg_poznÁmka is NULL;

... as far as I can tell, all my encodings are consistent on both
sides, I've checked LC_COLLATE, LC_CTYPE, client_encoding,
server_encoding and the database encodings. I'm running 9.0.13 on both
machines (and I tried 9.2.4 on my Mac).

Anything else I could double-check? Or are there any known Mac-related
Unicode issues?

Thanks!

Matt


Re: Differences in Unicode handling on Mac vs Linux?

From
Tom Lane
Date:
Matt Daw <matt@shotgunsoftware.com> writes:
> Howdy, I loaded a client's DB on my Mac to debug an unrelated bug, but
> I'm blocked because my Mac is rejecting SQL that works on our Linux
> production servers. Here's a simple case:

> # select * from shots where sg_poznÁmka is NULL;
> ERROR:  column "sg_pozn�mka" does not exist
> LINE 1: select * from shots where sg_poznÁmka is NULL;

Hm ... what does "\d shots" say about the spelling of the column name?

> Anything else I could double-check? Or are there any known Mac-related
> Unicode issues?

OS X's Unicode locales are pretty crummy.  I'm suspicious that there's
some sort of case-folding inconsistency here, but it's hard to say more
(especially since you didn't actually tell us *which* locales you've
selected on each machine).  If it is that, as a short-term fix it might
help to double-quote the column name.

            regards, tom lane


Re: Differences in Unicode handling on Mac vs Linux?

From
Matt Daw
Date:
> Hm ... what does "\d shots" say about the spelling of the column name?

\d shots is the same on both systems:

 sg_poznÁmka | text |


> OS X's Unicode locales are pretty crummy.  I'm suspicious that there's
> some sort of case-folding inconsistency here, but it's hard to say more
> (especially since you didn't actually tell us *which* locales you've
> selected on each machine).  If it is that, as a short-term fix it might
> help to double-quote the column name.

The locales are set to "en_US.UTF-8" and encodings to "UTF8".
Double-quoting does solve the column case, but it doesn't help with
this Rails-generated query:

SELECT a.attname, format_type(a.atttypid, a.atttypmod), d.adsrc, a.attnotnull
              FROM pg_attribute a LEFT JOIN pg_attrdef d
                ON a.attrelid = d.adrelid AND a.attnum = d.adnum
             WHERE a.attrelid = 'asset_sg_kdo_dělá____assigned_to__connections'::regclass
               AND a.attnum > 0 AND NOT a.attisdropped
             ORDER BY a.attnum

... that produces:

ERROR:  relation "asset_sg_kdo_d�l�____assigned_to__connections" does not exist

\d produces:

 public | asset_sg_kdo_dělá____assigned_to__connections | table | matt
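
A regclass input string is parsed like an identifier, so an unquoted name inside the string literal is case-folded the same way as a bare column name. Double quotes placed inside the literal suppress that folding. The Python sketch below builds such a literal. The helper names quote_ident, quote_literal and regclass_literal are hypothetical and are not part of Rails or any driver, and the thread does not confirm that this avoids the error on the affected Mac.

```python
# Hypothetical helpers for illustration; not taken from Rails or a DB driver.

def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(text: str) -> str:
    """Wrap text in single quotes, doubling embedded single quotes.

    Assumption: the text contains no backslashes, so the setting of
    standard_conforming_strings does not matter.
    """
    return "'" + text.replace("'", "''") + "'"


def regclass_literal(table_name: str) -> str:
    """Build a regclass cast whose inner double quotes prevent case folding."""
    return quote_literal(quote_ident(table_name)) + "::regclass"


if __name__ == "__main__":
    print(regclass_literal("asset_sg_kdo_dělá____assigned_to__connections"))
```

Since the query is generated by Rails, a change like this would have to be made wherever the adapter builds the literal, so it is at most a diagnostic aid.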


For the short term, I think I'll boot up a Linux VM to troubleshoot my
production bug... but I'll submit a bug report with repro steps.

Thanks Tom!

Matt


Re: Differences in Unicode handling on Mac vs Linux?

From
Ian Lawrence Barwick
Date:
2013/6/3 Tom Lane <tgl@sss.pgh.pa.us>:
> Matt Daw <matt@shotgunsoftware.com> writes:
>> Howdy, I loaded a client's DB on my Mac to debug an unrelated bug, but
>> I'm blocked because my Mac is rejecting SQL that works on our Linux
>> production servers. Here's a simple case:
>
>> # select * from shots where sg_poznÁmka is NULL;
>> ERROR:  column "sg_pozn�mka" does not exist
>> LINE 1: select * from shots where sg_poznÁmka is NULL;
>
> Hm ... what does "\d shots" say about the spelling of the column name?
>
>> Anything else I could double-check? Or are there any known Mac-related
>> Unicode issues?
>
> OS X's Unicode locales are pretty crummy.  I'm suspicious that there's
> some sort of case-folding inconsistency here, but it's hard to say more
> (especially since you didn't actually tell us *which* locales you've
> selected on each machine).  If it is that, as a short-term fix it might
> help to double-quote the column name.

I can recreate something similar (OS X 10.7, 9.3beta1):

postgres=# CREATE TABLE shots (id int);
CREATE TABLE
postgres=# SHOW client_encoding ;
 client_encoding
-----------------
 UTF8
(1 row)

postgres=# select * from shots where col_ä is NULL;
ERROR:  column "col_�" does not exist
LINE 1: select * from shots where col_ä is NULL;

The corresponding log output is:

ERROR:  column "col_<E3><A4>" does not exist at character 27
STATEMENT:  select * from shots where col_ä is NULL;
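
One possible reading of the <E3><A4> bytes in that log line, sketched in Python: "ä" is C3 A4 in UTF-8, and if identifier folding lowercased each byte separately while treating high-bit bytes as Latin-1 letters, C3 ("Ã") would become E3 ("ã") and A4 would stay as it is. That byte-wise behaviour is an assumption made for illustration; the code below is not PostgreSQL source, and the function names are made up.

```python
# Illustration only; this is not PostgreSQL source code.
# Assumption: unquoted identifiers are lowercased one byte at a time, and the
# platform's lowercase mapping treats each high-bit byte as a Latin-1 letter.

def fold_bytewise_latin1(raw: bytes) -> bytes:
    """Lowercase every byte independently, reading it as a Latin-1 character."""
    # Latin-1 maps bytes 1:1 to code points, and lowercasing a Latin-1
    # character always yields another Latin-1 character, so this round-trips.
    return raw.decode("latin-1").lower().encode("latin-1")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    original = "col_\u00e4".encode("utf-8")  # "col_ä": the last two bytes are c3 a4
    folded = fold_bytewise_latin1(original)
    print(original, "->", folded)
    print("folded name is valid UTF-8:", is_valid_utf8(folded))
```

Under the same assumption, the "Á" (C3 81) in sg_poznÁmka would turn into E3 81, which is also not valid UTF-8 when followed by "m".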

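Double-quoting the column name does seem to "work", in that the name
now comes through intact (the error remains only because the test
table has no such column):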

postgres=# select * from shots where "col_ä" is NULL;
ERROR:  column "col_ä" does not exist
LINE 1: select * from shots where "col_ä" is NULL;

The only language/locale settings I see in my environment are:

LANG=en_GB.UTF-8
__CF_USER_TEXT_ENCODING=0x1F6:0:2


Regards

Ian Barwick