BUG #16222: [[:print:]] doesn't correctly handle Emoji skin tone modifiers on MacOS

From:
PG Bug reporting form <noreply@postgresql.org>
Date:
The following bug has been logged on the website:

Bug reference:      16222
Logged by:          Mack Earnhardt
Email address:      mack@agilereasoning.com
PostgreSQL version: 11.6
Operating system:   MacOS Catalina
Description:        

On Linux heroku-18, these expressions both eval true:

select '✌'~'\A[[:print:]]*\Z';
select '✌🏻'~'\A[[:print:]]*\Z';

On MacOS Catalina, the 1st evals true but the 2nd evals false.
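
For readers reproducing the report, it may help to see what separates the two literals. The sketch below assumes the second string is U+270C followed by the skin tone modifier U+1F3FB (the literals are written as escapes to make that assumption explicit). It uses Python's bundled Unicode database, which does not depend on the operating system's locale. The helper name `describe` is invented for this illustration.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name, general category, isprintable) per character."""
    return [
        (
            f"U+{ord(ch):04X}",
            unicodedata.name(ch, "<unnamed>"),
            unicodedata.category(ch),
            ch.isprintable(),
        )
        for ch in text
    ]


if __name__ == "__main__":
    # Assumed contents of the two test strings from the report.
    for literal in ("\u270c", "\u270c\U0001f3fb"):
        for row in describe(literal):
            print(row)
        print()
```

Under that assumption the second literal is exactly one code point longer than the first, so any difference in the regex result comes down to how the platform classifies that one extra character.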

Re: BUG #16222: [[:print:]] doesn't correctly handle Emoji skin tone modifiers on MacOS

From:
Tom Lane <tgl@sss.pgh.pa.us>
Date:
PG Bug reporting form <noreply@postgresql.org> writes:
> On Linux heroku-18, these expressions both eval true:

> select '✌'~'\A[[:print:]]*\Z';
> select '✌🏻'~'\A[[:print:]]*\Z';

> On MacOS Catalina, the 1st evals true but the 2nd evals false.

This is entirely a function of what your operating system's
locale support does.  So it could be that you chose the wrong
LC_CTYPE setting for the macOS database -- in C locale, for
example, "false" is the right answer.  However, we've observed
that macOS's UTF8-based locales seem pretty brain-dead about
handling of multibyte characters :-(.  So it's likely that this
boils down to being Apple's bug.  I haven't detected any interest
on their part in improving their POSIX locale support, unfortunately.

			regards, tom lane
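
The dependence on the operating system's locale data can be checked outside the database. The following is a rough sketch, not PostgreSQL's code: it assumes that the C library's `iswprint()` is representative of what the regex engine consults for `[[:print:]]` in a UTF-8 database, and that the script runs on a POSIX system where `ctypes.CDLL(None)` exposes the C library. The helper name `printable_in_locale` is hypothetical.

```python
import ctypes
import locale

# Symbols already linked into the Python process, which on POSIX
# systems include the C library (assumption: not Windows).
_libc = ctypes.CDLL(None)
_libc.iswprint.argtypes = [ctypes.c_uint]
_libc.iswprint.restype = ctypes.c_int


def printable_in_locale(text, locale_name):
    """Report iswprint()'s verdict for each code point under locale_name.

    Returns None when the requested locale is not installed.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, locale_name)
    except locale.Error:
        return None
    try:
        return [(f"U+{ord(ch):04X}", bool(_libc.iswprint(ord(ch)))) for ch in text]
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # restore the original locale


if __name__ == "__main__":
    # Assumed test string: victory hand followed by a skin tone modifier.
    for name in ("C", "en_US.UTF-8"):
        print(name, printable_in_locale("\u270c\U0001f3fb", name))
```

Running this on both machines should show whether the two C libraries disagree about the modifier character under the same locale name. No expected output is given here because the result is, as noted above, platform-dependent.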


Re: BUG #16222: [[:print:]] doesn't correctly handle Emoji skin tone modifiers on MacOS

From:
Mack Earnhardt <mack@agilereasoning.com>
Date:
Hi Tom,

You’re correct. I thought the fact that Terminal and Vim both display correct-ish was enough to rule out the OS. It wasn’t.

The database LC_CTYPE is set to en_US.UTF-8, as is my bash terminal. When I put the two queries in a text file and use `egrep '^[[:print:]]+$'`, only the first line is recognized.

Thanks for helping me narrow this down!

-M

> On Jan 21, 2020, at 12:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> PG Bug reporting form <noreply@postgresql.org> writes:
>> On Linux heroku-18, these expressions both eval true:
> 
>> select '✌'~'\A[[:print:]]*\Z';
>> select '✌🏻'~'\A[[:print:]]*\Z';
> 
>> On MacOS Catalina, the 1st evals true but the 2nd evals false.
> 
> This is entirely a function of what your operating system's
> locale support does.  So it could be that you chose the wrong
> LC_CTYPE setting for the macOS database -- in C locale, for
> example, "false" is the right answer.  However, we've observed
> that macOS's UTF8-based locales seem pretty brain-dead about
> handling of multibyte characters :-(.  So it's likely that this
> boils down to being Apple's bug.  I haven't detected any interest
> on their part in improving their POSIX locale support, unfortunately.
> 
> 			regards, tom lane


