Re: C locale versus en_US.UTF8. (Was: String comparision in PostgreSQL)

From: Scott Marlowe
Subject: Re: C locale versus en_US.UTF8. (Was: String comparision in PostgreSQL)
Date:
Msg-id CAOR=d=0taNujCoFDQ6g=n-yLD2xOfnE0dsu8AykgNqBK5+gQcQ@mail.gmail.com
In response to: Re: C locale versus en_US.UTF8. (Was: String comparision in PostgreSQL)  (Bruce Momjian <bruce@momjian.us>)
Responses: Re: C locale versus en_US.UTF8. (Was: String comparision in PostgreSQL)  (Bruce Momjian <bruce@momjian.us>)
List: pgsql-general
On Wed, Aug 29, 2012 at 11:43 AM, Bruce Momjian <bruce@momjian.us> wrote:
> On Wed, Aug 29, 2012 at 10:31:21AM -0700, Aleksey Tsalolikhin wrote:
>> On Wed, Aug 29, 2012 at 9:45 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> > citext unfortunately doesn't allow for index optimization of LIKE
>> > queries, which IMNSHO defeats the whole purpose.  to the best way
>> > remains to use lower() ...
>> > this will be index optimized and fast as long as you specified C
>> > locale for your database.
>>
>> What is the difference between C and en_US.UTF8, please?  We see that
>> the same query (that invokes a sort) runs 15% faster under the C
>> locale.  The output between C and en_US.UTF8 is identical.  We're
>> considering moving our database from en_US.UTF8 to C, but we do deal
>> with internationalized text.
>
> Well, C has reduced overhead for string comparisons, but obviously
> doesn't work well for international characters.  The single-byte
> encodings have somewhat less overhead than UTF8.  You can try using C
> locales for databases that don't require non-ASCII characters.

I think you're confusing encodings with locales.  C is a locale. You
can have a database with a locale of C and UTF-8 encoding.

CREATE DATABASE clocale_utf8 ENCODING = 'UTF8' LC_COLLATE = 'C' TEMPLATE = template0;

\l
     Name     |  Owner   | Encoding  |   Collate   |    Ctype    |   Access privileges
--------------+----------+-----------+-------------+-------------+-----------------------
 clocale_utf8 | smarlowe | UTF8      | C           | en_US.UTF-8 |
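
As a rough illustration of what the C collation changes, here is a
minimal sketch that sorts a few made-up values with an explicit
COLLATE "C" clause.  The sample strings are hypothetical, and the
per-expression COLLATE syntax assumes PostgreSQL 9.1 or later.

```sql
-- Hypothetical sample data, not taken from the thread.
-- COLLATE "C" compares strings byte by byte, so every uppercase
-- ASCII letter sorts before every lowercase one.
SELECT name
FROM (VALUES ('apple'), ('Zebra'), ('banana')) AS t(name)
ORDER BY name COLLATE "C";
-- Order under "C": Zebra, apple, banana
```

A linguistic collation such as en_US would normally put apple and
banana ahead of Zebra, so it is worth checking that kind of difference
before moving a database with internationalized text to C.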


SQL_ASCII is the encoding equivalent of the C locale: it does no
encoding validation or conversion, so it also allows multi-byte
characters, which are stored as uninterpreted bytes.

