Re: ICU integration

From: Peter Geoghegan
Subject: Re: ICU integration
Date:
Message-ID: CAM3SWZQ1uSbrVqmQAqLCtTTrM4Q47=9QByJALKnsyPAxdxJbcw@mail.gmail.com
In reply to: Re: ICU integration  (Dave Page <dpage@pgadmin.org>)
List: pgsql-hackers
On Fri, Sep 9, 2016 at 6:39 AM, Dave Page <dpage@pgadmin.org> wrote:
> Looking back at my old emails, apparently ICU 5.0 and later include
> ucol_strcollUTF8() which avoids the need to convert UTF-8 characters
> to 16 bit before sorting. RHEL 6 has the older 4.2 version of ICU.

At the risk of stating the obvious, there is a reason why ICU
traditionally worked with UTF-16 natively. It's the same reason why
many OSes and application frameworks (e.g., Java) use UTF-16
internally, even though UTF-8 is much more popular on the web: certain
low-level optimizations are possible with UTF-16 that are not possible
with UTF-8.
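
As an illustration only (the message does not say which optimizations
it has in mind), one property often cited for UTF-16 is that every
character in the Basic Multilingual Plane occupies exactly one 16-bit
unit, whereas UTF-8 needs between one and four bytes per character. A
minimal Python sketch of that difference follows; the helper name
`code_units` and the sample string are hypothetical and chosen here
purely for illustration.

```python
# Illustrative sketch (not from the original message): compare how many
# code units each character needs in UTF-8 versus UTF-16.

def code_units(text):
    """Return a list of (char, utf8_bytes, utf16_units) for each character."""
    rows = []
    for ch in text:
        utf8_len = len(ch.encode("utf-8"))            # 1 to 4 bytes
        utf16_len = len(ch.encode("utf-16-le")) // 2  # 1 or 2 16-bit units
        rows.append((ch, utf8_len, utf16_len))
    return rows


# ASCII letter, Latin-1 letter, BMP symbol, and a non-BMP emoji.
SAMPLE = "a\u00e9\u20ac\U0001F600"

if __name__ == "__main__":
    for ch, u8, u16 in code_units(SAMPLE):
        print(f"U+{ord(ch):04X}: {u8} UTF-8 byte(s), {u16} UTF-16 unit(s)")
```

Only the last character, which lies outside the BMP, needs two UTF-16
units (a surrogate pair); the first three each need one, while their
UTF-8 lengths are 1, 2 and 3 bytes. Whether this matters for collation
speed in practice is exactly the kind of thing that has to be measured
rather than assumed.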

I'm not saying that it would be just as good if we were to not use the
UTF-8 optimized routines that ICU now has, such as ucol_strcollUTF8().
My point is that it's not useful to prejudge whether or not performance
will be acceptable based on a factor like this, which is ultimately
just an implementation detail. The ICU patch either performs acceptably
as a substitute for something like glibc, or it does not.

-- 
Peter Geoghegan


