Re: ICU integration

From: Doug Doole
Subject: Re: ICU integration
Message-ID: CAP6UvaMTJYCxSBqhOnMwTS-vu=u7wvut-3k6TQ4eddtnSd4a1Q@mail.gmail.com
In reply to: Re: ICU integration (Peter Geoghegan <pg@heroku.com>)
List: pgsql-hackers

> This isn't a problem for Postgres, or at least wouldn't be right now,
> because we don't have case insensitive collations.

I was wondering if Postgres might be that way. It does avoid the RI (referential integrity) constraint problem, but there are still troubles with range-based predicates. (My previous project wanted case- and accent-insensitive collations, so we had to deal with all of it.)
 
> So, we use a strcmp()/memcmp() tie-breaker when strcoll() indicates equality, while also making the general notion of text equality actually mean binary equality.

We used a similar tie-breaker in places. For example, index keys needed to be identical, not just equal, and we also broke ties in sort to make its behaviour more deterministic.
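
A minimal sketch of the tie-breaker idea, to make the discussion concrete. It is not code from Postgres or from the other project: ci_collate() is a hypothetical, ASCII-only stand-in for a case-insensitive collation (a real system would call strcoll() or an ICU collator there), and compare_with_tiebreak() and qsort_tiebreak() are names invented for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a case-insensitive collation: compares ASCII
 * letters without regard to case. A real system would call strcoll() or
 * an ICU collator here instead. */
int ci_collate(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a != '\0' && *b != '\0') {
        int ca = tolower((unsigned char) *a);
        int cb = tolower((unsigned char) *b);
        if (ca != cb)
            return (ca < cb) ? -1 : 1;
        a++;
        b++;
    }
    /* At least one string ended: equal if both did, else shorter first. */
    if (*a == '\0' && *b == '\0')
        return 0;
    return (*a == '\0') ? -1 : 1;
}

/* Collation order decides first; byte order breaks ties. A result of 0
 * therefore means the strings are byte-for-byte identical. */
int compare_with_tiebreak(const char *a, const char *b)
{
    int r = ci_collate(a, b);
    if (r != 0)
        return r;
    r = strcmp(a, b);
    return (r > 0) - (r < 0);   /* normalize to -1, 0 or 1 */
}

/* qsort() adapter for an array of C strings. */
int qsort_tiebreak(const void *pa, const void *pb)
{
    const char *a = *(const char *const *) pa;
    const char *b = *(const char *const *) pb;
    return compare_with_tiebreak(a, b);
}
```

Sorting "apple", "APPLE" and "Apple" with ci_collate() alone leaves their relative order up to the sort algorithm and the input order, since all three compare as equal. With qsort_tiebreak() the result is always "APPLE", "Apple", "apple" (ASCII byte order), which is the kind of determinism described above.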

> I would like to get case insensitive collations some day, and was
> really hoping that ICU would help. That being said, the need for a
> strcmp() tie-breaker makes that hard. Oh well.

Before ICU was added to my previous project, that project also assumed that equal meant identical. Breaking this assumption turned out to be a lot easier than I expected, but that code base had religiously used its own string comparison function for user data: strcmp()/memcmp() were never called on user data. (I don't know if the same can be said for Postgres.) We found that very few places needed to be aware of values that were equal but not identical. (Index and sort were the big two.)
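
To illustrate the equal-versus-identical distinction, here is a rough sketch under stated assumptions. None of these functions exist in Postgres or ICU: user_str_compare() is a hypothetical single entry point for comparing user data (implemented here as an ASCII-only case-insensitive comparison standing in for a real collation), and user_str_equal() and user_str_identical() are invented names for the two notions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical single entry point for comparing user data. The body is an
 * ASCII-only, case-insensitive stand-in for a real collation. */
int user_str_compare(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    for (;; a++, b++) {
        int ca = tolower((unsigned char) *a);
        int cb = tolower((unsigned char) *b);
        if (ca != cb)
            return (ca < cb) ? -1 : 1;
        if (ca == '\0')         /* both strings ended together */
            return 0;
    }
}

/* "Equal": the collation cannot tell the two values apart. This is what
 * most callers would want. */
bool user_str_equal(const char *a, const char *b)
{
    return user_str_compare(a, b) == 0;
}

/* "Identical": same bytes. In this sketch, only the few places that care
 * (index keys, sort tie-breaking) would call it. */
bool user_str_identical(const char *a, const char *b)
{
    size_t len = strlen(a);
    return len == strlen(b) && memcmp(a, b, len) == 0;
}
```

Here "Postgres" and "POSTGRES" are equal but not identical. Because every comparison of user data goes through one function, changing what "equal" means is a local change, and only the callers of user_str_identical() need to know that the two notions can differ.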

Hopefully Postgres will be the same.

--
Doug Doole
