Re: sortsupport for text

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: sortsupport for text
Дата
Msg-id CAEYLb_XpBGAUA=UJ=W2rK8eyMBYYAwBbqy0Q7W-ziMb+qwn1-w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: sortsupport for text  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: sortsupport for text  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On 17 June 2012 17:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm not sure I agree with this decision; why should we presume to know
>> better than the glibc locale what constitutes equality?
>
> The killer reason why it must be like that is that you can't use hash
> methods on text if text equality is some unknown condition subtly
> different from bitwise equality.

Fair enough, but I doubt that we need to revert the changes made in
this commit to texteq in addition to the changes I'd like to see in
order to be semantically self-consistent. That is because there is
often a distinction made between equality and equivalence, and we
could adopt this distinction. strcoll() could be said to be just
making a representation that its two arguments are equivalent (and not
necessarily equal) when it returns 0. This distinction is explicitly
made in the C++ standard library, and failing to understand it can
result in bugs:

http://www.cplusplus.com/reference/algorithm/equal_range/

Note the use of the word "equivalent" rather than "equal" in the text.
equal_range is a bit of a misnomer.

This distinction is important enough to have warranted an entire
subsection of the book "Effective STL" by Scott Meyers, a
well-respected expert on the language. This comes up more often than
you'd think - "std::set::insert" determines if an element already
exists (to know if it must replace it) based on equivalency (usually,
though not necessarily, defined in terms of operator< ), whereas the
"find" algorithm finds elements based on equality (operator==).

> My recollection is that there were
> some other problems as well, but I'm too lazy to search the archives
> for you.

Fair enough. I'll search for it myself later. I'm about to head out now.

>> It's seems very likely that the main
>> one was the then-need to guard against poor quality qsort()
>> implementations that went quadratic in the face of lots of duplicates,
>
> No, I don't recall that that had anything to do with it.

Oh, okay. It looked very much like the "avoid equality at all costs"
thing you still see some of in tuplesort.c .

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: libpq compression
Следующее
От: Tom Lane
Дата:
Сообщение: Re: sortsupport for text