Re: sortsupport for text

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: sortsupport for text
Дата
Msg-id CAEYLb_UU=h=oFs+b96uNPJTbiQvsnHJPs5kV7pbPCxxKoTNPzQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: sortsupport for text  (Florian Pflug <fgp@phlo.org>)
Список pgsql-hackers
On 21 June 2012 11:40, Florian Pflug <fgp@phlo.org> wrote:
> At this point, my theory is that your choice of "random" strings prevents
> strxfrm() from ever winning over strcoll(). The reason being that you pick
> each letter uniformly distributed from a-z, resulting in a probability of
> two string differing in the first of 1 - 1/26 =~ 96%. Thus, even for
> extremely complex collation rules, strcoll() probably only needs to compare
> a few characters to determine the order. Whereas strxfrm() has to compute
> the whole sort key, no matter what.

Good point.

> The question is thus, how good a model are your "random" strings for the
> input of a typical sorting step in postgres? My guess is, a quite good one
> actually, since people probably don't deal with a lot of very similar strings
> very often. Which makes we wonder if using strxfrm() during sorting wouldn't
> be a net loss, all things considered…

Well, I think the answer to that has to be no. I posted a simple
postgres text -> strxfrm blob bytea SQL-callable wrapper function a
few days ago that clearly wins by quite a bit on some sample queries
against the dellstore sample database, which has a representative
sample set. Completely uniformly-distributed data actually isn't
representative of the real world at all. Normal distributions abound.

This C++ program was never intended to justify the general utility of
using strxfrm() for sorting, which I don't believe is in question. I
just wanted to get some idea about the performance characteristics of
using strxfrm() to traverse an index.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Florian Pflug
Дата:
Сообщение: Re: sortsupport for text
Следующее
От: "Etsuro Fujita"
Дата:
Сообщение: Re: not null validation option in contrib/file_fdw