Re: B-Tree support function number 3 (strxfrm() optimization)

From Robert Haas
Subject Re: B-Tree support function number 3 (strxfrm() optimization)
Msg-id CA+Tgmoa=y2RjZjAkWjp2Z+7n3tQmxZQtuikHsNpcWrB6tsT7wg@mail.gmail.com
In response to Re: B-Tree support function number 3 (strxfrm() optimization)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: B-Tree support function number 3 (strxfrm() optimization)  (Stephen Frost <sfrost@snowman.net>)
Re: B-Tree support function number 3 (strxfrm() optimization)  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On Mon, Apr 7, 2014 at 1:23 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> As an utterly trivial point, I
> find the naming to be less than ideal: "poorman" is not a term I want
> to enshrine in our code.  That's not very descriptive of what the
> patch is actually doing even if you know what the idiom means, and
> people whose first language is not English - many of whom do
> significant work on our code - may not.

To throw out one more point that I think is problematic, Peter's
original email on this thread gives a bunch of examples of strxfrm()
normalization that all differ in the first few bytes - but so do
the underlying strings.  I *think* (but don't have time to check right
now) that on my MacOS X box, strxfrm() spits out 3 bytes of header
junk and then 8 bytes per character in the input string - so comparing
the first 8 bytes of the strxfrm()'d representation would amount to
comparing part of the first character.  If for any reason the first character is
the same (or similar enough) on many of the input strings, then this
will probably work out to be slower rather than faster.  Even if other
platforms are more space-efficient (and I think at least some of them
are), I think it's unlikely that this optimization will ever pay off
for strings that don't differ in the first 8 bytes.  And there are
many cases where that could be true a large percentage of the time
throughout the input, e.g. YYYY-MM-DD HH:MM:SS timestamps stored as
text.  It seems likely that the patch pessimizes those cases, though of
course there's no way to know without testing.
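
One rough way to check the concern above on a given platform is to look at
whether the first 8 bytes of the strxfrm() output actually distinguish two
inputs.  The sketch below is not taken from the patch: the helper names
xfrm_prefix() and prefixes_tie(), the PREFIX_LEN constant, and the
PREFIX_DEMO_MAIN guard are all made up for illustration, and the result
depends entirely on the locale and the libc in use.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed prefix width; the 8 comes from the discussion above. */
#define PREFIX_LEN 8

/*
 * Fill out[] with the first PREFIX_LEN bytes of strxfrm(src), zero-padded
 * when the transformed string is shorter.  Returns 0 on success, -1 if
 * memory could not be allocated.
 */
static int xfrm_prefix(const char *src, unsigned char out[PREFIX_LEN])
{
    size_t need = strxfrm(NULL, src, 0);    /* transformed length, sans NUL */
    char *buf = malloc(need + 1);

    if (buf == NULL)
        return -1;
    strxfrm(buf, src, need + 1);
    memset(out, 0, PREFIX_LEN);
    memcpy(out, buf, need < PREFIX_LEN ? need : PREFIX_LEN);
    free(buf);
    return 0;
}

/*
 * Returns 1 if both prefixes are identical (the cheap comparison cannot
 * decide and a full comparison is still needed), 0 if they differ, and
 * -1 on allocation failure.
 */
static int prefixes_tie(const char *a, const char *b)
{
    unsigned char pa[PREFIX_LEN], pb[PREFIX_LEN];

    assert(a != NULL && b != NULL);
    if (xfrm_prefix(a, pa) != 0 || xfrm_prefix(b, pb) != 0)
        return -1;
    return memcmp(pa, pb, PREFIX_LEN) == 0;
}

#ifdef PREFIX_DEMO_MAIN
/* Optional demo: build with -DPREFIX_DEMO_MAIN to try the current locale. */
int main(void)
{
    setlocale(LC_COLLATE, "");
    printf("timestamps tie on prefix: %d\n",
           prefixes_tie("2014-04-07 13:23:00", "2014-04-08 09:00:00"));
    return 0;
}
#endif
```

Running something like this over a sample of real column values, in the
locale the database actually uses, would give a first impression of how
often the prefix comparison ends in a tie and the full comparison has to
run anyway.  It says nothing about the actual sort timings, which still
need to be measured.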

Now it *may well be* that after doing some research and performance
testing we will conclude that either no commonly-used platforms show
any regressions or that the regressions that do occur are discountable
in view of the benefits to more common cases.  I just
don't think mid-April is the right time to start those discussions
with the goal of a 9.4 commit; and I also don't think committing
without having those discussions is very prudent.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


