Re: [v9.2] make_greater_string() does not return a string in some cases

Поиск

Список

Период

Сортировка

От	Tom Lane
Тема	Re: [v9.2] make_greater_string() does not return a string in some cases
Дата	29 октября 2011 г. 20:36:16
Msg-id	29562.1319920566@sss.pgh.pa.us обсуждение исходный текст
Ответ на	Re: [v9.2] make_greater_string() does not return a string in some cases (Robert Haas <robertmhaas@gmail.com>)
Ответы	Re: [v9.2] make_greater_string() does not return a string in some cases (Robert Haas <robertmhaas@gmail.com>)
Список	pgsql-hackers

Дерево обсуждения

Robert Haas <robertmhaas@gmail.com> writes:
> On Sat, Oct 29, 2011 at 3:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Ummm ... why do the incrementer functions think they need to restore the
>> previous value on failure?  AFAICS that's a waste of code and cycles,
>> since there is only one caller and it doesn't care in the least.

> Well, it might not be strictly necessary for pg_utf8_increment() and
> pg_eucjp_increment(), but it's clearly necessary for the generic
> incrementer function for exactly the same reason it was needed in the
> old coding.  I suppose we could weaken the rule to "you must leave a
> valid character behind rather than a bunch of bytes that doesn't
> encode to a character", but the cycle savings are negligible and the
> current rule seems both simpler and more bullet-proof.

No, it's *not* necessary any more, AFAICS.  make_greater_string discards
the data and overwrites it with a null instantly upon getting a failure
return from the incrementer.  The reason we used to need it was that we
did pg_mbcliplen after failing to increment, but now we do that before
we ever increment anything, so we already know the length of the last
character.  It doesn't matter whether those bytes are still valid or
contain garbage.

>> I'm also quite distressed that you ignored my advice to limit the number
>> of combinations tried. �This patch could be horribly slow when dealing
>> with wide characters, eg think what will happen when starting from
>> U+10000.

> Uh, I think it will try at most one character in that position and
> then truncate away that character entirely, per my last email on this
> topic (to which you never responded):
> http://archives.postgresql.org/pgsql-hackers/2011-09/msg01195.php

Oh!  You are right, I was expecting it to try multiple characters at the
same position before truncating the string.  This change seems to have
lobotomized things rather thoroughly.  What is the rationale for that?
As an example, when dealing with a single-character string, it will fail
altogether if the next code value sorts out-of-order, so this seems to
me to be a rather large step backwards.

I think we ought to go back to the previous design of incrementing till
failure and then truncating, which puts the onus on the incrementer to
make a reasonable tradeoff of how many combinations to try per character
position.  There's a simple tweak we could make to the patch to limit
that: once we've maxed out a lower-order byte of a multibyte char,
*don't reset it to minimum*, just move on to incrementing the next
higher byte.  This preserves the old property that the maximum number of
combinations tried is bounded by 256 * string's length in bytes.
        regards, tom lane

В списке pgsql-hackers по дате отправления:

Предыдущее

От: "Mr. Aaron W. Swenson"
Дата: 29 октября 2011 г., 20:29:35
Сообщение: Re: Add socket dir to pg_config..?

Следующее

От: Hitoshi Harada
Дата: 29 октября 2011 г., 21:34:57
Сообщение: Re: pgsql_fdw, FDW for PostgreSQL server

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: [v9.2] make_greater_string() does not return a string in some cases

Предыдущее

Следующее