Re: [v9.2] make_greater_string() does not return a string in some cases

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: [v9.2] make_greater_string() does not return a string in some cases
Дата
Msg-id CA+TgmoavQP28OY0QRkXSQiS131u2sEFc7e72yU3x=pjr0BPU=w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [v9.2] make_greater_string() does not return a string in some cases  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: [v9.2] make_greater_string() does not return a string in some cases
Список pgsql-hackers
On Fri, Sep 23, 2011 at 8:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Thu, Sep 22, 2011 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Anyway, I won't stand in the way of the patch as long as it's modified
>>> to limit the number of values considered for any one character position
>>> to something reasonably small.
>
>> I think that limit in both the old and new code is 1, except that the
>> new code does it more efficiently.
>
>> Am I confused?
>
> Yes, or else I am.  Consider a 4-byte UTF8 character at the end of the
> string.  The existing code increments the last byte up to 255 (rejecting
> everything past 0xBF), then gives up and truncates that character away.
> So the maximum number of tries for that character position is between 0
> and 127 depending on what the original character was (with at most 63 of
> the incremented values getting past the verifymbstr test).
>
> The proposed patch is going to iterate through all Unicode code points
> up to U+7FFFFF before giving up.  Since it's possible that we need to
> increment something further left to succeed at all, this doesn't seem
> like a good plan.

I think you're misreading the code.  It does this:

while (len > 0)
{   boring stuff;   if (charincfunc(lastchar, charlen))   {       more boring stuff;       if (we made a greater
string)          return it;       cleanup;   }  truncate away last character; 
}

I don't see how that's ever going to try more than one character in
the same position.

What may be confusing you is that the old code has two loops: an outer
loop that tests whether we've made a greater string, and an inner loop
that tests whether we've made a validly encoded string at all.  In the
new code, at least in the UTF-8 case, the inner loop is GONE
altogether.  Instead of iterating until we construct a valid
character, we just use our mad UTF-8 skillz to assemble one, and
return it.

Or else I need to go drink a few cups of tea and look at this again.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Lou Picciano
Дата:
Сообщение: Re: Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
Следующее
От: Robert Haas
Дата:
Сообщение: Re: new createuser option for replication role