Re: [v9.2] make_greater_string() does not return a string in some cases

From Robert Haas
Subject Re: [v9.2] make_greater_string() does not return a string in some cases
Date
Msg-id CA+Tgmoax-SHNgHe77cJZGsqgsB+Z=n_jzQZ5h0RG1+NcWGHkBg@mail.gmail.com
In response to Re: [v9.2] make_greater_string() does not return a string in some cases  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [v9.2] make_greater_string() does not return a string in some cases
Re: [v9.2] make_greater_string() does not return a string in some cases
List pgsql-hackers
On Thu, Sep 22, 2011 at 12:24 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I'm a bit perplexed as to why we can't find a non-stochastic way of doing this.
>
> [ collations suck ]

Ugh.

> Now, having said that, I'm starting to wonder again why it's worth our
> trouble to fool with encoding-specific incrementers.  The exactness of
> the estimates seems unlikely to be improved very much by doing this.

Well, so the problem is that the frequency with which the algorithm
fails altogether seems to be disturbingly high for certain kinds of
characters.  I agree it might not be that important to get the
absolutely best next string, but it does seem important not to fail
outright.  Kyotaro Horiguchi gives the example of UTF-8 characters
ending with 0xbf.
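
To make the 0xbf case concrete, here is a rough sketch in Python (not
the actual make_greater_string() code; both function names are made up
for illustration, and the input is assumed to be non-empty).  It
compares bumping only the last byte against bumping the last code
point.  Neither version knows anything about collations; "greater" here
only means greater in code-point order.

```python
def bump_last_byte(s: str):
    """Naive approach: increment only the final byte, retrying until the
    bytes decode as valid UTF-8.  Returns None if no value works."""
    raw = bytearray(s.encode("utf-8"))
    while raw[-1] < 0xFF:
        raw[-1] += 1
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not a valid UTF-8 sequence; try the next byte value.
            continue
    return None


def bump_last_code_point(s: str):
    """Encoding-aware approach: increment the final code point instead
    of the final byte.  Returns None past the end of Unicode."""
    cp = ord(s[-1]) + 1
    if 0xD800 <= cp <= 0xDFFF:
        # Surrogates cannot be encoded in UTF-8; skip over them.
        cp = 0xE000
    if cp > 0x10FFFF:
        return None
    return s[:-1] + chr(cp)
```

U+00FF is encoded as the bytes 0xc3 0xbf.  Any last byte above 0xbf is
not a valid continuation byte, so bump_last_byte("a\u00ff") runs out of
candidates and gives up, while bump_last_code_point("a\u00ff") simply
moves on to U+0100.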

>>>> One random idea I have is - instead of generating > and < clauses,
>>>> could we define a "prefix match" operator - i.e. a ### b iff substr(a,
>>>> 1, length(b)) = b?  We'd need to do something about the selectivity,
>>>> but I don't see why that would be a problem.
>
>>> The problem is that you'd need to make that a btree-indexable operator.
>
>> Well, right.  Without that, there's not much point.  But do you think
>> that's prohibitively difficult?
>
> The problem is that you'd just be shifting all these same issues into
> the btree index machinery, which is not any better equipped to cope with
> them, and would not be a good place to be adding overhead.

My thought was that it would avoid the need to do any character
incrementing at all.  You could just start scanning forward as if the
operator were >= and then stop when you hit the first string that
doesn't have the same initial substring.
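
Roughly what I have in mind, as a toy sketch in Python with a sorted
list standing in for the btree.  The names prefix_match and prefix_scan
are invented for this example, and it assumes plain code-point ordering,
where all strings sharing a prefix sit next to each other; under a
locale-specific collation that assumption may not hold.

```python
from bisect import bisect_left


def prefix_match(a: str, b: str) -> bool:
    """The proposed operator: a ### b iff substr(a, 1, length(b)) = b."""
    return a[:len(b)] == b


def prefix_scan(sorted_keys, prefix):
    """Start where a '>= prefix' scan would start, then walk forward and
    stop at the first key that does not begin with the prefix."""
    start = bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not prefix_match(key, prefix):
            break
        matches.append(key)
    return matches
```

With keys = sorted(["ab", "abc", "abd", "ac", "a\u00ff", "a\u00ffz", "b"]),
prefix_scan(keys, "a\u00ff") finds both keys starting with "a\u00ff"
without ever computing a "next greater" string for the upper bound.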

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

