Possible solution for LIKE optimization

Поиск
Список
Период
Сортировка
От Peter Eisentraut
Тема Possible solution for LIKE optimization
Дата
Msg-id Pine.LNX.4.30.0108052253470.10846-100000@peter.localdomain
обсуждение исходный текст
Ответы Re: Possible solution for LIKE optimization  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Possible solution for LIKE optimization  (Giles Lean <giles@nemeton.com.au>)
Список pgsql-hackers
I have had an idea how the LIKE optimization problem could be solved.
The problem was that given the expression
   A LIKE (P || '%')

where A is a column reference and P is a constant not containing
wildcards, we wanted to find X and Y such that
   X <= A and A <= Y

where X and Y are calculated from P and the bound is usefully tight.

Note that X <= A is really strcoll(X, A) <= 0 and strcoll(X, A) is really
strcmp(strxfrm(X), strxfrm(A)) (the prototype of strxfrm() is different in
C, of course).

Let $<=$ be a non-locale based comparison (i.e., plain strcmp()).

The we can say that
   strxfrm(P) $<=$ strxfrm(A) and   strxfrm(A) $<=$ (strxfrm(P) + 1)

where "+ 1" means adding 1 to the last character and accounting for
overflow (like considering the string a base-256 number).  Basically, this
is the funny-collation-safe equivalent to A LIKE 'foo%' yielding 'foo' <=
A <= 'fop'.

We'd need to implement the strxfrm() function in SQL and the $<=$
operator, both of which are trivial.  The index would have to be in terms
of strxfrm().  There might be other issues, but they could be solved
algorithmically, I suppose.

What do you think?

-- 
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Hannu Krosing
Дата:
Сообщение: Re: Re: Name for new VACUUM
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Possible solution for LIKE optimization