Re: [HACKERS] Re: [GENERAL] indexed regex select optimisation missing?

From Stuart Woolford
Subject Re: [HACKERS] Re: [GENERAL] indexed regex select optimisation missing?
Date
Msg-id 99110613104200.00731@test.macmillan.co.nz
In reply to Re: [HACKERS] Re: [GENERAL] indexed regex select optimisation missing?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Firstly, damn, you guys are good; please accept my strongest compliments for the
response time on this issue!

On Sat, 06 Nov 1999, Tom Lane wrote:
> "Ross J. Reedstrom" <reedstrm@wallace.ece.rice.edu> writes:
> > Reviewing my email logs from June, most of the work on this has to do with
> > people who needs locales, and potentially multibyte character sets. Tom
> > Lane is of the opinion that this particular optimization needs to be moved
> > out of the parser, and deeper into the planner or optimizer/rewriter,
> > so a good fix may be some ways out.
>
> Actually, that part is already done: addition of the index-enabling
> comparisons is gone from the parser and is now done in the optimizer,
> which has a whole bunch of benefits (one being that the comparison
> clauses don't get added to the query unless they are actually used
> with an index!).
>
> But the underlying LOCALE problem still remains: I don't know a good
> character-set-independent method for generating a "just a little bit
> larger" string to use as the righthand limit.  If anyone out there is
> an expert on foreign and multibyte character sets, some help would
> be appreciated.  Basically, given that we know the LIKE or regex
> pattern can only match values beginning with FOO, we want to generate
> string comparisons that select out the range of values that begin with
> FOO (or, at worst, a slightly larger range).  In USASCII locale it's not
> hard: you can do
>     field >= 'FOO' AND field < 'FOP'
> but it's not immediately obvious how to make this idea work reliably
> in the presence of odd collation orders or multibyte characters...
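
For the single-byte ASCII case quoted above, the right-hand limit can be
derived by incrementing the last character of the fixed prefix. The following
is only a minimal Python sketch, assuming plain codepoint ordering; the helper
name ascii_prefix_range is hypothetical and is not PostgreSQL code. It does
not address the collation and multibyte problems described above.

```python
def ascii_prefix_range(prefix):
    """Return (lower, upper) such that lower <= s < upper selects the
    strings starting with prefix, assuming codepoint ordering."""
    if not prefix or not prefix.isascii():
        raise ValueError("sketch only handles non-empty ASCII prefixes")
    last = ord(prefix[-1])
    if last >= 0x7F:
        raise ValueError("cannot increment the last character within ASCII")
    # 'FOO' -> 'FOP': bump the final character by one codepoint.
    return prefix, prefix[:-1] + chr(last + 1)


lower, upper = ascii_prefix_range("FOO")
rows = ["FON", "FOO", "FOObar", "FOOzzz", "FOP", "FOQ"]
# Equivalent to: field >= 'FOO' AND field < 'FOP'
matches = [r for r in rows if lower <= r < upper]
```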

How about something along the lines of:

field >= 'FOO' AND field ~ '^FOO.*'

i.e., start the scan at 'FOO' and terminate it once a value fails to match the
static left-hand side followed by anything (although I have the feeling this
does not fit into your execution system), with a simple regex-type check added
to the scan validation code?
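
The idea can be sketched in Python, with a sorted list standing in for the
index. This is an illustration only: prefix_scan is a hypothetical name, and
the sketch assumes the index order keeps all values beginning with the prefix
contiguous and starting at the prefix itself. That holds for plain codepoint
ordering but is not guaranteed under arbitrary locale collations.

```python
import bisect


def prefix_scan(sorted_values, prefix):
    """Start at the first entry >= prefix and stop at the first
    entry that no longer begins with prefix."""
    start = bisect.bisect_left(sorted_values, prefix)
    found = []
    for value in sorted_values[start:]:
        if not value.startswith(prefix):
            break  # the termination check proposed above
        found.append(value)
    return found


# A sorted list plays the role of the index in this sketch.
index = sorted(["BAR", "FON", "FOO", "FOO1", "FOOabc", "FOP", "ZED"])
found = prefix_scan(index, "FOO")
```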

>
> BTW: the \377 hack is actually wrong for USASCII too, since it'll
> exclude a data value like 'FOO\377x' which should be included.

That's why I pointed out that in my particular case I only have alphabetic and
numeric data in the database, so it is safe; it's certainly no general solution.
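
The excluded value is easy to reproduce with byte strings. The sketch below
assumes the hack has the form field >= 'FOO' AND field <= 'FOO\377' (the
exact operators are an assumption) and that comparison is bytewise:

```python
rows = [b"FOO", b"FOObar", b"FOO\377", b"FOO\377x", b"FOP"]

# Assumed form of the hack: an inclusive upper bound of prefix + \377.
lower, upper = b"FOO", b"FOO\377"
hack = [r for r in rows if lower <= r <= upper]

# What a true prefix match should return.
wanted = [r for r in rows if r.startswith(b"FOO")]

# Values the hack drops even though they begin with FOO.
missed = [r for r in wanted if r not in hack]
```

With only alphanumeric data no value contains the \377 byte, so missed stays
empty and the two result sets agree.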

--
------------------------------------------------------------
Stuart Woolford, stuartw@newmail.net
Unix Consultant.
Software Developer.
Supra Club of New Zealand.
------------------------------------------------------------
