Re: WIP: index support for regexp search

From: Alexander Korotkov
Subject: Re: WIP: index support for regexp search
Msg-id: CAPpHfdtkPgtDANjAXMnyAycpsahgGedQZ7VU+KfW6Y_5Jx1O=g@mail.gmail.com
In reply to: Re: WIP: index support for regexp search  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: WIP: index support for regexp search
List: pgsql-hackers
On Wed, Jan 23, 2013 at 7:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> > On 23.01.2013 09:36, Alexander Korotkov wrote:
> >> On Wed, Jan 23, 2013 at 6:08 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> The biggest problem is that I really don't care for the idea of
> >>> contrib/pg_trgm being this cozy with the innards of regex_t.
>
> >> The only option I see now is to provide a method like "export_cnfa" which
> >> would export corresponding CNFA in fixed format.
>
> > Yeah, I think that makes sense. The transformation code in trgm_regexp.c
> > would probably be more readable too, if it didn't have to deal with the
> > regex guts representation of the CNFA. Also, once you have intermediate
> > representation of the original CNFA, you could do some of the
> > transformation work on that representation, before building the
> > "transformed graph" containing trigrams. You could eliminate any
> > non-alphanumeric characters, joining states connected by arcs with
> > non-alphanumeric characters, for example.
>
> It's not just the CNFA though; the other big API problem is with mapping
> colors back to characters.  Right now, that not only knows way too much
> about a part of the regex internals we have ambitions to change soon,
> but it also requires pg_wchar2mb_with_len() and lowerstr(), neither of
> which should be known to the regex library IMO.  So I'm not sure how we
> divvy that up sanely.  To be clear: I'm not going to insist that we have
> to have a clean API factorization before we commit this at all.  But it
> worries me if we don't even know how we could get to that, because we
> are going to need it eventually.

Now, we probably don't have enough time before 9.3 to solve the API problem :(. It's likely we have to choose between committing to 9.3 without a clean API factorization and postponing it to 9.4.
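
Purely as an illustration of the "export_cnfa" idea quoted above, a fixed-format export might look roughly like the C sketch below. Every name in it (ExportedArc, ExportedCnfa, exported_out_arcs) is hypothetical and not part of the regex library. It only shows the shape of an interface that would let contrib/pg_trgm walk the automaton without touching regex_t internals, and it leaves the color-to-character mapping problem untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flat export format; these names do not exist in the regex library. */
typedef struct ExportedArc
{
    int         from;           /* source state number */
    int         to;             /* target state number */
    int         color;          /* color labeling the arc */
} ExportedArc;

typedef struct ExportedCnfa
{
    int         nstates;        /* number of states */
    int         initial_state;  /* state where matching starts */
    int         final_state;    /* accepting state */
    int         narcs;          /* number of entries in arcs[] */
    const ExportedArc *arcs;    /* all arcs, in no particular order */
} ExportedCnfa;

/*
 * Copy the out-arcs of "state" into out[] (capacity "max").
 * Returns the total number of out-arcs of that state, which may exceed
 * "max"; the caller can then retry with a larger buffer.
 */
static int
exported_out_arcs(const ExportedCnfa *cnfa, int state,
                  ExportedArc *out, int max)
{
    int         count = 0;

    assert(cnfa != NULL && max >= 0);

    for (int i = 0; i < cnfa->narcs; i++)
    {
        if (cnfa->arcs[i].from != state)
            continue;
        if (count < max)
            out[count] = cnfa->arcs[i];
        count++;
    }
    return count;
}
```

With something of this shape, trgm_regexp.c would depend only on state numbers, arcs, and colors, so the regex library could change its internal representation without breaking the contrib module.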

------
With best regards,
Alexander Korotkov. 
