Re: WIP: index support for regexp search

From: Tomas Vondra
Subject: Re: WIP: index support for regexp search
Msg-id: 50AABA7B.8060302@fuzzy.cz
In reply to: Re: WIP: index support for regexp search  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses: Re: WIP: index support for regexp search  (Alexander Korotkov <aekorotkov@gmail.com>)
Re: WIP: index support for regexp search  (Alexander Korotkov <aekorotkov@gmail.com>)
List: pgsql-hackers
On 19.11.2012 22:58, Alexander Korotkov wrote:
> Hi!
>
> New version of patch is attached. Changes are following:
> 1) Right way to convert from pg_wchar to multibyte.
> 2) Optimization of producing CFNA-like graph on trigrams (produce
> smaller, but equivalent, graphs in less time).
> 3) Comments and refactoring.

Hi,

thanks for the updated patch. I've done the initial review:

1) Patch applies fine to the master.

2) It's common to use upper-case names for macros, but trgm.h defines a
macro "iswordchr" - I see it's moved from trgm_op.c, but maybe we could
rename it to follow the convention?

3) I see there are two "#ifdef KEEPONLYALNUM" blocks right next to each
other in trgm.h - maybe it'd be a good idea to join them?
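
To illustrate points 2 and 3 together, here is a rough sketch of what a single joined block with upper-case macro names might look like. It is not the actual trgm.h: the names ISWORDCHR and ISPRINTABLECHAR, the macro bodies (plain <ctype.h> calls instead of the pg_trgm helpers) and the word_chars() helper are all assumptions made up for the example.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Sketch only: one joined KEEPONLYALNUM block with upper-case macro names.
 * Macro names and bodies are illustrative, not copied from trgm.h.
 */
#define KEEPONLYALNUM

#ifdef KEEPONLYALNUM
/* only letters and digits count as word characters */
#define ISWORDCHR(c)       (isalnum((unsigned char) (c)))
#define ISPRINTABLECHAR(c) (isalnum((unsigned char) (c)) || (c) == ' ')
#else
/* everything except whitespace counts as a word character */
#define ISWORDCHR(c)       (!isspace((unsigned char) (c)))
#define ISPRINTABLECHAR(c) (isprint((unsigned char) (c)))
#endif

/* Hypothetical helper: count the word characters in a C string. */
static size_t
word_chars(const char *s)
{
	size_t		n = 0;

	for (; *s != '\0'; s++)
		if (ISWORDCHR(*s))
			n++;
	return n;
}
```

With both macros defined under one #ifdef / #else pair, the two variants sit side by side and cannot drift apart as easily as in two separate blocks.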

4) The two new function prototypes at the end of trgm.h use different
indentation than the rest (spaces only instead of tabs).

5) There are no regression tests / updated docs (yet).

6) It does not compile - I do get a bunch of errors like this

trgm_regexp.c:73:2: error: expected specifier-qualifier-list before ‘TrgmStateKey’
trgm_regexp.c: In function ‘addKeys’:
trgm_regexp.c:291:24: error: ‘TrgmState’ has no member named ‘keys’
trgm_regexp.c:304:10: error: ‘TrgmState’ has no member named ‘keys’
...

It seems this is caused by the order of typedefs in trgm_regexp.c, namely
TrgmState referencing TrgmStateKey before it's defined. Moving the
TrgmStateKey definition before TrgmState fixed the issue (I'm using
gcc-4.5.4 but I think it's not compiler-dependent).
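
The ordering problem can be shown with a minimal sketch. Only the type names TrgmStateKey and TrgmState come from the error messages above; the member lists and the add_key() helper are invented for illustration and do not reflect the actual layout in trgm_regexp.c.

```c
#include <stddef.h>

/*
 * Sketch only: the member lists are invented for illustration, only the
 * type names TrgmStateKey and TrgmState come from the error messages.
 */

/* TrgmStateKey must be complete before it is used as a member type. */
typedef struct
{
	int			prefix;		/* hypothetical field */
	int			nstate;		/* hypothetical field */
} TrgmStateKey;

/* Now TrgmState can embed a TrgmStateKey by value. */
typedef struct
{
	TrgmStateKey keys[4];	/* hypothetical fixed-size array */
	size_t		nkeys;
} TrgmState;

/* Hypothetical helper: append a key, returning 0 if the array is full. */
static int
add_key(TrgmState *state, TrgmStateKey key)
{
	if (state->nkeys >= sizeof(state->keys) / sizeof(state->keys[0]))
		return 0;
	state->keys[state->nkeys++] = key;
	return 1;
}
```

Swapping the two typedefs in this sketch should reproduce the same class of error: the compiler does not know the name TrgmStateKey yet, rejects the member declaration, and then reports the affected members as missing. If the member were only a pointer, a forward declaration of a tagged struct would be enough, but a by-value member needs the complete type first.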

7) Once fixed, it seems to work

CREATE EXTENSION pg_trgm;
CREATE TABLE test (val TEXT);
INSERT INTO test SELECT md5(i::text) FROM generate_series(1,1000000) s(i);
CREATE INDEX trgm_idx ON test USING gin (val gin_trgm_ops);
ANALYZE test;

EXPLAIN SELECT * FROM test WHERE val ~ '.*qqq.*';

                              QUERY PLAN
---------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=16.77..385.16 rows=100 width=33)
   Recheck Cond: (val ~ '.*qqq.*'::text)
   ->  Bitmap Index Scan on trgm_idx  (cost=0.00..16.75 rows=100 width=0)
         Index Cond: (val ~ '.*qqq.*'::text)
(4 rows)

but I do get a bunch of NOTICE messages with debugging info (no matter
if the GIN index is used or not, so it's somewhere in the common regexp
code). But I guess that's due to WIP status.

regards
Tomas


