Re: How to enable partial matching on a GIN index

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: How to enable partial matching on a GIN index
Дата
Msg-id 2444.1452544417@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: How to enable partial matching on a GIN index  (Chris Spencer <chrisspen@gmail.com>)
Ответы Re: How to enable partial matching on a GIN index  (Chris Spencer <chrisspen@gmail.com>)
Список pgsql-novice
Chris Spencer <chrisspen@gmail.com> writes:
>> What's your grounds for claiming that?

> It returns no results whenever I use partial search terms.

Oh; that's got nothing to do with whether an index is used or not.
The index just makes it faster.

> If I search for,
> say, "hospital" it returns results containing the exact word "hospital",
> but if I search for "hosp" it returns nothing. The doc page explaining that
> PG "can" do partial matches, and not that it "does" do partial matches, led
> me to believe this is the expected default behaviour. Is this not the case?

There are a few problems with your example ...

> ALTER TABLE mytable ADD COLUMN search_index tsvector;
> CREATE INDEX mytable_search_index_gin ON mytable USING gin(search_index);
> INSERT INTO mytable (name, search_index) VALUES ('hospital',
> plainto_tsquery('pg_catalog.english', 'hospital'));

That couldn't have worked as written, because tsvector != tsquery.
I assume you meant to_tsvector() in the INSERT.

> SELECT * FROM mytable WHERE (search_index) @@
> (plainto_tsquery('pg_catalog.english', 'hospital')); -- returns results

Right ...

> SELECT * FROM mytable WHERE (search_index) @@
> (plainto_tsquery('pg_catalog.english', 'hosp')); -- returns nothing

This isn't right, you have to use

    to_tsquery('pg_catalog.english', 'hosp:*')

to specify a partial match.  You didn't give a partial match indicator
(":*"), and if you had, plainto_tsquery() would have stripped it off,
because it throws away all punctuation.

Another pitfall to keep in mind is that stemming may result in searches
not matching that look like they should.  For example, what's really
stored in your tsvector column for this example is

# select to_tsvector('pg_catalog.english', 'hospital');
 to_tsvector
-------------
 'hospit':1
(1 row)

and a prefix match has to match or be a prefix of that string exactly.
So you might expect to_tsquery('pg_catalog.english', 'hospita:*')
to match that entry, but it won't.  (In some cases, stemming of the query
word will hide this effect, which is why a search for 'hospital' works;
but in this example 'hospita' doesn't look enough like an English word
to trigger removal of 'a'.)

If you're confused, it always pays to look at the actual outputs of
the to_tsvector and to_tsquery functions.  There's no particular magic
after that point, it's just string matching; but the word-break and
stemmer functions can do surprising things.

If you expect to be doing mostly prefix-type searches, you might end
up deciding you want to use the "simple" text search configuration,
which I'm pretty sure does no stemming at all.

            regards, tom lane


В списке pgsql-novice по дате отправления:

Предыдущее
От: Chris Spencer
Дата:
Сообщение: Re: How to enable partial matching on a GIN index
Следующее
От: Chris Spencer
Дата:
Сообщение: Re: How to enable partial matching on a GIN index