Re: Shrinking TSvectors
From | Howard News |
---|---|
Subject | Re: Shrinking TSvectors |
Date | |
Msg-id | 5703D051.2020404@selestial.com |
In response to | Re: Shrinking TSvectors (Artur Zakirov <a.zakirov@postgrespro.ru>) |
List | pgsql-general |
On 05/04/2016 15:15, Artur Zakirov wrote:
> On 05.04.2016 14:37, Howard News wrote:
>> Hi,
>>
>> does anyone have any pointers for shrinking tsvectors?
>>
>> I have looked at the contents of some of these fields and they contain
>> many details that are not needed. For example...
>>
>> "'+1':935,942 '-0500':72 '-0578':932 '-0667':938 '-266':937 '-873':944
>> '-9972':945 '/partners/application.html':222
>> '/partners/program/program-agreement.pdf':271
>> '/partners/reseller.html':181,1073 '01756':50,1083 '07767':54,1087
>> '1':753,771 '12':366 '14':66 (...)"
>>
>> I am not interested in keeping the numbers or urls in the indexes.
>>
>> Thanks,
>>
>> Howard.
>>
>
> Hello,
>
> You need to create a new text search configuration. Here is an example of
> the commands:
>
> CREATE TEXT SEARCH CONFIGURATION public.english_cfg (
>     PARSER = default
> );
> ALTER TEXT SEARCH CONFIGURATION public.english_cfg
>     ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
>                       word, hword, hword_part
>     WITH pg_catalog.english_stem;
>
> Instead of "pg_catalog.english_stem" you can use your own dictionary.
>
> Let's compare the new configuration with the built-in configuration
> "pg_catalog.english":
>
> postgres=# select to_tsvector('english_cfg', 'home -9972
> /partners/application.html /partners/program/program-agreement.pdf');
>  to_tsvector
> -------------
>  'home':1
> (1 row)
>
> postgres=# select to_tsvector('english', 'home -9972
> /partners/application.html /partners/program/program-agreement.pdf');
>                                           to_tsvector
> -----------------------------------------------------------------------------------------------
>  '-9972':2 '/partners/application.html':3
> '/partners/program/program-agreement.pdf':4 'home':1
> (1 row)
>
>
> You can get some additional information about configurations using \dF+:
>
> postgres=# \dF+ english
> Text search configuration "pg_catalog.english"
> Parser: "pg_catalog.default"
>       Token      | Dictionaries
> -----------------+--------------
>  asciihword      | english_stem
>  asciiword       | english_stem
>  email           | simple
>  file            | simple
>  float           | simple
>  host            | simple
>  hword           | english_stem
>  hword_asciipart | english_stem
>  hword_numpart   | simple
>  hword_part      | english_stem
>  int             | simple
>  numhword        | simple
>  numword         | simple
>  sfloat          | simple
>  uint            | simple
>  url             | simple
>  url_path        | simple
>  version         | simple
>  word            | english_stem
>
> postgres=# \dF+ english_cfg
> Text search configuration "public.english_cfg"
> Parser: "pg_catalog.default"
>       Token      | Dictionaries
> -----------------+--------------
>  asciihword      | english_stem
>  asciiword       | english_stem
>  hword           | english_stem
>  hword_asciipart | english_stem
>  hword_part      | english_stem
>  word            | english_stem
>

Thanks Artur,

That's amazing! Postgres never ceases to amaze me. And the same goes for
the contributors to this list.
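
For anyone applying this to existing data, here is a minimal sketch, assuming the public.english_cfg configuration from Artur's reply has already been created. ts_debug shows how the default parser classifies each token and which tokens produce no lexemes under the new configuration. The table "docs" and its columns "body" (text) and "tsv" (tsvector) are hypothetical names used only for illustration; they do not come from the thread.

```sql
-- Show the token type the parser assigns to each token and the lexemes
-- english_cfg produces. Token types with no mapping yield NULL lexemes,
-- so those tokens never reach the tsvector.
SELECT alias, token, lexemes
FROM ts_debug('public.english_cfg',
              'home -9972 /partners/application.html');

-- Hypothetical table "docs": rebuild the stored vectors with the
-- slimmer configuration.
UPDATE docs
SET tsv = to_tsvector('public.english_cfg', body);

-- Rough check of the effect: average stored size of the column in bytes.
SELECT avg(pg_column_size(tsv)) AS avg_tsv_bytes
FROM docs;
```

Queries against the rebuilt column should pass the same configuration to to_tsquery (or plainto_tsquery) so that search terms are normalized the same way as the stored lexemes.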