Re: 8.3 full text search docs
От | Albert Cervera i Areny |
---|---|
Тема | Re: 8.3 full text search docs |
Дата | |
Msg-id | 200710140024.49931.albert@nan-tic.com обсуждение исходный текст |
Ответ на | 8.3 full text search docs (andy <andy@squeakycode.net>) |
Список | pgsql-hackers |
Andy, note that documentation is discussed in the pgsql-docs list and patches usually are submitted to the pgsql-patches list. Nice to see both new sections, by the way. A Diumenge 14 Octubre 2007, andy va escriure: > I have two doc updates I'd like to offer. I see we have two example > sections: creating rule-based dict's and creating parsers. When I was > starting I would have liked to see an example usage. > > I'd like to offer: "example usage" and "Upgrading". > This is my first draft, if anyone has suggestions I'd be interested in > hearing them. Also, I'm not sure where or who to send this to, so I > hope -hacker is ok. > > ----- Example Usage ----- > Staring a new project with Full Text Searching is easy. There is > nothing to install anymore, its all built in (in fact, don't install the > contrib module tsearch2 because it will conflict with the tsearch2 built > into the core). > > We need to add a new column of type tsvector to the table you'd like to > search. In this example we'll use a table called notes. If your table > exists use: > > alter table notes add searchvec tsvector; > > If not use: > > create table notes ( > rowid integer, > note text, > searchvec tsvector > ); > > The searchvec column is what we will use for searching, so you probably > want to create an index on it... from another place in the manual: > > (http://www.postgresql.org/docs/8.3/static/textsearch-indexes.html) > > GiST indexes are very good for dynamic data and fast if the number of > unique words (lexemes) is under 100,000, > GIN indexes are best for static data with +100,000 lexemes. > > For our example, I'll create a gist index: > > create index notesvec on notes using gist(searchvec); > > If you have existing data, we need to fill the searchvec column: > > update notes set searchvec = to_tsvector(note); > > After the update, any rows are inserted or updated will not have their > searchvec column set automatically, for that we need to create a trigger: > > create trigger notevecupdate > before insert or update on notes > for each row > execute procedure > tsvector_update_trigger(searchvec, 'pg_catalog.english', note); > > > Some data: > insert into notes(rowid, note) values(1, 'this is a test'); > > insert into notes(rowid, note) > values(2, 'I do not like green eggs and ham'); > > insert into notes(rowid, note) values(3, 'the cat in the hat'); > > insert into notes(rowid, note) > values(4, 'rage against the dying of the light'); > > And now we can query it: > > select * from notes where searchvec @@ to_tsquery('light'); > > or > > select * from notes, to_tsquery('test') as q where searchvec @@ q; > > writing it this way lets you reuse the tsquery "q" like this: > > select note, ts_rank(searchvec, q) > from notes, to_tsquery('test') as q > where searchvec @@ q > order by ts_rank(searchvec, q); > ----- Example Usage ----- > > > > ----- Upgrade from prior versions ----- > > When tsearch2 was put into core, some functions and types were renamed, > among other things. A simple backup and restore will not work to > migrate your database from versions below 8.3 to 8.3. > > In general, the way to do it is backup all your data without the > tsearch2 stuff, restore just the data, then recreate the tsearch2 stuff > by hand. > > (Its easier to think of this as an upgrade from tsearch2 to tsearch3, > but without the whole renaming it to tsearch3 thing) > > To make it a little easier, there is a way using the pg_restore to > selective restore everything that is not in the old tsearch2. > > First you must use "pg_dump -Fc" to backup your existing database. > > Then we will create an item list of things we want to restore using this > perl script. It will strip out all the things that look like tsearch2, > and return (to stdout) a list of things you should restore. > > For example: > > pg_dump -Fc -h oldserver -f ubberbase.bak ubberbase > perl maketoc.pl ubberbase.bak > toc > # now restore just the items in toc > pg_restore -Fc --create -d postgres --use-list toc -f ubberbse.bak > > There is one thing that will fail, that's the trigger you used to update > the tsvector column. Its because the function tsearch2 was renamed to > tsvector_update_trigger. You'll need to recreate the trigger by hand. > > Now that the structures and data are restored, you'll need to go through > and redo the tsearch2 stuff by hand. > > After you get the database fixed up, you'll also need to update your > client programs (php, perl, etc). For the most part, just renameing > things (like rank to ts_rank) should be all that's required. > > > Oleg Bartunov has an incomplete list of items that have been renamed: > > http://www.sai.msu.su/~megera/wiki/Tsearch2_83_changes > > ----- Upgrade from prior versions ----- -- Albert Cervera i Areny http://www.NaN-tic.com
В списке pgsql-hackers по дате отправления: