Обсуждение: Questions about TSearch2 and PG 8.2
Hello! I'm in the process of upgrading our PostgreSQL 8.1 databases to PostgreSQL 8.2.4. I have stumbled over a minor issue withthe upgrade and some helpful suggestions here: http://people.planetpostgresql.org/xzilla/index.php?/archives/291-The-pain-that-is-tsearch2-8.1-8.2-upgrading.html I shall try tonight with an plain SQL dump, but as some of my DBs are quite large, I usually use the custom dump format.As I would like to move the tsearch2-stuff in ist own schema as suggested, I tried using a restore list. I'd liketo report that everything works as expected, but I've got a slight problem with the custom schema part. I created thetarget-db, created a schema tsearch2 and installed the tsearch2-functions, operators, configuration and whatnot into thisnew schema. Then I edited the restore list so that the tsearch2-bits would not be created from the dump file again. However,the binary-dump tries to create the textindex-columns with a tsvector-type which explicitly references the publicschema. Instead of CREATE TABLE someschema.article ( id integer, mytext text, idxfti tsvector ); it tries to create the table like this CREATE TABLE someschema.article ( id integer, mytext text, idxfti public.tsvector ); As the tsvector-type is defined in the tsearch2-schema, this is bound to fail, even with the search_path set to include thetsearch2-schema. I assume that this happens because the table article is not in the same schema as the original tsvector-typeand the default search_path is being ignored on the dump in order to be on the safe side. This "double-checking"breaks the migration in my case, however, so is there some way that would allow me to change the table definitionon restore from using just tsvector instead of the explicit public.tsvector? I already tried editing the binarydump, but that just resulted in a corrupted dump-file. I there's no other way, I'll go the plain dump route, of course,but I'd just like to check this issue. My second question concerns the new Gin (Generalized Inverted Index) index type. Is it stable enough for production yet andwould it yield a high enough performance gain in comparison the GiST? Does it make much sense using a Gin-index alongsidethe GiST-one? Would we need to change anything in the application code in order to make use of Gin - like using where idxfti @> to_tsquery('default_german', 'Fundstück') instead of where idxfti @@ to_tsquery('default_german', 'Fundstück') ? The docs here http://www.sai.msu.su/~megera/wiki/Gin are still a bit thin, so any hint to some further examples would begreatly appreciated. Kind regards Markus Computec Media AG Sitz der Gesellschaft und Registergericht: Fürth (HRB 8818) Vorstandsmitglieder: Johannes S. Gözalan (Vorsitzender) und Niels Herrmann Vorsitzender des Aufsichtsrates: Jürg Marquard Umsatzsteuer-Identifikationsnummer: DE 812 575 276
On Mon, 30 Apr 2007, Markus Wollny wrote: > Hello! > > I'm in the process of upgrading our PostgreSQL 8.1 databases to PostgreSQL 8.2.4. I have stumbled over a minor issue withthe upgrade and some helpful suggestions here: > http://people.planetpostgresql.org/xzilla/index.php?/archives/291-The-pain-that-is-tsearch2-8.1-8.2-upgrading.html > > I shall try tonight with an plain SQL dump, but as some of my DBs are quite large, I usually use the custom dump format.As I would like to move the tsearch2-stuff in ist own schema as suggested, I tried using a restore list. I'd liketo report that everything works as expected, but I've got a slight problem with the custom schema part. I created thetarget-db, created a schema tsearch2 and installed the tsearch2-functions, operators, configuration and whatnot into thisnew schema. Then I edited the restore list so that the tsearch2-bits would not be created from the dump file again. However,the binary-dump tries to create the textindex-columns with a tsvector-type which explicitly references the publicschema. > > Instead of > > CREATE TABLE someschema.article > ( > id integer, > mytext text, > idxfti tsvector > ); > > it tries to create the table like this > > CREATE TABLE someschema.article > ( > id integer, > mytext text, > idxfti public.tsvector > ); > > As the tsvector-type is defined in the tsearch2-schema, this is bound to fail, even with the search_path set to includethe tsearch2-schema. I assume that this happens because the table article is not in the same schema as the originaltsvector-type and the default search_path is being ignored on the dump in order to be on the safe side. This "double-checking"breaks the migration in my case, however, so is there some way that would allow me to change the table definitionon restore from using just tsvector instead of the explicit public.tsvector? I already tried editing the binarydump, but that just resulted in a corrupted dump-file. I there's no other way, I'll go the plain dump route, of course,but I'd just like to check this issue. I think you need plain sql dump. > > My second question concerns the new Gin (Generalized Inverted Index) index type. Is it stable enough for production yetand would it yield a high enough performance gain in comparison the GiST? Does it make much sense using a Gin-index alongsidethe GiST-one? Would we need to change anything in the application code in order to make use of Gin - like using > > where idxfti @> to_tsquery('default_german', 'Fundst?ck') > > instead of > > where idxfti @@ to_tsquery('default_german', 'Fundst?ck') No, use @@ operator with Gin as well, you may need @@@ operator for Gin, if you use weights in tsquery. btw, read http://www.sai.msu.su/~megera/postgres/fts/doc for information about Gin+Gist index. It's true, that the best combination is GiST for online stuff and Gin for archived. Also, if you read russian there are some papers available http://www.sai.msu.su/~megera/postgres/talks/ > > ? The docs here http://www.sai.msu.su/~megera/wiki/Gin are still a bit thin, so any hint to some further examples wouldbe greatly appreciated. See above > > Kind regards > > Markus > > > Computec Media AG > Sitz der Gesellschaft und Registergericht: F?rth (HRB 8818) > Vorstandsmitglieder: Johannes S. G?zalan (Vorsitzender) und Niels Herrmann > Vorsitzender des Aufsichtsrates: J?rg Marquard > Umsatzsteuer-Identifikationsnummer: DE 812 575 276 > > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org/ > Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
On Monday 30 April 2007 13:20, Markus Wollny wrote: > Hello! > > I'm in the process of upgrading our PostgreSQL 8.1 databases to PostgreSQL > 8.2.4. I have stumbled over a minor issue with the upgrade and some helpful > suggestions here: > http://people.planetpostgresql.org/xzilla/index.php?/archives/291-The-pain- >that-is-tsearch2-8.1-8.2-upgrading.html > > I shall try tonight with an plain SQL dump, but as some of my DBs are quite > large, I usually use the custom dump format. As I would like to move the > tsearch2-stuff in ist own schema as suggested, I tried using a restore > list. I'd like to report that everything works as expected, but I've got a > slight problem with the custom schema part. I created the target-db, > created a schema tsearch2 and installed the tsearch2-functions, operators, > configuration and whatnot into this new schema. Then I edited the restore > list so that the tsearch2-bits would not be created from the dump file > again. However, the binary-dump tries to create the textindex-columns with > a tsvector-type which explicitly references the public schema. > > Instead of > > CREATE TABLE someschema.article > ( > id integer, > mytext text, > idxfti tsvector > ); > > it tries to create the table like this > > CREATE TABLE someschema.article > ( > id integer, > mytext text, > idxfti public.tsvector > ); > > As the tsvector-type is defined in the tsearch2-schema, this is bound to > fail, even with the search_path set to include the tsearch2-schema. I > assume that this happens because the table article is not in the same > schema as the original tsvector-type and the default search_path is being > ignored on the dump in order to be on the safe side. This "double-checking" > breaks the migration in my case, however, so is there some way that would > allow me to change the table definition on restore from using just tsvector > instead of the explicit public.tsvector? I already tried editing the binary > dump, but that just resulted in a corrupted dump-file. I there's no other > way, I'll go the plain dump route, of course, but I'd just like to check > this issue. > I think you want to split it into 2 parts... do a schema only dump in plain sql to get the schema loaded, then use the custom format to pull the data over. > My second question concerns the new Gin (Generalized Inverted Index) index > type. Is it stable enough for production yet and would it yield a high > enough performance gain in comparison the GiST? Does it make much sense > using a Gin-index alongside the GiST-one? Would we need to change anything > in the application code in order to make use of Gin - like using > > where idxfti @> to_tsquery('default_german', 'Fundstück') > > instead of > > where idxfti @@ to_tsquery('default_german', 'Fundstück') > > ? The docs here http://www.sai.msu.su/~megera/wiki/Gin are still a bit > thin, so any hint to some further examples would be greatly appreciated. > Gin is certainly stable enough for production use, we're using it on some projects at OmniTI at least. I've got some gin/gist information up as well at http://people.planetpostgresql.org/xzilla/index.php?/archives/278-PostgreSQL-full-text-search-testing.html, also be sure check the trackback links... and also I think magnus had a post on gin/gist wrt the postgresql.org website. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL