Re: pg_migrator and an 8.3-compatible tsvector data type

From: Bruce Momjian
Subject: Re: pg_migrator and an 8.3-compatible tsvector data type
Date:
Msg-id: 200905291816.n4TIGJM23816@momjian.us
In response to: Re: pg_migrator and an 8.3-compatible tsvector data type  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: pg_migrator and an 8.3-compatible tsvector data type  (Josh Berkus <josh@agliodbs.com>)
List: pgsql-hackers

Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> > Bruce,
> >> The ordering of the lexemes was changed:
> 
> > What does that get us in terms of performance etc.?
> 
> It was changed to support partial-match tsvector queries.  Without it,
> a partial match query would have to scan entire tsvectors instead
> of applying binary search.  I don't know if Oleg and Teodor did any
> actual performance tests on the size of the hit, but it seems like
> it could be pretty awful for large documents.

I started thinking about the performance issues of the tsvector changes.
Teodor gave me this conversion code, which essentially does:

    qsort_arg((void *) ARRPTR(t), t->size, sizeof(WordEntry), cmpLexeme, (void *) t);

So every cast requires a sort, which would perform poorly for a large
document, and because we are not storing the sorted result, the sort is
repeated on every access;  this might be an unacceptable performance
burden.
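
To make the trade-off concrete, here is a minimal standalone sketch, not
the PostgreSQL code.  It assumes a tsvector can be modeled as a plain
array of C strings compared with strcmp();  the names cmp_lexeme,
sort_lexemes, prefix_match_scan and prefix_match_bsearch are hypothetical,
and standard qsort() stands in for qsort_arg().  It shows why a prefix
match needs either a full scan or a sorted array, and that sorting is the
step that would have to be repeated if the sorted result is not stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: lexemes are NUL-terminated strings ordered by strcmp(). */
static int cmp_lexeme(const void *a, const void *b)
{
    const char *const *la = a;
    const char *const *lb = b;
    return strcmp(*la, *lb);
}

/* The per-cast cost: an O(n log n) sort of the whole lexeme array. */
static void sort_lexemes(const char **lexemes, size_t n)
{
    assert(lexemes != NULL || n == 0);
    qsort(lexemes, n, sizeof(lexemes[0]), cmp_lexeme);
}

/* Works on any ordering, but must look at every lexeme: O(n). */
static int prefix_match_scan(const char *const *lexemes, size_t n,
                             const char *prefix)
{
    size_t plen = strlen(prefix);

    for (size_t i = 0; i < n; i++)
        if (strncmp(lexemes[i], prefix, plen) == 0)
            return 1;
    return 0;
}

/*
 * Requires strcmp() order.  Finds the first lexeme >= prefix in O(log n);
 * if any lexeme starts with the prefix, that one does.
 */
static int prefix_match_bsearch(const char *const *lexemes, size_t n,
                                const char *prefix)
{
    size_t plen = strlen(prefix);
    size_t lo = 0, hi = n;

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (strcmp(lexemes[mid], prefix) < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo < n && strncmp(lexemes[lo], prefix, plen) == 0;
}
```

In this sketch, calling sort_lexemes() before every
prefix_match_bsearch() costs more than prefix_match_scan() alone, so the
binary search only pays off if the sorted array is kept.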

So, one idea would be, instead of a cast, have pg_migrator rebuild the
tsvector columns with ALTER TABLE, so then the 8.4 index code could be
used.  But then we might as well just tell the users to migrate the
tsvector tables themselves, which is how pg_migrator behaves now.

Obviously we are still trying to figure out the best way to handle data
type changes;  I think as soon as we figure out a plan for tsvector we
can use that method for future changes.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

 + If your life is a hard drive, Christ can be your backup. +