Re: integration of fulltext search in bytea/docs

From Sam Mason
Subject Re: integration of fulltext search in bytea/docs
Date
Msg-id 20090729152345.GF5407@samason.me.uk
In response to integration of fulltext search in bytea/docs  (Radek Novotný <radek.novotny@mediawork.cz>)
List pgsql-general
On Wed, Jul 29, 2009 at 04:46:43PM +0200, Radek Novotný wrote:
> is there in the roadmap of postgre integration of fulltext searching in
> documents saved in blobs (bytea)?

Do you mean bytea or large-objects?

> Would be very very nice (postgre users can be proud to be first) to save
> documents into bytea and search that field via to_tsvector, to_tsquery ...

This seems easy for large objects: use lo_export() to dump the blob out
to the server's filesystem, then use something like PL/Perl to run
antiword on it, save the result to another file, and return that file
line by line as a SETOF text (I think this is the best way of handling
things in case the resulting text file is enormous).  If this code were
called "runfilter", we could use it like:

  UPDATE myfiles f SET tsidx = (
    SELECT ts_accum(to_tsvector(t))
    FROM runfilter(f.loid) t);

Where we've defined ts_accum to be:

  CREATE AGGREGATE ts_accum (tsvector) (
    SFUNC = tsvector_concat,
    STYPE = tsvector,
    INITCOND = ''
  );
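
A minimal sketch of what "runfilter" might look like, for illustration
only.  It assumes the untrusted plperlu language is installed, antiword
is on the server's PATH, /tmp is writable by the server process, and the
caller is allowed to use lo_export().  The temporary path is
hypothetical, and the sketch reads antiword's output through a pipe
rather than saving it to a second file first:

```sql
-- Hypothetical sketch, not tested against any particular installation.
-- The body uses the $perl$ tag because a plain $$ would clash with Perl.
CREATE OR REPLACE FUNCTION runfilter(loid oid) RETURNS SETOF text AS $perl$
    my ($loid) = @_;
    die "invalid large object OID\n" unless $loid =~ /^\d+$/;

    # Assumed scratch location; choose a directory the server may write to.
    my $path = "/tmp/runfilter_$loid.doc";

    # Dump the large object to the server's filesystem.
    spi_exec_query("SELECT lo_export($loid, '$path')");

    # Run antiword and hand its output back one line at a time, so the
    # whole text never has to be held in memory at once.
    open(my $fh, '-|', 'antiword', $path)
        or die "cannot run antiword: $!\n";
    while (my $line = <$fh>) {
        chomp $line;
        return_next($line);
    }
    close($fh);
    unlink($path);
    return undef;
$perl$ LANGUAGE plperlu;
```

Returning the text with return_next keeps each to_tsvector() call small;
ts_accum then concatenates the per-line vectors into one.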

bytea is different because you know when the value has changed (i.e.
you can write a trigger), but you need to write more code to get the
bytea value out into the filesystem.
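
The trigger half of that could be sketched roughly as follows.  The
column names "doc" (bytea) and "tsidx" (tsvector) and the function
runfilter_bytea(bytea) RETURNS SETOF text are assumptions made for the
example; the latter stands in for the extra code that writes the bytea
value out and filters it, which is not shown:

```sql
-- Hypothetical sketch: recompute the index only when the document changes.
CREATE OR REPLACE FUNCTION myfiles_update_tsidx() RETURNS trigger AS $body$
DECLARE
    changed boolean := true;   -- an INSERT always needs indexing
BEGIN
    -- OLD is only touched for UPDATEs, where it is guaranteed to be set.
    IF TG_OP = 'UPDATE' THEN
        changed := NEW.doc IS DISTINCT FROM OLD.doc;
    END IF;

    IF changed THEN
        SELECT ts_accum(to_tsvector(t)) INTO NEW.tsidx
        FROM runfilter_bytea(NEW.doc) t;   -- assumed helper, not defined here
    END IF;

    RETURN NEW;
END;
$body$ LANGUAGE plpgsql;

CREATE TRIGGER myfiles_tsidx
    BEFORE INSERT OR UPDATE ON myfiles
    FOR EACH ROW EXECUTE PROCEDURE myfiles_update_tsidx();
```

Because this is a BEFORE row trigger, assigning to NEW.tsidx is enough
and no separate UPDATE is needed.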

--
  Sam  http://samason.me.uk/
