Re: tsearch2 and pdf files

From Henrik Zagerholm
Subject Re: tsearch2 and pdf files
Date
Msg-id 179575E2-2F49-427F-9961-CEE966187950@mac.se
In reply to Re: tsearch2 and pdf files  ("Philip Johnson" <philip.johnson@atempo.com>)
Responses Re: tsearch2 and pdf files  ("Magnus Hagander" <mha@sollentuna.net>)
List pgsql-general
1. Convert the PDF to a text file with e.g. xpdf.
2. Insert the extracted text into a table of your choice.
3. Build tsvectors from the text.
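
The three steps above could be sketched roughly as follows. This is only a minimal illustration, not a tested recipe: the table `pdf_docs`, its columns, the index name and the helper functions are all hypothetical, and the `%s` placeholders assume a DB-API driver such as psycopg2. It relies on the `pdftotext` tool shipped with xpdf being on the PATH and on tsearch2's one-argument `to_tsvector(text)`.

```python
import subprocess

# Hypothetical table layout; the thread does not prescribe one.
SCHEMA_SQL = """
CREATE TABLE pdf_docs (
    id       serial PRIMARY KEY,
    filename text NOT NULL,
    body     text,
    body_fti tsvector
);
CREATE INDEX pdf_docs_fti_idx ON pdf_docs USING gist (body_fti);
"""

# Steps 2 and 3 in one statement: store the text and its tsvector.
# The %s placeholder style is an assumption (psycopg2-like driver).
INSERT_SQL = (
    "INSERT INTO pdf_docs (filename, body, body_fti) "
    "VALUES (%s, %s, to_tsvector(%s))"
)


def pdftotext_command(pdf_path):
    """Build the pdftotext command line; '-' sends the text to stdout."""
    return ["pdftotext", "-enc", "UTF-8", pdf_path, "-"]


def extract_text(pdf_path):
    """Step 1: run pdftotext and return the extracted text."""
    result = subprocess.run(
        pdftotext_command(pdf_path), capture_output=True, check=True
    )
    return result.stdout.decode("utf-8", errors="replace")


def index_pdf(cursor, pdf_path, extract=extract_text):
    """Extract one PDF and insert it through a DB-API cursor."""
    text = extract(pdf_path)
    cursor.execute(INSERT_SQL, (pdf_path, text, text))
    return len(text)
```

With a driver such as psycopg2 (again an assumption), this would be called as `index_pdf(conn.cursor(), "manual.pdf")` followed by `conn.commit()`. A search could then look like `SELECT filename FROM pdf_docs WHERE body_fti @@ to_tsquery('backup');`.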

Cheers,


On 11 Dec 2006, at 18:23, Philip Johnson wrote:

> Do you know what kind of table I should use?
> Is there a shell script or a PHP script that does the work?
>
> regards
>
>> -----Original Message-----
>> From: pgsql-general-owner@postgresql.org
>> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Hannes Dorbath
>> Sent: Monday, 11 December 2006 12:21
>> To: pgsql-general@postgresql.org
>> Subject: Re: [GENERAL] tsearch2 and pdf files
>>
>> You just need software that extracts the text from it. Search Google for
>> pdf2txt and others. Printer drivers that try to get text from anything
>> are available as well.
>>
>>
>> On 11.12.2006 11:41, Philip Johnson wrote:
>>> I'm using PostgreSQL 8.1.5
>>>
>>> Tsearch2 is installed and runs well
>>>
>>> I'd like to use tsearch2 to index PDF files.
>>>
>>> Does someone have a detailed process to implement that?
>>
>>
>> --
>> Regards,
>> Hannes Dorbath
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 5: don't forget to increase your free space map settings
>
>

