GiST or GIN, I feel like I am doing something wrong

Поиск
Список
Период
Сортировка
От Aaron
Тема GiST or GIN, I feel like I am doing something wrong
Дата
Msg-id 13e03a2b0906170527n6ca10021o7541153d9657bac6@mail.gmail.com
обсуждение исходный текст
Ответы Re: GiST or GIN, I feel like I am doing something wrong  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: GiST or GIN, I feel like I am doing something wrong  (Oleg Bartunov <oleg@sai.msu.su>)
Список pgsql-general
We are testing full text searching on a small chunk of our data. We
have created an INDEX to make searching faster.  From the PostgreSQL
8.3 docs, we are running 8.3.7, it seems we should be running GIN
indexes.  The reason GIN on paper seems like the right INDEX:
* we have static data
* we have over 241071 unique words (lexemes)
* GIN index lookups are about three times faster and we are 99.9% searching

The problem is that we have been testing with both INDEX types and
GiST is killing GIN.  I believe it has to do with the size of our GiST
index.
SELECT * from relation_size where relation like '%full%';
                  relation                  |  size
--------------------------------------------+--------
 public.profile_images_fulltext_gin         | 437 MB
 public.profile_images_fulltext             | 161 MB
 public.profile_images_fulltext_gist        | 66 MB
 public.profile_images_fulltext_pif_key_key | 18 MB
(4 rows)

So my questions...
Why is the GiST index so large?
Would the large size likely effect performance?
Am I doing something fundamentally wrong?
Yes I was sure to ANALYZE public.profile_images_fulltext between all
my INDEX DROP and CREATE

More details:
owl=# \d profile_images_fulltext
             Table "public.profile_images_fulltext"
      Column      |            Type             |   Modifiers
------------------+-----------------------------+---------------
 pif_key          | bigint                      | not null
 content          | tsvector                    |
 datetime_created | timestamp without time zone | default now()
 raw              | text                        |

owl=# SELECT count (pif_key) from public.profile_images_fulltext;
 count
--------
 630699
(1 row)

owl=# SELECT count(word) FROM ts_stat('SELECT content FROM
profile_images_fulltext');
 count
--------
 241071
(1 row)

CREATE INDEX profile_images_fulltext_gin ON profile_images_fulltext
USING gin(content);
CREATE INDEX profile_images_fulltext_gist ON profile_images_fulltext
USING gist(content);



Any and all thoughts would be greatly appreciated,
Aaron Thul
http://www.chasingnuts.com
Life is complex: it has real and imaginary components.

В списке pgsql-general по дате отправления:

Предыдущее
От: Whit Armstrong
Дата:
Сообщение: how to cancel a query in progress
Следующее
От: David Fetter
Дата:
Сообщение: Re: Naming functions with reserved words