why is gist index taking so much space on the disc

From: Grzegorz Jaskiewicz
Subject: why is gist index taking so much space on the disc
Msg-id EAAFDF94-A127-464F-80DA-6B6959F5130E@pointblue.com.pl
Replies: Re: why is gist index taking so much space on the disc  (Teodor Sigaev <teodor@sigaev.ru>)
Re: why is gist index taking so much space on the disc  (Martijn van Oosterhout <kleptog@svana.org>)
List: pgsql-hackers
Hi folks

My struggles with a GiST index for a custom type are nearly finished.
It is working as it is now, but there are a few problems here and there.
One of them is the amount of disc space the index itself takes. The type
structure itself takes 160 bytes. Adding 100,000 rows into the table -
CREATE TABLE blah (a serial, b customType);
with my GiST index takes around 2GB on disc! 100,000 is a large
number, but the purpose of having GiST in the first place is defeated
unless the machine can handle fast I/O or has at least 3GB of RAM:
first to hold the index in cache, and second for Postgres's own
caching (shared memory).
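
To put those numbers side by side, here is a rough back-of-the-envelope
calculation. It is only a sketch: it assumes "2GB" means 2 GiB and that
the server uses the default 8 kB page size, neither of which is stated
above, and the variable names are made up for illustration.

```python
# Figures taken from the message above.
TYPE_SIZE_BYTES = 160          # size of one customType value
ROW_COUNT = 100_000            # rows inserted into the table

# Assumptions, not stated in the message:
INDEX_SIZE_BYTES = 2 * 1024**3  # "around 2GB" read as 2 GiB
PAGE_SIZE_BYTES = 8192          # default PostgreSQL page size

# Space the indexed values would need if stored once, with no overhead.
raw_key_bytes = TYPE_SIZE_BYTES * ROW_COUNT

# How many times larger the index is than the raw values.
blowup = INDEX_SIZE_BYTES / raw_key_bytes

# Index space consumed per indexed row, in bytes and in pages.
index_bytes_per_row = INDEX_SIZE_BYTES / ROW_COUNT
pages_per_row = index_bytes_per_row / PAGE_SIZE_BYTES

print(f"raw values:    {raw_key_bytes / 1024**2:.1f} MiB")
print(f"index / raw:   {blowup:.0f}x")
print(f"index per row: {index_bytes_per_row / 1024:.1f} KiB "
      f"({pages_per_row:.2f} pages)")
```

Under these assumptions the raw values come to roughly 15 MiB, so the
index is over a hundred times larger than the data it covers, and each
row accounts for more than two whole pages of index.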
Is it normal that the index is so huge? That is despite my type having
built-in masks (an element that can match a few different values) and
a "%." at the front of the string (which behaves just like the SQL %
in b ~ '%.something'). Both are used to build "unions" for pick-split
and other operations. Is it because of pick-split itself? It does good
work in splitting a table of elements into two separate ones: it sorts
them first, then creates a common "mask" for L and for P, and then
scans the whole table again, putting matching elements into L or P.
L and P sometimes overlap, but so far I can't find a better solution.
Having to iterate 10 or 20 times using k-means (the type holds a
tree-like structure) isn't going to boost efficiency either.
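
The split described above can be sketched roughly as follows. This is
a toy illustration only, not the actual C pick-split function: the key
representation (a fixed-length tuple of labels), the "*" wildcard, the
half-and-half seeding, and the rule that sends overlapping keys to the
smaller side are all assumptions made up for the example.

```python
from typing import List, Tuple

Key = Tuple[str, ...]  # hypothetical key: fixed-length tuple of labels
WILDCARD = "*"         # hypothetical mask element matching any label


def union_mask(keys: List[Key]) -> Key:
    """Narrowest mask covering all keys: a position keeps its label only
    if every key agrees on it, otherwise it becomes a wildcard."""
    first = keys[0]
    return tuple(
        label if all(k[i] == label for k in keys) else WILDCARD
        for i, label in enumerate(first)
    )


def matches(mask: Key, key: Key) -> bool:
    """True if the key agrees with the mask at every non-wildcard position."""
    return all(m == WILDCARD or m == k for m, k in zip(mask, key))


def pick_split(keys: List[Key]) -> Tuple[List[Key], List[Key], Key, Key]:
    """Sort the keys, build a mask for each half, then rescan and assign.

    Expects at least two keys so that both halves are non-empty.
    """
    ordered = sorted(keys)
    half = len(ordered) // 2
    mask_l = union_mask(ordered[:half])
    mask_p = union_mask(ordered[half:])

    left: List[Key] = []
    right: List[Key] = []
    for key in ordered:
        in_l = matches(mask_l, key)
        in_p = matches(mask_p, key)
        if in_l and in_p:
            # Overlap: both masks cover this key, so send it to the
            # currently smaller side (an assumed tie-break rule).
            (left if len(left) <= len(right) else right).append(key)
        elif in_l:
            left.append(key)
        else:
            # Every key came from one of the two halves, so it matches P here.
            right.append(key)
    return left, right, mask_l, mask_p


keys = [
    ("com", "example", "www"),
    ("org", "pg", "lists"),
    ("com", "example", "mail"),
    ("org", "pg", "www"),
]
left, right, mask_l, mask_p = pick_split(keys)
print(mask_l, left)
print(mask_p, right)
```

Note that the more the keys in one half differ, the more positions of
its mask turn into wildcards, and the more likely the L and P masks
are to cover the same keys.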
This index works, and it is very fast, but still large.

So, the final question: what should I do to make that index much
smaller on disc?

-- 
GJ

"If we knew what we were doing, it wouldn't be called Research, would  
it?" - AE




