Re: GiST intarray rd-tree indexes using intbig

From: Teodor Sigaev
Subject: Re: GiST intarray rd-tree indexes using intbig
Msg-id 466A5E4F.50304@sigaev.ru
In reply to: GiST intarray rd-tree indexes using intbig  ("Jonathan Gray" <jgray@streamy.com>)
List: pgsql-hackers
> It is documented that intbig utilizes 4096 bit signatures to represent 
> the set nodes in the tree.  However, I am unable to find any reference 
> to a 4kbit signature in the code and am wondering where this is implemented.
_int.h:
/* bigint defines */
#define SIGLENINT  63           /* >122 => key will toast, so very slow!!! */
#define SIGLEN  ( sizeof(int)*SIGLENINT )
#define SIGLENBIT (SIGLEN*BITS_PER_BYTE)

63 /* ints */ * 4 /* bytes per int */ * 8 /* bits per byte */ = 2016 bits
In our experience, using a power-of-2 number of bits causes bad hashing of the array.
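
The arithmetic above, and one plausible way a power-of-2 signature length can hash badly, can be sketched in C. This is only an illustration under assumptions: the helpers hash_position and distinct_positions are hypothetical names, the modulo hash is a simplification, a 4-byte int is assumed, and the stride-based collision shown here is not necessarily the effect the authors observed.

```c
#include <stddef.h>

#define BITS_PER_BYTE 8
#define SIGLENINT     63
#define SIGLEN        (sizeof(int) * SIGLENINT)
#define SIGLENBIT     (SIGLEN * BITS_PER_BYTE)   /* 2016 when int is 4 bytes */

/* Hypothetical helper: map a value to a bit position by simple modulo. */
static unsigned int
hash_position(int val, unsigned int nbits)
{
    return ((unsigned int) val) % nbits;
}

/*
 * Hypothetical helper: count how many distinct bit positions are hit by
 * the values 0, stride, 2*stride, ..., (n-1)*stride.
 * Returns 0 if nbits is outside the range this sketch supports.
 */
static unsigned int
distinct_positions(unsigned int stride, unsigned int n, unsigned int nbits)
{
    unsigned char seen[4096] = {0};
    unsigned int  count = 0;

    if (nbits == 0 || nbits > 4096)
        return 0;

    for (unsigned int k = 0; k < n; k++)
    {
        unsigned int pos = hash_position((int) (k * stride), nbits);

        if (!seen[pos])
        {
            seen[pos] = 1;
            count++;
        }
    }
    return count;
}
```

With these definitions, 100 multiples of 1024 land on only 2 bit positions of a 2048-bit signature, but spread over 63 positions of a 2016-bit one.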

You can experiment with the value of SIGLENINT:
- a smaller value decreases the index size but increases the number of false drops, so the index returns more candidate rows, which PostgreSQL then discards after rechecking the table's value. That increases the number of table accesses.

- a larger value increases the index size but decreases the number of false drops.

So you can find the optimal SIGLENINT value for your data sets.
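
A false drop can be sketched as follows. This is a minimal illustration, not the intarray code: sig_build, sig_may_contain and array_contains are hypothetical names, the hash is assumed to be a plain modulo by SIGLENBIT, and a 4-byte int is assumed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_BYTE 8
#define SIGLENINT     63
#define SIGLEN        (sizeof(int) * SIGLENINT)
#define SIGLENBIT     (SIGLEN * BITS_PER_BYTE)

/* Hypothetical helper: bit position of a value inside the signature. */
static size_t
sig_bit(int val)
{
    return ((unsigned int) val) % SIGLENBIT;
}

/* Build a signature: set one bit per array element. */
static void
sig_build(unsigned char *sig, const int *arr, size_t n)
{
    memset(sig, 0, SIGLEN);
    for (size_t i = 0; i < n; i++)
    {
        size_t bit = sig_bit(arr[i]);

        sig[bit / BITS_PER_BYTE] |= (unsigned char) (1u << (bit % BITS_PER_BYTE));
    }
}

/* Index-side test: "maybe present" if the bit is set, "surely absent" if not. */
static bool
sig_may_contain(const unsigned char *sig, int val)
{
    size_t bit = sig_bit(val);

    return (sig[bit / BITS_PER_BYTE] >> (bit % BITS_PER_BYTE)) & 1u;
}

/* Table-side recheck: exact lookup in the original array. */
static bool
array_contains(const int *arr, size_t n, int val)
{
    for (size_t i = 0; i < n; i++)
        if (arr[i] == val)
            return true;
    return false;
}
```

For the array {1, 5, 42}, a lookup of 2017 passes the signature test because 2017 % 2016 == 1 collides with the element 1, but fails the exact recheck. That is a false drop, and a shorter signature makes such collisions more likely.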


> Also, is the leaf comparison also a signature comparison like the 
> nodes?  Or is this an exact comparison?  From my understanding of the 

What do you mean by comparison of signatures? An RD-tree doesn't use any comparison 
functions the way a B-tree does. Here we use a distance function. Distance can be 
defined in different ways, but we use the Hamming distance 
(_intbig_gist.c:hemdistsign), which is the number of differing bits between two signatures.
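
The Hamming distance between two signatures can be sketched like this. It is not the body of hemdistsign, only an illustration of the definition; the function name hamming_distance and the byte-wise bit-counting loop are assumptions.

```c
#include <stddef.h>

#define SIGLENINT 63
#define SIGLEN    (sizeof(int) * SIGLENINT)

/*
 * Count the bits that differ between two signatures of len bytes.
 * XOR leaves a 1 exactly where the inputs disagree; the inner loop
 * clears the lowest set bit on each pass, so it runs once per differing bit.
 */
static int
hamming_distance(const unsigned char *a, const unsigned char *b, size_t len)
{
    int dist = 0;

    for (size_t i = 0; i < len; i++)
    {
        unsigned char diff = (unsigned char) (a[i] ^ b[i]);

        while (diff)
        {
            diff &= (unsigned char) (diff - 1);
            dist++;
        }
    }
    return dist;
}
```

A distance of 0 means the two signatures are bit-for-bit identical; it does not prove that the underlying arrays are equal, since different values can hash to the same bit.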

> code, it doesn’t appear to be an exact comparison.  If this is the case, 
> how can I access the original intarray that is being referenced by this 
> signature?

The index doesn't store the original int[] at all. From a GiST support function there is no 
way to get access to the table's value :(.


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 

