Обсуждение: BTree index

Поиск
Список
Период
Сортировка

BTree index

От
vjanand@uwm.edu
Дата:
I am trying to find information regarding creation of B-tree index in postgres 
for variable length character data (Char/varchar type). Specifically, what 
pagination policy is used, does it use prefix BTree, or any other form of 
compression (encoding)? 


Regards,
VJ Anand




Re: BTree index

От
Alvaro Herrera
Дата:
On Wed, Nov 05, 2003 at 09:08:31AM -0600, vjanand@uwm.edu wrote:

> I am trying to find information regarding creation of B-tree index in postgres 
> for variable length character data (Char/varchar type). Specifically, what 
> pagination policy is used, does it use prefix BTree, or any other form of 
> compression (encoding)? 

I was very surprised while writing this answer:

The whole key is stored (no prefix, no pagination).  If the key is too
big, it won't fit into the index and the insertion will be rejected:

regression=# create table test (a text);
CREATE TABLE
regression=# create index test_idx on test(a);
CREATE INDEX
regression=# insert into test values ('hello world');
INSERT 17115 1
regression=# insert into test select repeat(a,10) from test;
INSERT 17116 1
regression=# insert into test select repeat(a,10) from test;
INSERT 0 2
regression=# insert into test select repeat(a,10) from test;
INSERT 0 4
regression=# insert into test select repeat(a,10) from test;
INSERT 0 8
regression=# insert into test select repeat(a,10) from test;
ERROR:  fila de índice requiere 12624 bytes, tamaño máximo es 8191
-- oops
regression=# set lc_messages to 'C';
SET
regression=# insert into test select repeat(a,10) from test;
ERROR:  index row requires 12624 bytes, maximum size is 8191

So, what size were the tuples inserted:
regression=# select max(length(a)) from test; max   
--------110000
(1 fila)

What!? 110000 bytes?  I have always had the idea that the tuples were
uncompressed, so how can 110000 bytes be stored in 8191 bytes?  After
tracking into the sourcecode, I found that in
src/backend/access/common/indextuple.c the index_formtuple routine seems
to compress the key before insertion.  In src/include/tuptoaster.h there
is a symbol for activation of this feature that is set at least on my
sources (TOAST_INDEX_HACK).

So, there you are: the compression used is the same "lousy fast" LZ
algorithm used elsewhere in the TOAST code (toast_compress_datum()).

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
FOO MANE PADME HUM