Re: GiST seems to drop left-branch leaf tuples

From Peter Tanski
Subject Re: GiST seems to drop left-branch leaf tuples
Date
Msg-id 218BEF96-3524-41EB-A15C-67CA3DAD4B58@raditaz.com
In reply to GiST seems to drop left-branch leaf tuples  (Peter Tanski <ptanski@raditaz.com>)
Responses Re: GiST seems to drop left-branch leaf tuples  (Oleg Bartunov <oleg@sai.msu.su>)
List pgsql-hackers
I found another off-by-one error in my Picksplit() algorithm and the GiST index contains one leaf tuple for each row in
the table now.  The error was to start from 1 instead of 0 when assigning the entries.  Thanks to everyone for your
help.
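
The two off-by-one errors discussed in this thread can be illustrated with a minimal sketch.  It is not the actual
Picksplit() code: ToyEntryVector and collect_entries are hypothetical names, the integer keys stand in for the real
fingerprint keys, and the offset macros are local copies of PostgreSQL's definitions so that the example compiles on
its own.  It assumes the convention described in the quoted mail below: an entry vector with n slots holds n - 1
entries at positions 1..n-1, while the output array is filled starting from 0.

```c
#include <assert.h>

/* Local stand-ins for PostgreSQL's offset definitions (storage/off.h),
 * copied here so the sketch compiles without the server headers. */
typedef unsigned short OffsetNumber;
#define InvalidOffsetNumber ((OffsetNumber) 0)
#define FirstOffsetNumber   ((OffsetNumber) 1)
#define OffsetNumberNext(o) ((OffsetNumber) (1 + (o)))

/* Hypothetical, simplified stand-in for GistEntryVector: n counts the
 * unused slot 0 as well, so there are n - 1 valid entries. */
typedef struct
{
    int n;
    int key[8];
} ToyEntryVector;

/* Copy every valid entry into a zero-based array and return the count. */
int
collect_entries(const ToyEntryVector *entryvec, int *out)
{
    OffsetNumber maxoff;
    OffsetNumber i;
    int          j = 0;     /* output index starts at 0, not 1 */

    assert(entryvec->n >= 1);
    maxoff = (OffsetNumber) (entryvec->n - 1);

    /* Input offsets are 1-based, so the bound is i <= maxoff. */
    for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
        out[j++] = entryvec->key[i];

    return j;
}
```

With n = 5 the loop visits positions 1 through 4 and fills out[0] through out[3].  Using i < maxoff would skip the last
entry, and starting j at 1 would leave out[0] unassigned.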

For the record, this is the only GiST index I know of where the keys are over 2000 bytes in size.  So GiST definitely
handles large keys.  Perhaps the maximum size for intarray could be increased.

On Nov 23, 2010, at 4:01 PM, Yeb Havinga wrote:

> On 2010-11-23 20:54, Peter Tanski wrote:
>> On Nov 23, 2010, at 1:37 PM, Yeb Havinga wrote:
>>>>>> j = 0;
>>>>>> for (i = FirstOffsetNumber; i < maxoff; i = OffsetNumberNext(i)) {
>>>>>>   FPrint* v = deserialize_fprint(entv[i].key);
>>>>> Isn't this off by one?  Offset numbers are 1-based, so the maxoff
>>>>> computation is wrong.
>>> The first for loop, unlike all the others, which compare with i <= maxoff, uses i < maxoff.
>> You are right: I am missing the last one, there.  (During a memory-debugging phase entv[entryvec->n - 1] was always
>> invalid, probably as a memory overwrite error, but I fixed that later and never changed it back.)
>>
>> On the other hand, there are two problems:
>>
>> 1. the maximum size on a GiST page is 4240 bytes, so I cannot add a full-size Datum using this kind of hash-key
>> setup (the base Datum size is 4230 bytes on a 64-bit machine).  The example test cases I used were smaller in order to
>> get around that issue: they are 2326 bytes base size.
>>
>> 2. Even after fixing the Picksplit() loop, the dropped-leaf problem still manifests itself:
> I noticed an n_entries initialization in one of your earlier mails that might also be a source of trouble. I was under
> the impression that gistentryvectors have n-1 entries (not n-2 as you say), because the first element (0 /
> InvalidOffsetNumber) must be skipped. E.g. entryvec->n = 5. This means that there are 4 entries, which are in array
> positions 1, 2, 3, 4.
>
> btw: interesting topic, audio fingerprinting!
>
> regards,
> Yeb Havinga
>
