Re: Review: GiST support for UUIDs

From Teodor Sigaev
Subject Re: Review: GiST support for UUIDs
Date
Msg-id 55F72EBF.70608@sigaev.ru
In response to Re: Review: GiST support for UUIDs  (Paul Jungwirth <pj@illuminatedcomputing.com>)
Responses Re: Review: GiST support for UUIDs  (Paul Jungwirth <pj@illuminatedcomputing.com>)
List pgsql-hackers

Paul Jungwirth wrote:
>> 2)
>>      static double
>>      uuid2num(const pg_uuid_t *i)
>>      {
>>          return *((uint64 *)i);
>>      }
>>     This does not look like a correct transformation to me. Maybe it's better
>>     to transform to the numeric type (a UUID looks like a 16-digit hexadecimal
>>     number) and follow the gbt_numeric_penalty() logic (or even call it
>>     directly).
>
> Thanks for the review! A UUID is actually not stored as a string of
> hexadecimal digits. (It is normally displayed that way, but with 32
> digits, not 16.) Rather it is stored as an unstructured 128-bit value
> (which in C is 16 unsigned chars). Here is the easy-to-misread
> declaration from src/backend/utils/adt/uuid.c:
I got the number of digits wrong, but that doesn't matter for the idea.
The original code uses only 8 of the 16 bytes to compute the penalty, which
could cause a problem with index performance. A simple way is to print each
4 bits with the %02d modifier into a string and then make a numeric value
from it with the help of numeric_in.
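
A minimal sketch of the string-building step, outside the backend. The name
uuid_to_decimal_string and the UUID_DECIMAL_LEN macro are hypothetical, not
part of btree_gist, and the call to numeric_in is left out. Each nibble
(0..15) becomes two decimal digits, so the 64-digit result is not the UUID's
numeric value, but it should sort in the same order as the raw bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define UUID_LEN 16
/* 32 nibbles, two decimal digits each, plus the terminating NUL */
#define UUID_DECIMAL_LEN (UUID_LEN * 2 * 2 + 1)

/* Hypothetical helper: print every 4 bits of the UUID with %02d. */
static void
uuid_to_decimal_string(const unsigned char data[UUID_LEN],
                       char *out, size_t outlen)
{
    assert(out != NULL && outlen >= UUID_DECIMAL_LEN);

    for (int i = 0; i < UUID_LEN; i++)
    {
        /* High nibble first, so more significant digits come first. */
        snprintf(out + 4 * i, 3, "%02d", data[i] >> 4);
        snprintf(out + 4 * i + 2, 3, "%02d", data[i] & 0x0F);
    }
}
```

The resulting string contains only decimal digits, so it is the kind of
input a numeric parser would accept.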

Or something like this in pseudocode:

numeric = int8_numeric(*(uint64 *)(&i->data[0])) * 
int8_numeric(MAX_INT64) + int8_numeric(*(uint64 *)(&i->data[8]))

> The only other 128-bit type I found in btree_gist was Interval. For that
> type we convert to a double using INTERVAL_TO_SEC, then call
> penalty_num. By my read that accepts a similar loss of precision.
Right, but the precision of a double is enough to represent a one-century
interval with 0.00001-second accuracy, which is enough for practical usage.
In the UUID case you take only half of the value into account. Of course,
GiST will work even with a penalty function that returns a constant, but
then each scan could become a full index scan.

>
> If I'm mistaken about 128-bit integer support, let me know, and maybe we
> can do the penalty computation on the whole UUID. Or maybe I should just
> convert the uint64 to a double before calling penalty_num? I don't
> completely understand what the penalty calculation is all about, so I
> welcome suggestions here.

The penalty method calculates how much the union key would be enlarged if
the insert went into the current subtree. It directly affects the
selectivity of the subtree.
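
As an illustration only, here is a simplified sketch of that idea for a
one-dimensional key. The RangeKey type and range_penalty function are
hypothetical and are not the penalty_num code from btree_gist; the sketch
assumes the union key of a subtree is just a [lower, upper] pair of doubles.

```c
#include <assert.h>

/* Hypothetical union key of a subtree: the range covering all its entries. */
typedef struct
{
    double lower;
    double upper;
} RangeKey;

/* How much the union key grows if newval is inserted into this subtree. */
static double
range_penalty(RangeKey orig, double newval)
{
    assert(orig.lower <= orig.upper);

    double lower = newval < orig.lower ? newval : orig.lower;
    double upper = newval > orig.upper ? newval : orig.upper;

    /* Width after the insert minus width before it; 0 if newval fits. */
    return (upper - lower) - (orig.upper - orig.lower);
}
```

A value that already falls inside the range costs nothing, so the insert
prefers that subtree and the ranges stay narrow. If the value is derived from
only half of the UUID, keys that differ only in the other half all get the
same penalty.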

-- 
Teodor Sigaev
E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/


