Re: knngist - 0.8

Поиск
Список
Период
Сортировка
От Martijn van Oosterhout
Тема Re: knngist - 0.8
Дата
Msg-id 20101228082252.GA3085@svana.org
обсуждение исходный текст
Ответ на Re: knngist - 0.8  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: knngist - 0.8  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Sun, Dec 26, 2010 at 08:13:40PM -0500, Tom Lane wrote:
> [ thinks for a bit... ]  One reason for having a different structure
> would be if we needed to represent abstract semantics for some operators
> that couldn't be associated with a btree opclass.  This is clearly not
> an issue for what RANGE needs, since anything you can order by will
> surely have a btree opclass for that, and in fact we probably need to
> tie those operators to specific orderings if a datatype has more than
> one sort ordering.  But maybe there could be some other situation where
> we'd need to describe operator behavior for a datatype independently of
> any sort ordering.  Can anyone come up with a plausible example?

One thing that comes to mind is the operators used for hash indexes,
namely the hash() function. It is closely related to the collation but
isn't actually used for sorting. For every btree class you can make a
hash class with the same equality operator and an appropriate hash
function.

I've had the idea of defining a parent object and deriving the btree
and hash operator classes from that, but it gets messy once you get
into cross-type operators (i.e. operator families).

With respect to the collation of strings I have thought it useful to be
able to define a sortkey() function, which would map the input space to
a 8 byte integer and satisfies the rule:

sortkey(a) < sortkey(b)  implies a < b

The idea being that you can use this as an efficient first step to
speed up sorting strings, since adding it to the sort list implicitly
before the actual column doesn't change the result. In the case of
strings the sortkey() could be generated with strxfrm().

A similar idea could be used with other expensive comparison
operations, but I can't think of any at the moment. Actually, perhaps
you could use it with ints/floats/etc as well, since you could skip the
function call overhead. You'd be trading (n log n int compares + n
sortkeys) with (n log n comparisions).

Just some thoughts,

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
>                                       - Charles de Gaulle

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Shigeru HANADA
Дата:
Сообщение: Re: SQL/MED - core functionality
Следующее
От: Heikki Linnakangas
Дата:
Сообщение: Re: SQL/MED - core functionality