Re: knngist - 0.8

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: knngist - 0.8
Дата
Msg-id 22720.1290475929@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: knngist - 0.8  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: knngist - 0.8  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
I wrote:
> As far as the syntax of CREATE/ALTER OPERATOR CLASS/FAMILY is concerned,
> I like Robert's proposal (OPERATOR ... FOR ORDER or FOR SEARCH) better
> than Teodor's, though we don't need the multiple-purposes-per-entry
> aspect of it.  This is mainly because it looks more easily extensible
> if we ever do get around to having more purposes.

Although having said that ... I was poking around a bit more and noticed
how entirely bogus the current patch's matching of pathkeys is.  It pays
no attention at all to which pk_opfamily is specified, which means it
will fail given a query likeSELECT ... ORDER BY indexable_col <-> constant USING <<<
where <<< represents some sort order that has nothing to do with the
order that GIST is expecting.  The query will be considered to match the
index and then it will proceed to produce results in the wrong order.

We could perhaps kluge around that by insisting that the pathkey opclass
be the default btree opclass for the relevant datatype; but that's
fairly expensive to check for in match_pathkey_to_index, and it's
getting even further away from having a general-purpose solution.

I'm inclined to think that instead of just specifying "FOR ORDER",
the opclass definition ought to specify "FOR ORDER BY something",
where "something" could perhaps be a pre-existing btree opclass name.
Or maybe it could be a suitable sort operator, though then we'd have
to do a bit of reverse-engineering in match_pathkey_to_index, so that's
still no fun from a lookup-speed point of view.  In any case what we'd
be specifying is a sort order for the datatype that the opclass member
operator yields, not the indexed datatype (so in practice it'd usually
be float8_ops or float8 <, no doubt).

The reason I bring this up now is that it affects the decision as to
what the unique key for pg_amop ought to be.  Instead of having an
enum "purpose" column, maybe we should consider that the unique key
is (operator oid, opfamily oid, order-by-oid), where order-by-oid
is zero for a search operator and the OID of the btree opclass or sort
operator for an ordering operator.  This would be of value if we
imagined that a single opclass could support ordering by more than one
distance ordering operator; which seems a bit far-fetched but perhaps
not impossible.  On the other side of the coin it'd mean we aren't
leaving room for other sorts of operator "purposes".

On balance I'm inclined to leave the unique key as per previous proposal
(with a "purpose" column) and add the which-sort-order-is-that
information as payload columns that aren't part of the key.

Thoughts?
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: "David E. Wheeler"
Дата:
Сообщение: Re: s/LABEL/VALUE/ for ENUMs
Следующее
От: Josh Berkus
Дата:
Сообщение: Re: s/LABEL/VALUE/ for ENUMs