Re: DRAFT GIST support for ORDER BY

Поиск
Список
Период
Сортировка
От Michał Kłeczek
Тема Re: DRAFT GIST support for ORDER BY
Дата
Msg-id 6156BF7B-F26B-4DF3-A23F-1136F29D5BDE@kleczek.org
обсуждение исходный текст
Ответ на Re: DRAFT GIST support for ORDER BY  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Ответы Re: DRAFT GIST support for ORDER BY
Список pgsql-hackers

> On 30 Oct 2023, at 13:31, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
>
>> The solution is not ideal as it requires registering “<“ and “>” operators as ordering operators in opfamily
>> (which in turn makes it possible to issue somewhat meaningless “ORDER BY a < ‘constant’)
>
> I don't quite understand why we need to register new "<" and ">"
> operators. Can't we update the current ones?

I wasn’t precise: what is needed is adding pg_amop entries with amoppurpose = ‘o’ for existing “<" and “>" operators.

>
>> The problem is though that right now handling of ORDER BY column clauses is tightly coupled to BTree.
>> It would be good to refactor the code so that semantics of ORDER BY column could be more flexible.
>
> The existence of a BTREE operator class for the type is the indicator
> that (and how) the type can be ordered - that is where PostgreSQL gets
> its methods for ordering most types. Although I agree that it's a
> quirk, I don't mind it that much as an indicator of how a type is
> ordered.
> I do agree, though, that operator classes by themselves should be able
> to say "hey, we support full ordered retrieval as well". Right now,
> that seems to be limited to btrees, but indeed a GiST index with
> btree_gist columns should be able to support the same.

Right now opfamily and strategy are set in PathKey before creating index scan paths.

The patch actually copies existing code from create_indexscan_plan
that finds an operator OID for (pk_opfamily, pk_strategy).
The operator is supposed to be binary with specific operand types.

To create a path:
1) do the operator OID lookup as above
2) look for sortfamily of pg_amop entry for (operator did, index opfamily)
If the sort family is the same as pk_opfamily we can create a path.

The side effect is that it is possible to “ORDER BY column < ‘constant’” as we have more ordering operators in pg_amop.

Ideally we could look up _unary_ operator in pg_amop instead - that would make sense we are actually measuring some
“absolutedistance”. 
But this would require more changes - createplan.c would need to decide when to lookup unary and when - binary
operator.


>> It would be great if someone could take a look at it.
>
> I've not looked in detail at the patch, but here's some comments:
>
>> --- a/contrib/btree_gist/btree_gist--1.6--1.7.sql
>> +++ b/contrib/btree_gist/btree_gist--1.6--1.7.sql
>
> You seem to be modifying an existing migration of a released version
> of the btree_bist extension. I suggest you instead add a migration
> from 1.7 to a new version 1.8, and update the control file's default
> installed version.

Thanks. I didn’t know how to register a new migration so did it that way.
Will try to fix that.

>
>> ORDER BY a == ORDER BY a <-> MIN_VALUE
>> and
>> ORDER BY a DESC == ORDER BY a <-> MAX_VALUE
>>
>> This allows implementing GIST ordered scans for btree_gist datatypes.
>>
>> This in turn makes using GIST with partitioning feasible (I have described issues with such usage in my previous
e-mails- see below). 
>
> Did you take into account that GiST's internal distance function uses
> floating point, and is thus only an approximation for values that
> require more than 2^54 significant bits in their distance function?
> For example, GiST wouldn't be guaranteed to yield correct ordering of
> int8/bigint when you use `my_column <-> UINT64_MAX` because as far as
> the floating point math is concerned, 0 is about as far away from
> INT64_MAX as (say) 20 and -21.

Hmm… Good point but it means ORDER BY <-> is broken for these types then?
The patch assumes it works correctly and just uses it for ordered scans.

—
Michal


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Parallel query behaving different with custom GUCs
Следующее
От: Robert Haas
Дата:
Сообщение: Re: POC, WIP: OR-clause support for indexes