Обсуждение: Re: btree_gin, bigint and number literals

Поиск
Список
Период
Сортировка

Re: btree_gin, bigint and number literals

От
"Quentin de Metz"
Дата:
Hello Tom,

I see that you have commited a change (e2b64fcef35f70f96fa92db56fbfa9ac2da136c7) which addresses this issue. Thank
you!

I looked into this issue recently and still don't understand why this would not work for other hardware variants. Will
ityield the wrong plan, or will the plan's execution yield wrong results?
 

I'm surprised because the SQL changes I proposed seemed relatively aligned with the existing extension source code
whichreferences support functions defined in code related to btree indexes (e.g. btint2cmp). These functions are
alreadyhardware-independent. Aren't these functions the ones called by the engine when executing a query and going
throughthe index - as explained here (https://www.postgresql.org/docs/18/xindex.html#XINDEX-OPFAMILY)?
 

Also there a specific reason the integer-related operator classes defined in btree_gin (int2_ops, int4_ops, int8_ops)
don'tbelong to the same operator family? That seems to be the direction suggested by the documentation I linked to
above.

Finally, what is your approach to testing on 32-bit or big-endian hardware?

Thank you for your guidance,

Quentin


On Fri, Jan 31, 2025, at 21:42, Tom Lane wrote:
> "Quentin de Metz" <quentin@de.me.tz> writes:
>> On a multi-column GIN index over a bigint column and a text column, the query planner does not filter the index on
thebigint column when a condition on this column is specified with a number literal.
 
>
> Yeah, because "owner_id = 12" will use int84eq, which as you observe
> is not supported by btree_gin's opclass.
>
>> Would you be open to considering a patch to include the ALTER OPERATOR snippet in the btree_gin install script, so
thatthis works out of the box?
 
>
> I'd be quite surprised if that "just works" without any corresponding
> changes in the C code, because btree_gin.c only knows about applying
> same-type-on-both-sides comparison functions.  (int8 vs int4 might
> appear to work as long as you don't try very hard, but for example
> it'd fail on 32-bit or big-endian hardware.)  If you feel like writing
> a patch that actually takes care of the matter fully, step right up.
>
>             regards, tom lane



Re: btree_gin, bigint and number literals

От
Tom Lane
Дата:
"Quentin de Metz" <quentin@de.me.tz> writes:
> I looked into this issue recently and still don't understand why
> this would not work for other hardware variants. Will it yield the
> wrong plan, or will the plan's execution yield wrong results?

Your original patch will yield wrong results in some cases, up to and
including crashing.  That's because the representations of different
datatypes as Datums aren't guaranteed compatible.

> I'm surprised because the SQL changes I proposed seemed relatively
> aligned with the existing extension source code which references
> support functions defined in code related to btree indexes
> (e.g. btint2cmp). These functions are already hardware-independent.

Um ... not really.  If you dig down into what btint8cmp does, for
example, it's different on 32-bit hardware than 64-bit hardware.
We do our best to hide that behind macros in the source code, but
that doesn't mean the difference isn't there.  So you can't just
blindly apply btint8cmp to int4 or int2 values.

You might find this thread informative:

https://www.postgresql.org/message-id/flat/1749799.1752797397@sss.pgh.pa.us

as it's proposing to remove one of the reasons why there's a problem.
But even if that goes in, I still would not trust your original patch.
There's no code-level guarantee that treating a Datum of one type as
being of some other type is okay.

> Also there a specific reason the integer-related operator classes
> defined in btree_gin (int2_ops, int4_ops, int8_ops) don't belong to
> the same operator family? That seems to be the direction suggested
> by the documentation I linked to above.

Whoever invented the btree_gin extension didn't make them that way.
It might be that btree_gin actually predates our invention of operator
families, not sure.  Anyway we don't have an easily-compatible way
to rearrange them into a single family, and there wouldn't be that
much benefit in doing so.  (We invented operator families mostly
to allow the planner to reason about cross-type comparisons, and
it only cares about btree and hash families.)

> Finally, what is your approach to testing on 32-bit or big-endian hardware?

Well, I've still got an ancient MacBook (32-bit PPC), and when that
isn't suitable I have an account with the GCC compile farm:

https://portal.cfarm.net

In principle you could also use a QEMU VM, but I've found that its
emulation of other architectures frequently leaves a lot to be
desired.

Of course, our own buildfarm also provides testing of such
architectures, but we try hard to iron out portability problems
before patches get to the buildfarm.

            regards, tom lane