Better management of mergejoinable operators

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Better management of mergejoinable operators
Дата
Msg-id 27157.1165964196@sss.pgh.pa.us
обсуждение исходный текст
Ответы Re: Better management of mergejoinable operators  (Michael Glaesemann <grzm@seespotcode.net>)
Список pgsql-hackers
I noticed today that process_implied_equality() still contains an ugly
hack that should have been got rid of awhile ago: it assumes that
mergejoinable operators are named "=".  This has been a bogus assumption
for several releases, as illustrated by this failure:

regression=# select * from text_tbl a,text_tbl b,text_tbl c where a.f1 ~=~ b.f1 and b.f1 ~=~ c.f1;
ERROR:  equality operator for types text and text should be merge-joinable, but isn't

It can also be fooled by schema-search-path issues, if the needed
operator exists but isn't in the path.  Since we've not heard complaints
from the field about this, I'm not feeling urgent about having a
back-patchable solution, but I want to find one going forward.

What is actually needed in this function is to be able to find a
mergejoinable equality operator whose oprlsortop and oprrsortop are the
two sortops already known for the input pathkey columns.  We have a
couple of problems doing that though: first, with the present system
catalog layout there seems no way to do that short of a seqscan through
all of pg_operator; and second, what if there's not a unique answer,
ie, multiple equality operators alleging the same lsortop/rsortop?

Right offhand I cannot see a reason why there should be different
equality operators with the same sortops.  (If anyone can come up with
a plausible scenario for that, stop me here...)  So what I'm thinking
about is a unique index on oprlsortop/oprrsortop; that would both allow
efficient search, and prevent multiple answers.

Now we can't do that directly because most of the entries in pg_operator
in fact contain zeroes in these columns, and would cause uniqueness
failures.  Probably the cleanest answer would be to allow these two
columns to be NULL, not zero, when not meaningful; but that would be a
bit of a mess to implement because of the code's assumption of fixed
layout for pg_operator tuples.

What I'm considering doing is moving the oprlsortop/oprrsortop/
oprltcmpop/oprgtcmpop fields out of pg_operator and into a new auxiliary
catalog, named say pg_mergejoinop, that would have entries only for
mergejoinable equality operators.  This would have the same kind of
relationship to pg_operator that pg_aggregate has to pg_proc: if a
pg_operator entry has "oprcanmerge" true, then there's an extension
row for it in pg_mergejoinop.  The catalog would be fairly small and
cheap to search (48 entries in a default install, as of CVS head),
and could support a unique index to constrain the oprlsortop/oprrsortop
columns.

Comments?
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Martijn van Oosterhout
Дата:
Сообщение: Re: psql commandline conninfo
Следующее
От: Tom Lane
Дата:
Сообщение: Re: psql commandline conninfo