Re: NOT IN subquery optimization

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: NOT IN subquery optimization
Дата
Msg-id 7771.1551586295@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: NOT IN subquery optimization  (David Rowley <david.rowley@2ndquadrant.com>)
Ответы Re: NOT IN subquery optimization
Список pgsql-hackers
David Rowley <david.rowley@2ndquadrant.com> writes:
> On Sun, 3 Mar 2019 at 05:25, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> My initial thought about plugging that admittedly-academic point is
>> to insist that the join operator be both strict and a member of a
>> btree opclass (hash might be OK too; less sure about other index types).

> Why strict? If both inputs are non-NULL, then what additional
> guarantees does strict give us?

Yeah, if we're verifying the inputs are non-null, I think that probably
doesn't matter.

> I implemented a btree opfamily check in my version of the patch. Not
> so sure about hash, can you point me in the direction of a mention of
> how this is guarantees for btree?

https://www.postgresql.org/docs/devel/btree-support-funcs.html
quoth

    The comparison function must take two non-null values A and B and
    return an int32 value that is < 0, 0, or > 0 when A < B, A = B, or A >
    B, respectively. A null result is disallowed: all values of the data
    type must be comparable.

(At the code level, this is implicit in the fact that the comparison
function will be called via FunctionCall2Coll or a sibling, and those
all throw an error if the called function returns NULL.)

Now, it doesn't say in so many words that the comparison operators
have to yield results consistent with the comparison support function,
but I think that's pretty obvious ...

For hash, the equivalent constraint is that the hash function has to
work for every possible input value.  I suppose it's possible that
the associated equality operator would sometimes return null, but
it's hard to think of a practical reason for doing so.

I've not dug in the code, but I wouldn't be too surprised if
nodeMergejoin.c or nodeHashjoin.c, or the stuff related to hash
grouping or hash aggregation, also contain assumptions about
the equality operators not returning null.

> The list of builtin types that have a hash opfamily but no btree
> opfamily that support NOT IN are not very exciting, so doing the same
> for hash might not be worth the extra code.

Agreed for builtin types, but there might be some extensions out there
where this doesn't hold.  It's not terribly hard to imagine a data type
that hasn't got a linear sort order but is amenable to hashing.
(The in-core xid type is an example, actually.)

            regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Re: NOT IN subquery optimization
Следующее
От: Alexander Korotkov
Дата:
Сообщение: Re: WIP: BRIN multi-range indexes