Re: NOT IN subquery optimization

Поиск
Список
Период
Сортировка
От David Rowley
Тема Re: NOT IN subquery optimization
Дата
Msg-id CAKJS1f9GndGjLQE5yqkwXXxHBei2=uFQnHa40q5iQA=pC9U4bA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: NOT IN subquery optimization  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: NOT IN subquery optimization
Список pgsql-hackers
On Sun, 3 Mar 2019 at 05:25, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmm ... thinking about the strictness angle some more: what we really
> need to optimize NOT IN, IIUC, is an assumption that the join operator
> will never return NULL.  While not having NULL inputs is certainly a
> *necessary* condition for that (assuming a strict operator) it's not a
> *sufficient* condition.  Any Postgres function/operator is capable
> of returning NULL whenever it feels like it.  So checking strictness
> does not lead to a mathematically correct optimization.

That's something I didn't think of.

> My initial thought about plugging that admittedly-academic point is
> to insist that the join operator be both strict and a member of a
> btree opclass (hash might be OK too; less sure about other index types).
> The system already contains assumptions that btree comparators never
> return NULL.  I doubt that this costs us any real-world applicability,
> because if the join operator can neither merge nor hash, we're screwed
> anyway for finding a join plan that's better than nested-loop.

Why strict? If both inputs are non-NULL, then what additional
guarantees does strict give us?

I implemented a btree opfamily check in my version of the patch. Not
so sure about hash, can you point me in the direction of a mention of
how this is guarantees for btree?

The attached v1.2 does this adds a regression test using the LINE
type. This has an operator named '=', but no btree opfamily. A few
other types are in this boat too, per:

select typname from pg_type t where not exists(select 1 from pg_amop
where amoplefttype = t.oid and amopmethod=403) and exists (select 1
from pg_operator where oprleft = t.oid and oprname = '=');

The list of builtin types that have a hash opfamily but no btree
opfamily that support NOT IN are not very exciting, so doing the same
for hash might not be worth the extra code.

select typname from pg_type t where exists(select 1 from pg_amop where
amoplefttype = t.oid and amopmethod=405) and exists (select 1 from
pg_operator where oprleft = t.oid and oprname = '=') and not
exists(select 1 from pg_amop where amoplefttype = t.oid and
amopmethod=403);
 typname
---------
 xid
 cid
 aclitem
(3 rows)

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: Online verification of checksums
Следующее
От: Tom Lane
Дата:
Сообщение: Re: NOT IN subquery optimization