Re: IN or EXISTS

Поиск

Список

Период

Сортировка

От	Jeff Davis
Тема	Re: IN or EXISTS
Дата	22 сентября 2011 г. 22:32:13
Msg-id	1316741520.31091.7.camel@sussancws0025 обсуждение исходный текст
Ответ на	Re: IN or EXISTS (Craig Ringer <ringerc@ringerc.id.au>)
Список	pgsql-performance

Дерево обсуждения

On Wed, 2011-08-31 at 09:33 +0800, Craig Ringer wrote:
> On the other hand, the `IN' subquery is uncorrelated needs only run
> once, where the `EXISTS' subquery is correlated and has to run once for
> every outer record.

If the EXISTS looks semantically similar to an IN (aside from NULL
semantics), then it can be made into a semijoin. It doesn't require
re-executing any part of the plan.

I don't think there are any cases where [NOT] IN is an improvement, am I
mistaken?

> Another complication is the possible presence of NULL in an IN list.
> Getting NULLs in `IN' lists is a common source of questions on this
> list, because people are quite surprised by how it works. EXISTS avoids
> the NULL handling issue (and in the process demonstrates how woefully
> inconsistent SQL's handling of NULL really is).

Absolutely. The NULL behavior of IN is what makes it hard to optimize,
and therefore you should use EXISTS instead if the semantics are
suitable.

> Theoretically the query planner could transform:
>
> SELECT * from y WHERE y.id IN (SELECT DISTINCT z.y_id FROM z WHERE
> z.y_id IS NOT NULL);
>
> into:
>
> SELECT * FROM y WHERE EXISTS (SELECT 1 FROM z WHERE z.y_id = y.id)
>
> ... or vice versa depending on which it thought would be faster.

Although those two queries are semantically the same (I think), a lot of
very similar pairs of queries are not equivalent. For instance, if it
was a NOT IN you couldn't change that to a NOT EXISTS.

Regards,
    Jeff Davis

В списке pgsql-performance по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: IN or EXISTS