Re: EXISTS vs IN vs OUTER JOINS

Поиск
Список
Период
Сортировка
От Josh Berkus
Тема Re: EXISTS vs IN vs OUTER JOINS
Дата
Msg-id web-2291801@davinci.ethosmedia.com
обсуждение исходный текст
Ответ на EXISTS vs IN vs OUTER JOINS  (Tomasz Myrta <jasiek@klaster.net>)
Ответы Re: EXISTS vs IN vs OUTER JOINS  (jasiek@klaster.net)
Re: EXISTS vs IN vs OUTER JOINS  (Joe Conway <mail@joeconway.com>)
Список pgsql-performance
Tomasz,

> Few days ago I read, that EXISTS is better than IN, but only if there
> are many records (how many?). I was wondering which one is better and
> when. Did anyone try to compare these queries doing the same work:
>
> - select * from some_table t
>     where t.id [not] in (select id from filter);
> -select * from some_table t
>     where [not] exists (select * from filter where id=t.id);

The rule I use is: if I expect the sub-select to return more than 12
records 20% or more of the time, use EXISTS.   The speed gain for IN on
small lists is not as dramatic as the speed loss for EXISTS on large
lists.

More importantly, the difference between NOT IN and NOT EXISTS can be
as much as 20:1 on large sub-selects, as opposed to IN and EXISTS,
where I have rarely seen a difference of more than 3:1.  As I
understand it, this is because NOT EXISTS can use optimized join
algorithms to locate matching rows, whereas NOT IN must compare each
row against every possible matching value in the subselect.

It also makes a difference whether or not the referenced field(s) in
the subselect is indexed.   EXISTS will often use an index to compare
the values in the master query to the sub-query.  As far as I know, IN
can use an index to retrieve the subquery values, but not to sort or
compare them after they have been retreived into memory.

> -select * from some_table t
>    left join filter f using (id)
>   where f.id is [not] null;

This will not get you the same result as the above.  It will get you
all rows from t+f where a record is present in f and has (or does not
have) a NULL value for f.id.   While this type of query works in MS
Access, it will not work in SQL92/99-commpliant databases.

Incidentally, the dramatic differences between IN and EXISTS are not
only a "PostgreSQL Thing".   The same rules apply to  MS SQL Server and
SQL Anywhere, for the same reasons.

-Josh Berkus


В списке pgsql-performance по дате отправления:

Предыдущее
От: Tomasz Myrta
Дата:
Сообщение: EXISTS vs IN vs OUTER JOINS
Следующее
От: jasiek@klaster.net
Дата:
Сообщение: Re: EXISTS vs IN vs OUTER JOINS