Re: dumb question

Поиск
Список
Период
Сортировка
От David G. Johnston
Тема Re: dumb question
Дата
Msg-id CAKFQuwYgCb6+v8T82yunGiAmwstSDgwDYTdajXFB-9VHannPFw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: dumb question  (Kevin Grittner <kgrittn@gmail.com>)
Список pgsql-general
On Thu, Jun 2, 2016 at 5:44 PM, Kevin Grittner <kgrittn@gmail.com> wrote:
On Thu, Jun 2, 2016 at 3:23 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> On Thu, Jun 2, 2016 at 4:11 PM, John R Pierce <pierce@hogranch.com> wrote:
>> Thanks all the below seem to do the trick.

I doubt it -- using NOT IN requires (per the SQL specification)
handling NULLs in a way that probably does not give you the answer
you want.  (NOT IN also is often much slower than the NOT EXISTS
test which will actually give you the answer you want.)

test=# create table t (id int not null primary key, ref_id int, sts
int not null default 0);
CREATE TABLE
test=# insert into t values
(1,null,0),(2,1,1),(3,null,0),(4,null,0),(5,4,1),(6,null,0),(7,6,1);
INSERT 0 7
test=# select max(id) from t where sts=0 and id not in (select ref_id from t);
 max
-----

(1 row)

test=# select max(id) from t t1 where sts = 0 and not exists (select *
from t t2 where t2.ref_id = t1.id);
 max
-----
   3
(1 row)

Note that providing minimal setup (like the above) helps in getting
good answers quickly.

>> do note, this is whats known as an 'anti-join', and these can be pretty
>> expensive on large tables.
>
> +1

*Can* be.  Proper indexing can make them very reasonable.

​Doh (me)...

Indeed, NOT IN (...) doesn't qualify as an anti-join since (for one) it cannot (I don't think) be optimized in this way as the entire contents of the IN() need to be determined.  IOW, its not really a join but just another predicate condition whose one side is a subquery.

That said, writing out a full anti-join NOT EXISTS (or, similarly, a semi-join EXISTS) clause can be a bit tedious for ad-hoc stuff while the IN() variation is a bit more succinct and, I'd venture to say, unfortunately familiar.  In can be made to work in this situation by writing the expression as IN (SELECT ref_id FROM t WHERE ref_id IS NOT NULL).

David J.

В списке pgsql-general по дате отправления:

Предыдущее
От: "David G. Johnston"
Дата:
Сообщение: Re: PL/PGSQL + inserts+updates+limit - Postgres 9.3
Следующее
От: Merlin Moncure
Дата:
Сообщение: Re: RowDescription via the SQL?