Обсуждение: "NOT IN" predicate hangs result

Поиск
Список
Период
Сортировка

"NOT IN" predicate hangs result

От
"Rick Szeto"
Дата:
Hi all,
   I recently tried this select statement and it hung my tool:

1)   select count(*) from user_profile where address_id not in (select address_id from address);

Originally, I just thought that it was slow, so I left it over night and when I can back the next morning it
just hung there(eating up CPU cycles). I looked in Celko's book and then tried this and it worked(quite fast):

2)   select count(*) from user_profile up where not exists (select * from address addr where up.address_id =
addr.address_id);

Is this a known problem?

Thanks
Rick


Re: "NOT IN" predicate hangs result

От
wsheldah@lexmark.com
Дата:

It is generally known that using EXISTS will be much faster than IN for the sort
of query you describe. Seems to come up regularly on this very list in fact.

Wes




"Rick Szeto" <rszeto%csi.ca@interlock.lexmark.com> on 04/11/2002 12:01:29 PM

To:   pgsql-general%postgresql.org@interlock.lexmark.com
cc:    (bcc: Wesley Sheldahl/Lex/Lexmark)
Subject:  [GENERAL] "NOT IN" predicate hangs result


Hi all,
   I recently tried this select statement and it hung my tool:

1)   select count(*) from user_profile where address_id not in (select
address_id from address);

Originally, I just thought that it was slow, so I left it over night and when I
can back the next morning it
just hung there(eating up CPU cycles). I looked in Celko's book and then tried
this and it worked(quite fast):

2)   select count(*) from user_profile up where not exists (select * from
address addr where up.address_id = addr.address_id);

Is this a known problem?

Thanks
Rick


---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org





Re: "NOT IN" predicate hangs result

От
Neil Conway
Дата:
On Thu, 11 Apr 2002 12:01:29 -0400
"Rick Szeto" <rszeto@csi.ca> wrote:
> Hi all,
>    I recently tried this select statement and it hung my tool:
>
> 1)   select count(*) from user_profile where address_id not in (select address_id from address);
>
> Originally, I just thought that it was slow, so I left it over night and when I can back the next morning it
> just hung there(eating up CPU cycles). I looked in Celko's book and then tried this and it worked(quite fast):
>
> 2)   select count(*) from user_profile up where not exists (select * from address addr where up.address_id =
addr.address_id);
>
> Is this a known problem?

Yes; IN doesn't hang Postgres, it is just very slow. It is documented here:

    http://www.ca.postgresql.org/docs/faq-english.html#4.22

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC

Re: "NOT IN" predicate hangs result

От
Stephan Szabo
Дата:
On Thu, 11 Apr 2002, Rick Szeto wrote:

> Hi all,
>    I recently tried this select statement and it hung my tool:
>
> 1)  select count(*) from user_profile where address_id not in (select
> address_id from address);
>
> Originally, I just thought that it was slow, so I left it over night
> and when I can back the next morning it just hung there(eating up CPU
> cycles). I looked in Celko's book and then tried this and it
> worked(quite fast):
>
> 2)  select count(*) from user_profile up where not exists (select *
> from address addr where up.address_id = addr.address_id);
>
> Is this a known problem?

It probably would have finished eventually, but right now NOT IN has
serious performance problems.  I believe that it's running the inner
select once for every row in user_profile or something to that effect.
And it's a known deficiency, it's even got its own FAQ question.



Re: "NOT IN" predicate hangs result

От
"Rick Szeto"
Дата:
Thanks Stephan,
  For an operation with 330k entries in the inner select (and 310k entries on the outer table), it would cause a
major problem. As long as it is a know problem, and there is a work around, this is not a problem.

Thanks for the response(s).
Rick

>>> Stephan Szabo <sszabo@megazone23.bigpanda.com> 04/11/02 12:42PM >>>

On Thu, 11 Apr 2002, Rick Szeto wrote:

> Hi all,
>    I recently tried this select statement and it hung my tool:
>
> 1)  select count(*) from user_profile where address_id not in (select
> address_id from address);
>
> Originally, I just thought that it was slow, so I left it over night
> and when I can back the next morning it just hung there(eating up CPU
> cycles). I looked in Celko's book and then tried this and it
> worked(quite fast):
>
> 2)  select count(*) from user_profile up where not exists (select *
> from address addr where up.address_id = addr.address_id);
>
> Is this a known problem?

It probably would have finished eventually, but right now NOT IN has
serious performance problems.  I believe that it's running the inner
select once for every row in user_profile or something to that effect.
And it's a known deficiency, it's even got its own FAQ question.




Re: "NOT IN" predicate hangs result

От
Andrew Sullivan
Дата:
On Thu, Apr 11, 2002 at 09:42:59AM -0700, Stephan Szabo wrote:

> It probably would have finished eventually, but right now NOT IN has
> serious performance problems.  I believe that it's running the inner
> select once for every row in user_profile or something to that effect.
> And it's a known deficiency, it's even got its own FAQ question.

Just out of curiosity, is there a fundamental reason the optimiser
doesn't just rewrite NOT IN queries to use the NOT EXISTS form, or is
it simply that, since there's a well-known workaround, nobody's
bothered to improve this area?

A

--
----
Andrew Sullivan                               87 Mowat Avenue
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M6K 3E3
                                         +1 416 646 3304 x110


Re: "NOT IN" predicate hangs result

От
Tom Lane
Дата:
Andrew Sullivan <andrew@libertyrms.info> writes:
> Just out of curiosity, is there a fundamental reason the optimiser
> doesn't just rewrite NOT IN queries to use the NOT EXISTS form,

There are cases where the NOT IN implementation is better; I'd not like
to see the system forcing one approach to be used.

In the long run we should try to fix IN/NOT IN/EXISTS/NOT EXISTS to be
treated as variant kinds of join, at least for the simple cases that
people tend to use a lot.  This has been on the TODO list for awhile
but hasn't risen to the top.

            regards, tom lane