Обсуждение: Index Usage using IN

Поиск
Список
Период
Сортировка

Index Usage using IN

От
Ralph Mason
Дата:
Hi,

I have 2 tables both have an index on ID (both ID columns are an oid).

I want to find only only rows in one and not the other.

Select ID from TableA where ID not IN ( Select ID from Table B)

This always generates sequential scans.

Table A has about 250,000 rows.   Table B has about 250,000 Rows.

We should get a Scan on Table B and a Index Lookup on Table A.

Is there any way to force this?  enable_seqscan off doesn't help at all.

The Plan is

Seq Scan on tablea(cost=100000000.00..23883423070450.96 rows=119414 width=4)
  Filter: (NOT (subplan))"
  SubPlan    ->
     Seq Scan on tableb (cost=100000000.00..100004611.17 rows=242617
width=4)


Thanks
Ralph




Re: Index Usage using IN

От
"Jeffrey W. Baker"
Дата:
On Thu, 2006-02-02 at 09:12 +1300, Ralph Mason wrote:
> Hi,
>
> I have 2 tables both have an index on ID (both ID columns are an oid).
>
> I want to find only only rows in one and not the other.
>
> Select ID from TableA where ID not IN ( Select ID from Table B)

Have you considered this:

SELECT ID from TableA EXCEPT Select ID from Table B

?

-jwb


Re: Index Usage using IN

От
"Jeffrey W. Baker"
Дата:
On Wed, 2006-02-01 at 12:22 -0800, Jeffrey W. Baker wrote:
> On Thu, 2006-02-02 at 09:12 +1300, Ralph Mason wrote:
> > Hi,
> >
> > I have 2 tables both have an index on ID (both ID columns are an oid).
> >
> > I want to find only only rows in one and not the other.
> >
> > Select ID from TableA where ID not IN ( Select ID from Table B)
>
> Have you considered this:
>
> SELECT ID from TableA EXCEPT Select ID from Table B

Alternately:

   SELECT a.ID
     FROM TableA AS a
LEFT JOIN TableB AS b
       ON a.ID = b.ID
    WHERE b.ID IS NULL

-jwb

Re: Index Usage using IN

От
Tom Lane
Дата:
"Jeffrey W. Baker" <jwbaker@acm.org> writes:
> On Thu, 2006-02-02 at 09:12 +1300, Ralph Mason wrote:
>> Select ID from TableA where ID not IN ( Select ID from Table B)

> Have you considered this:

> SELECT ID from TableA EXCEPT Select ID from Table B

Also, increasing work_mem might persuade the planner to try a hashed
subplan, which'd be a lot better than what you have.  Note that it's
quite unlikely that indexes are going to help for this.

            regards, tom lane

Re: Index Usage using IN

От
Bruno Wolff III
Дата:
On Thu, Feb 02, 2006 at 09:12:59 +1300,
  Ralph Mason <ralph.mason@telogis.com> wrote:
> Hi,
>
> I have 2 tables both have an index on ID (both ID columns are an oid).
>
> I want to find only only rows in one and not the other.
>
> Select ID from TableA where ID not IN ( Select ID from Table B)
>
> This always generates sequential scans.
>
> Table A has about 250,000 rows.   Table B has about 250,000 Rows.
>
> We should get a Scan on Table B and a Index Lookup on Table A.

I don't think that is going to work if there are NULLs in table B.
I don't know whether or not Postgres has code to special case NULL testing
(either for constraints ruling them out, or doing probes for them in addition
to the key it is trying to match) for doing NOT IN. Just doing a simple
index probe into table A isn't going to tell you all you need to know if
you don't find a match.

Re: Index Usage using IN

От
Hari Warrier
Дата:
Select ID from TableA where not exists ( Select ID from Table B where ID
= TableA.ID)
might give you index scan. Of course, that is only useful is TableA is
very small table.
Not appropriate for 250k rows

on 2/1/2006 12:12 PM Ralph Mason said the following:
> Hi,
>
> I have 2 tables both have an index on ID (both ID columns are an oid).
>
> I want to find only only rows in one and not the other.
>
> Select ID from TableA where ID not IN ( Select ID from Table B)
>
> This always generates sequential scans.
>
> Table A has about 250,000 rows.   Table B has about 250,000 Rows.
>
> We should get a Scan on Table B and a Index Lookup on Table A.
>
> Is there any way to force this?  enable_seqscan off doesn't help at all.
>
> The Plan is
>
> Seq Scan on tablea(cost=100000000.00..23883423070450.96 rows=119414
> width=4)
>  Filter: (NOT (subplan))"
>  SubPlan    ->     Seq Scan on tableb (cost=100000000.00..100004611.17
> rows=242617 width=4)
>
>
> Thanks
> Ralph
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>