> Bruce Momjian wrote:
> >
> > In this QUERY:
> >
> > SELECT keyname
> > FROM markmain
> > WHERE mark_id NOT IN(SELECT mark_id
> > FROM markaty)
> >
> > I have an index on markaty.mark_id, and have vacuum analyzed. EXPLAIN
> > shows:
> >
> > Seq Scan on markmain (cost=2051.43 size=45225 width=12)
> > SubPlan
> > -> Seq Scan on markaty (cost=2017.41 size=52558 width=4)
> >
> > Vadim, why isn't this using the index? Each table has 50k rows. Is it
> > NOT IN that is causing the problem? IN produces the same plan, though.
> ....
> >
> > Seems the optimizer could either hash the subquery, or us an index.
> > Certainly would be faster than a sequental scan, no?
>
> Optimizer should hash the subquery, but I didn't implement this -:(
> Try to rewrite query using NOT EXISTS and index will be used.
How hard would it be to implement it? I know you are deep into MVCC,
but doing a nested loop to join a subquery is really bad.
Now, in our defense, I tried this with commercial Ingres 6.4, and it
took so long I copied the data into PostgreSQL and tried to run it
there. Eventually, I copied the data into a second table, and did a
DELETE FROM using two tables in the WHERE clause, and the rows left
where my NOT IN result. It did use a hash join in that case.
Obviously, Ingres was doing a nested loop do, but I want to do better
than Ingres.
I think we really need to get that hash enabled. Is there something I
can do to enable it, or can I do something to help you enable it?
All queries can't be rewritten as EXISTS.
-- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610)
853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill,
Pennsylvania19026