Consider this setup - which is a gross simplification of parts of our
production system ;-)
create table c (id integer primary key);
create table b (id integer primary key, c_id integer);
create index b_on_c on b(c_id)
insert into c (select ... lots of IDs ...);
insert into b (select id, id from c); /* keep it simple :-) */
Now, I'm just interessted in some few rows.
All those gives good plans:
explain select c.id from c order by c.id limit 1;
explain select c.id from c group by c.id order by c.id limit 1;
explain select c.id from c join b on c_id=c.id order by c.id limit 1;
... BUT ... combining join, group and limit makes havoc:
explain select c.id from c join b on c_id=c.id group by c.id order by c.id
desc limit 5;
QUERY PLAN
-------------------------------------------------------------------------------------
Limit (cost=3809.65..3809.67 rows=5 width=4)
-> Group (cost=3809.65..3940.59 rows=26187 width=4)
-> Sort (cost=3809.65..3875.12 rows=26188 width=4)
Sort Key: c.id
-> Hash Join (cost=559.34..1887.89 rows=26188 width=4)
Hash Cond: ("outer".id = "inner".c_id)
-> Seq Scan on c (cost=0.00..403.87 rows=26187 width=4)
-> Hash (cost=403.87..403.87 rows=26187 width=4)
-> Seq Scan on b (cost=0.00..403.87 rows=26187 width=4)
(9 rows)
I get the same behaviour on pg 7.4.7 and pg 8.0.2. Of course, I can
probably use subqueries instead of join - though, I would have wished the
planner could do better ;-)
--
Notice of Confidentiality: This information may be confidential, and
blah-blah-blah - so please keep your eyes closed. Please delete and destroy
this email. Failure to comply will cause my lawyer to yawn.