Обсуждение: 8.4/9.0 simple query performance regression
All,
Just got this simple case off IRC today:
8.4.4
This plan completes in 100ms:
old_prod=# explain analyze select email from u_contact where id not in
(select contact_id from u_user);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Seq Scan on u_contact (cost=2217.72..4759.74 rows=35560 width=22)
(actual time=61.283..107.169 rows=4521 loops=1)
Filter: (NOT (hashed SubPlan 1))
SubPlan 1
-> Seq Scan on u_user (cost=0.00..2051.38 rows=66538 width=8)
(actual time=0.034..33.303 rows=66978 loops=1)
Total runtime: 108.001 ms
9.0.2
This plan does not complete in 15 minutes or more:
new_prod=# explain select email from u_contact where id not in (select
contact_id from u_user);
QUERY PLAN
---------------------------------------------------------------------------
Seq Scan on u_contact (cost=0.00..100542356.74 rows=36878 width=22)
Filter: (NOT (SubPlan 1))
SubPlan 1
-> Materialize (cost=0.00..2552.56 rows=69504 width=8)
-> Seq Scan on u_user (cost=0.00..1933.04 rows=69504 width=8)
(5 rows)
I'm at a bit of a loss as to what's happening here. I'd guess another
failure of a bail-out-early plan, but I can't see how that would work
with this query.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Josh Berkus <josh@agliodbs.com> writes:
> Just got this simple case off IRC today:
> [ hashed versus non-hashed subplan ]
> I'm at a bit of a loss as to what's happening here.
Possibly work_mem is smaller in the second installation?
(If I'm counting on my fingers right, you'd need a setting of at least a
couple MB to let it choose a hashed subplan for this case.)
regards, tom lane
07.06.11 00:45, Josh Berkus написав(ла): > All, > > Just got this simple case off IRC today: > > 8.4.4 > This plan completes in 100ms: > Filter: (NOT (hashed SubPlan 1)) > 9.0.2 > This plan does not complete in 15 minutes or more: > Filter: (NOT (SubPlan 1)) "Hashed" is the key. Hashed subplans usually has much better performance. You need to increase work_mem. I suppose it is in default state as you need not too much memory for hash of 70K integer values. BTW: Why do it want to materialize a result of seq scan without filter. I can see no benefits (or is it more narrow rows?) Best regards, Vitalii Tymchyshyn