Re: exists
От | Tom Lane |
---|---|
Тема | Re: exists |
Дата | |
Msg-id | 10701.998427822@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: exists (Joseph Shraibman <jks@selectacast.net>) |
Список | pgsql-sql |
Joseph Shraibman <jks@selectacast.net> writes: > Then why does the explain say rows=1363 ? That's the estimate of how many rows the inner SELECT would return, if left free to return them all. You should get the same row count estimate (though quite possibly a different plan) if you just do an EXPLAIN of the sub-select that you have inside EXISTS. Of course you can't easily do that if the sub-select has outer references, but consider this simplified example from the regression database: regression=# explain select * from tenk1; NOTICE: QUERY PLAN: Seq Scan on tenk1 (cost=0.00..333.00 rows=10000 width=148) EXPLAIN regression=# explain select * from tenk1 where unique1 > 9000; NOTICE: QUERY PLAN: Seq Scan on tenk1 (cost=0.00..358.00 rows=1045 width=148) EXPLAIN regression=# explain select unique2, regression-# exists(select * from tenk1 where unique1 > 9000), regression-# unique1 from tenk1; NOTICE: QUERY PLAN: Seq Scan on tenk1 (cost=0.00..333.00 rows=10000 width=8) InitPlan -> Seq Scan on tenk1 (cost=0.00..358.00 rows=1045width=148) EXPLAIN regression=# explain select unique2, regression-# exists(select * from tenk1 where unique1 > out1.unique2), regression-# unique1 from tenk1 out1; NOTICE: QUERY PLAN: Seq Scan on tenk1 out1 (cost=0.00..333.00 rows=10000 width=8) SubPlan -> Seq Scan on tenk1 (cost=0.00..358.00 rows=3333width=148) EXPLAIN In the last case the rows count has changed from a statistics-driven estimate to a default estimate because the planner doesn't have any idea what out1.unique2 will be on any given execution of the subplan. > I don't mean to nitpick here, but maybe this is the symptom of a > larger problem. The only "larger problem" I see here is that the planner doesn't bother to estimate the costs of expensive expressions in the SELECT target list; you can see by comparing the above examples that the total cost at the top level doesn't include the obviously-nontrivial cost of evaluating the EXIST expressions. This however is not a bug, but a deliberate simplification to save planning time. The planner cannot alter the number of times the SELECT output expressions are evaluated (at least not if it's delivering the right answer) so there's no point in worrying whether they are expensive or cheap. But it would include the costs of an EXISTS appearing in WHERE. regards, tom lane
В списке pgsql-sql по дате отправления: