Re: exists

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: exists
Дата
Msg-id 10701.998427822@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: exists  (Joseph Shraibman <jks@selectacast.net>)
Список pgsql-sql
Joseph Shraibman <jks@selectacast.net> writes:
> Then why does the explain say rows=1363 ?

That's the estimate of how many rows the inner SELECT would return, if
left free to return them all.  You should get the same row count
estimate (though quite possibly a different plan) if you just do
an EXPLAIN of the sub-select that you have inside EXISTS.  Of course you
can't easily do that if the sub-select has outer references, but
consider this simplified example from the regression database:

regression=# explain select * from tenk1;
NOTICE:  QUERY PLAN:

Seq Scan on tenk1  (cost=0.00..333.00 rows=10000 width=148)

EXPLAIN
regression=# explain select * from tenk1 where unique1 > 9000;
NOTICE:  QUERY PLAN:

Seq Scan on tenk1  (cost=0.00..358.00 rows=1045 width=148)

EXPLAIN
regression=# explain select unique2,
regression-# exists(select * from tenk1 where unique1 > 9000),
regression-# unique1 from tenk1;
NOTICE:  QUERY PLAN:

Seq Scan on tenk1  (cost=0.00..333.00 rows=10000 width=8) InitPlan   ->  Seq Scan on tenk1  (cost=0.00..358.00
rows=1045width=148)
 

EXPLAIN
regression=# explain select unique2,
regression-# exists(select * from tenk1 where unique1 > out1.unique2),
regression-# unique1 from tenk1 out1;
NOTICE:  QUERY PLAN:

Seq Scan on tenk1 out1  (cost=0.00..333.00 rows=10000 width=8) SubPlan   ->  Seq Scan on tenk1  (cost=0.00..358.00
rows=3333width=148)
 

EXPLAIN

In the last case the rows count has changed from a statistics-driven
estimate to a default estimate because the planner doesn't have any
idea what out1.unique2 will be on any given execution of the subplan.

> I don't mean to nitpick here, but maybe this is the symptom of a
> larger problem.

The only "larger problem" I see here is that the planner doesn't bother
to estimate the costs of expensive expressions in the SELECT target
list; you can see by comparing the above examples that the total cost
at the top level doesn't include the obviously-nontrivial cost of
evaluating the EXIST expressions.  This however is not a bug, but a
deliberate simplification to save planning time.  The planner cannot
alter the number of times the SELECT output expressions are evaluated
(at least not if it's delivering the right answer) so there's no point
in worrying whether they are expensive or cheap.  But it would include
the costs of an EXISTS appearing in WHERE.
        regards, tom lane


В списке pgsql-sql по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Should I worry?
Следующее
От: Joseph Shraibman
Дата:
Сообщение: Re: exists