От: Vlad Arkhipov
Тема: Re: Optimization idea
Дата: ,
Msg-id: 4BD10805.9060004@dc.baikal.ru
(см: обсуждение, исходный текст)
Ответ на: Re: Optimization idea  (Greg Smith)
Ответы: Re: Optimization idea  (Robert Haas)
Список: pgsql-performance

Скрыть дерево обсуждения

Optimization idea  (Vlad Arkhipov, )
 Re: Optimization idea  (Greg Smith, )
  Re: Optimization idea  (Vlad Arkhipov, )
   Re: Optimization idea  (Robert Haas, )
    Re: Optimization idea  (Cédric Villemain, )
     Re: Optimization idea  (Robert Haas, )
      Re: Optimization idea  (Cédric Villemain, )
       Re: Optimization idea  (Robert Haas, )
        Re: Optimization idea  (Tom Lane, )
         Re: Optimization idea  (Robert Haas, )
     Re: Optimization idea  ("Kevin Grittner", )
    Re: Optimization idea  (Vlad Arkhipov, )
     Re: Optimization idea  (Cédric Villemain, )
      Re: Optimization idea  (Robert Haas, )
       Re: Optimization idea  (Cédric Villemain, )
        Re: Optimization idea  (Vlad Arkhipov, )
       Re: Optimization idea  (Cédric Villemain, )
        Re: Optimization idea  (Cédric Villemain, )
  Re: Optimization idea  (Vlad Arkhipov, )
 Re: Optimization idea  (Robert Haas, )
  Re: Optimization idea  (Cédric Villemain, )

Greg Smith пишет:
> Vlad Arkhipov wrote:
>> Please do this small optimization if it is possible. It seem that the
>> optimizer have the all information to create a fast plan but it does
>> not do that.
>
> This isn't strictly an optimization problem; it's an issue with
> statistics the optimizer has to work with, the ones ANALYZE computes.
> You noticed this yourself:
>
I don't think this is just an issue with statistics, because the same
problem arises when I try executing a query like this:

explain analyze
select *
from t2
where t2.t in (select 2 union select 3 union select 4) /* It works well
if there is only one row in the subquery */

"Hash Semi Join  (cost=0.17..2474.10 rows=60060 width=16) (actual
time=0.032..103.034 rows=400 loops=1)"
"  Hash Cond: (t2.t = (2))"
"  ->  Seq Scan on t2  (cost=0.00..1543.00 rows=100100 width=16) (actual
time=0.007..47.856 rows=100100 loops=1)"
"  ->  Hash  (cost=0.13..0.13 rows=3 width=4) (actual time=0.019..0.019
rows=3 loops=1)"
"        ->  HashAggregate  (cost=0.07..0.10 rows=3 width=0) (actual
time=0.013..0.015 rows=3 loops=1)"
"              ->  Append  (cost=0.00..0.06 rows=3 width=0) (actual
time=0.001..0.007 rows=3 loops=1)"
"                    ->  Result  (cost=0.00..0.01 rows=1 width=0)
(actual time=0.001..0.001 rows=1 loops=1)"
"                    ->  Result  (cost=0.00..0.01 rows=1 width=0)
(actual time=0.000..0.000 rows=1 loops=1)"
"                    ->  Result  (cost=0.00..0.01 rows=1 width=0)
(actual time=0.000..0.000 rows=1 loops=1)"
"Total runtime: 103.244 ms"

vs

explain analyze
select *
from t2
where t2.t in (2, 3, 4)

"Bitmap Heap Scan on t2  (cost=15.53..527.91 rows=357 width=16) (actual
time=0.068..0.255 rows=400 loops=1)"
"  Recheck Cond: (t = ANY ('{2,3,4}'::bigint[]))"
"  ->  Bitmap Index Scan on t_idx  (cost=0.00..15.44 rows=357 width=0)
(actual time=0.056..0.056 rows=400 loops=1)"
"        Index Cond: (t = ANY ('{2,3,4}'::bigint[]))"
"Total runtime: 0.445 ms"

I also tried setting columns' statistics to 10000, nothing happened.
PostgreSQL version is 8.4.2. It sounds good that there is no such issue
on PostgreSQL 9.0, i'll try it on the weekend.


В списке pgsql-performance по дате сообщения:

От: Vlad Arkhipov
Дата:
Сообщение: Re: Optimization idea
От: Robert Haas
Дата:
Сообщение: Re: Optimization idea