Обсуждение: Possible bug in planner (or planner not enough wise in some cases)

Поиск
Список
Период
Сортировка

Possible bug in planner (or planner not enough wise in some cases)

От
"Boguk Maxim"
Дата:
postgres version 8.1

all tables fresh vacuumed/analyzed

Problem table:

somedb=# \d el_comment
                                               Table "public.el_comment"
      Column      |            Type             |                               Modifiers

------------------+-----------------------------+-----------------------------------------------------------------------
 id               | integer                     | not null default nextval(('public.documents_id_seq'::text)::regclass)
 user_id          | integer                     | not null
 text_id          | integer                     | not null
 status           | smallint                    | not null default 0
 parent_id        | integer                     |
Indexes:
    "el_comment_pkey" PRIMARY KEY, btree (id)
    "el_comment_parent_id" btree (parent_id)
    "el_comment_text" btree (text_id)
    "el_comment_user" btree (user_id)
Foreign-key constraints:
    "delete_el_text" FOREIGN KEY (text_id) REFERENCES el_text(id) ON DELETE CASCADE

Problem query:

somedb=# EXPLAIN ANALYZE SELECT count(*) FROM el_comment WHERE ((parent_id IN (SELECT tt.id FROM el_comment as tt WHERE
tt.user_id= 112 AND tt.status=1)) OR (text_id IN (SELECT el_text.id FROM el_text WHERE el_text.user_id = 112))) AND
status=1;
                                                                   QUERY PLAN
      

-------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=80641.51..80641.52 rows=1 width=0) (actual time=13528.870..13528.873 rows=1 loops=1)
   ->  Seq Scan on el_comment  (cost=56.07..80352.97 rows=1154156 width=0) (actual time=113.866..13528.705 rows=15
loops=1)
         Filter: ((status = 1) AND ((hashed subplan) OR (hashed subplan)))
         SubPlan
           ->  Index Scan using el_text_user on el_text  (cost=0.00..15.92 rows=12 width=4) (actual time=0.992..82.397
rows=12loops=1) 
                 Index Cond: (user_id = 112)
           ->  Index Scan using el_comment_user on el_comment tt  (cost=0.00..40.14 rows=28 width=4) (actual
time=8.748..21.661rows=14 loops=1) 
                 Index Cond: (user_id = 112)
                 Filter: (status = 1)
 Total runtime: 13529.189 ms
(10 rows)

Now lets look output of both subqueries:
SELECT tt.id FROM el_comment as tt WHERE tt.user_id = 112 AND tt.status=1:

 2766039
 2766057
 2244101
 1929350
 1929419
 1929439
 1490610
    1052
 2766033
 2421000
 2420878
  611328
    1019
    1646
(14 rows)

and SELECT el_text.id FROM el_text WHERE el_text.user_id = 112

 3758109
   53688
 1947631
 1915372
 1224421
 1011606
   13772
    1017
  463135
  470614
  575691
  916229
(12 rows)

And put these values into query:

planet=# EXPLAIN ANALYZE SELECT count(*) FROM el_comment WHERE ((parent_id IN
(2766039,2766057,2244101,1929350,1929419,1929439,1490610,1052,2766033,2421000,2420878,611328,1019,1646))OR (text_id IN
(3758109,53688,1947631,1915372,1224421,1011606,13772,1017,463135,470614,575691,916229)))AND status=1; 

                                                        QUERY PLAN

 Aggregate  (cost=340.76..340.77 rows=1 width=0) (actual time=9.452..9.453 rows=1 loops=1)
   ->  Bitmap Heap Scan on el_comment  (cost=52.24..340.71 rows=194 width=0) (actual time=5.431..9.269 rows=15 loops=1)
         Recheck Cond: ((parent_id = 2766039) OR (parent_id = 2766057) OR (parent_id = 2244101) OR (parent_id =
1929350)OR (parent_id = 1929419) OR (parent_id = 1929439) OR (parent_id = 1490610) OR (parent_id = 1052) OR (parent_id
=2766033) OR (parent_id = 2421000) OR (parent_id = 2420878) OR (parent_id = 611328) OR (parent_id = 1019) OR (parent_id
=1646) OR (text_id = 3758109) OR (text_id = 53688) OR (text_id = 1947631) OR (text_id = 1915372) OR (text_id = 1224421)
OR(text_id = 1011606) OR (text_id = 13772) OR (text_id = 1017) OR (text_id = 463135) OR (text_id = 470614) OR (text_id
=575691) OR (text_id = 916229)) 
         Filter: (status = 1)
         ->  BitmapOr  (cost=52.24..52.24 rows=194 width=0) (actual time=4.972..4.972 rows=0 loops=1)
               ->  Bitmap Index Scan on el_comment_parent_id  (cost=0.00..2.00 rows=2 width=0) (actual
time=0.582..0.582rows= 
1 loops=1)
                     Index Cond: (parent_id = 2766039)
....
14 same rows
....

               ->  Bitmap Index Scan on el_comment_text  (cost=0.00..2.02 rows=13 width=0) (actual time=0.983..0.983
rows=0loops=1) 
                     Index Cond: (text_id = 3758109)
....
11 same rows
....

 Total runtime: 10.368 ms
(58 rows)

Complete different result (1000x times faster). Issue look like planner can't/dont want try count both subquery's
resultsand use bitmap scan.  
And planner see amount of results from both subqueris small so bitmap scan must be look way better.
That is intended or bug?

PS: i got reasonable fast results via rewrite query as
select count(*) from
(
select t1.id from el_comment as t1 join el_comment as t2 on t1.parent_id=t2.id and t2.user_id=112 and t2.status=1 where
t1.status=1
union
select t1.id from el_comment as t1 join el_text as t2 on t1.text_id=t2.id and t2.user_id=112 and t2.status=1
) as qqq;

but that is just workaround and work 2-5x time slower.


SY Maxim Boguk


Re: Possible bug in planner (or planner not enough wise in some cases)

От
"Jim C. Nasby"
Дата:
Try changing to a just a join and see if it works.

On Mon, Sep 18, 2006 at 05:35:52PM +0400, Boguk Maxim wrote:
> postgres version 8.1
>
> all tables fresh vacuumed/analyzed
>
> Problem table:
>
> somedb=# \d el_comment
>                                                Table "public.el_comment"
>       Column      |            Type             |                               Modifiers
>
------------------+-----------------------------+-----------------------------------------------------------------------
>  id               | integer                     | not null default
nextval(('public.documents_id_seq'::text)::regclass)
>  user_id          | integer                     | not null
>  text_id          | integer                     | not null
>  status           | smallint                    | not null default 0
>  parent_id        | integer                     |
> Indexes:
>     "el_comment_pkey" PRIMARY KEY, btree (id)
>     "el_comment_parent_id" btree (parent_id)
>     "el_comment_text" btree (text_id)
>     "el_comment_user" btree (user_id)
> Foreign-key constraints:
>     "delete_el_text" FOREIGN KEY (text_id) REFERENCES el_text(id) ON DELETE CASCADE
>
> Problem query:
>
> somedb=# EXPLAIN ANALYZE SELECT count(*) FROM el_comment WHERE ((parent_id IN (SELECT tt.id FROM el_comment as tt
WHEREtt.user_id = 112 AND tt.status=1)) OR (text_id IN (SELECT el_text.id FROM el_text WHERE el_text.user_id = 112)))
ANDstatus=1; 
>                                                                    QUERY PLAN
        
>
-------------------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=80641.51..80641.52 rows=1 width=0) (actual time=13528.870..13528.873 rows=1 loops=1)
>    ->  Seq Scan on el_comment  (cost=56.07..80352.97 rows=1154156 width=0) (actual time=113.866..13528.705 rows=15
loops=1)
>          Filter: ((status = 1) AND ((hashed subplan) OR (hashed subplan)))
>          SubPlan
>            ->  Index Scan using el_text_user on el_text  (cost=0.00..15.92 rows=12 width=4) (actual
time=0.992..82.397rows=12 loops=1) 
>                  Index Cond: (user_id = 112)
>            ->  Index Scan using el_comment_user on el_comment tt  (cost=0.00..40.14 rows=28 width=4) (actual
time=8.748..21.661rows=14 loops=1) 
>                  Index Cond: (user_id = 112)
>                  Filter: (status = 1)
>  Total runtime: 13529.189 ms
> (10 rows)
>
> Now lets look output of both subqueries:
> SELECT tt.id FROM el_comment as tt WHERE tt.user_id = 112 AND tt.status=1:
>
>  2766039
>  2766057
>  2244101
>  1929350
>  1929419
>  1929439
>  1490610
>     1052
>  2766033
>  2421000
>  2420878
>   611328
>     1019
>     1646
> (14 rows)
>
> and SELECT el_text.id FROM el_text WHERE el_text.user_id = 112
>
>  3758109
>    53688
>  1947631
>  1915372
>  1224421
>  1011606
>    13772
>     1017
>   463135
>   470614
>   575691
>   916229
> (12 rows)
>
> And put these values into query:
>
> planet=# EXPLAIN ANALYZE SELECT count(*) FROM el_comment WHERE ((parent_id IN
(2766039,2766057,2244101,1929350,1929419,1929439,1490610,1052,2766033,2421000,2420878,611328,1019,1646))OR (text_id IN
(3758109,53688,1947631,1915372,1224421,1011606,13772,1017,463135,470614,575691,916229)))AND status=1; 
>
>                                                         QUERY PLAN
>
>  Aggregate  (cost=340.76..340.77 rows=1 width=0) (actual time=9.452..9.453 rows=1 loops=1)
>    ->  Bitmap Heap Scan on el_comment  (cost=52.24..340.71 rows=194 width=0) (actual time=5.431..9.269 rows=15
loops=1)
>          Recheck Cond: ((parent_id = 2766039) OR (parent_id = 2766057) OR (parent_id = 2244101) OR (parent_id =
1929350)OR (parent_id = 1929419) OR (parent_id = 1929439) OR (parent_id = 1490610) OR (parent_id = 1052) OR (parent_id
=2766033) OR (parent_id = 2421000) OR (parent_id = 2420878) OR (parent_id = 611328) OR (parent_id = 1019) OR (parent_id
=1646) OR (text_id = 3758109) OR (text_id = 53688) OR (text_id = 1947631) OR (text_id = 1915372) OR (text_id = 1224421)
OR(text_id = 1011606) OR (text_id = 13772) OR (text_id = 1017) OR (text_id = 463135) OR (text_id = 470614) OR (text_id
=575691) OR (text_id = 916229)) 
>          Filter: (status = 1)
>          ->  BitmapOr  (cost=52.24..52.24 rows=194 width=0) (actual time=4.972..4.972 rows=0 loops=1)
>                ->  Bitmap Index Scan on el_comment_parent_id  (cost=0.00..2.00 rows=2 width=0) (actual
time=0.582..0.582rows= 
> 1 loops=1)
>                      Index Cond: (parent_id = 2766039)
> ....
> 14 same rows
> ....
>
>                ->  Bitmap Index Scan on el_comment_text  (cost=0.00..2.02 rows=13 width=0) (actual time=0.983..0.983
rows=0loops=1) 
>                      Index Cond: (text_id = 3758109)
> ....
> 11 same rows
> ....
>
>  Total runtime: 10.368 ms
> (58 rows)
>
> Complete different result (1000x times faster). Issue look like planner can't/dont want try count both subquery's
resultsand use bitmap scan.  
> And planner see amount of results from both subqueris small so bitmap scan must be look way better.
> That is intended or bug?
>
> PS: i got reasonable fast results via rewrite query as
> select count(*) from
> (
> select t1.id from el_comment as t1 join el_comment as t2 on t1.parent_id=t2.id and t2.user_id=112 and t2.status=1
wheret1.status=1 
> union
> select t1.id from el_comment as t1 join el_text as t2 on t1.text_id=t2.id and t2.user_id=112 and t2.status=1
> ) as qqq;
>
> but that is just workaround and work 2-5x time slower.
>
>
> SY Maxim Boguk
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>

--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

Re: Possible bug in planner (or planner not enough wise in some cases)

От
"Boguk Maxim"
Дата:
Hm i have no idea how to rewrite 'OR' conditions with subqueries as joins
(exept using 'UNION' as writen end of my message)
Are you sure it is possible?

WHERE
(parent_id IN (SELECT tt.id FROM el_comment as tt WHERE tt.user_id=112 AND tt.status=1))
OR
(text_id IN (SELECT el_text.id FROM el_text WHERE el_text.user_id=112))

> -----Original Message-----
> From: Jim C. Nasby [mailto:jim@nasby.net]
>
> Try changing to a just a join and see if it works.
>
> On Mon, Sep 18, 2006 at 05:35:52PM +0400, Boguk Maxim wrote:
> > postgres version 8.1
> >
> > all tables fresh vacuumed/analyzed
> >
> > Problem table:
> >
> > somedb=# \d el_comment
> >                                                Table
> "public.el_comment"
> >       Column      |            Type             |
>                     Modifiers
> >
> ------------------+-----------------------------+-------------
> ----------------------------------------------------------
> >  id               | integer                     | not null
> default nextval(('public.documents_id_seq'::text)::regclass)
> >  user_id          | integer                     | not null
> >  text_id          | integer                     | not null
> >  status           | smallint                    | not null default 0
> >  parent_id        | integer                     |
> > Indexes:
> >     "el_comment_pkey" PRIMARY KEY, btree (id)
> >     "el_comment_parent_id" btree (parent_id)
> >     "el_comment_text" btree (text_id)
> >     "el_comment_user" btree (user_id)
> > Foreign-key constraints:
> >     "delete_el_text" FOREIGN KEY (text_id) REFERENCES
> el_text(id) ON DELETE CASCADE
> >
> > Problem query:
> >
> > somedb=# EXPLAIN ANALYZE SELECT count(*) FROM el_comment
> WHERE ((parent_id IN (SELECT tt.id FROM el_comment as tt
> WHERE tt.user_id = 112 AND tt.status=1)) OR (text_id IN
> (SELECT el_text.id FROM el_text WHERE el_text.user_id =
> 112))) AND status=1;
> >
>        QUERY PLAN
> >
> --------------------------------------------------------------
> --------------------------------------------------------------
> ---------------------
> >  Aggregate  (cost=80641.51..80641.52 rows=1 width=0)
> (actual time=13528.870..13528.873 rows=1 loops=1)
> >    ->  Seq Scan on el_comment  (cost=56.07..80352.97
> rows=1154156 width=0) (actual time=113.866..13528.705 rows=15 loops=1)
> >          Filter: ((status = 1) AND ((hashed subplan) OR
> (hashed subplan)))
> >          SubPlan
> >            ->  Index Scan using el_text_user on el_text
> (cost=0.00..15.92 rows=12 width=4) (actual time=0.992..82.397
> rows=12 loops=1)
> >                  Index Cond: (user_id = 112)
> >            ->  Index Scan using el_comment_user on
> el_comment tt  (cost=0.00..40.14 rows=28 width=4) (actual
> time=8.748..21.661 rows=14 loops=1)
> >                  Index Cond: (user_id = 112)
> >                  Filter: (status = 1)
> >  Total runtime: 13529.189 ms
> > (10 rows)
> >
> > Now lets look output of both subqueries:
> > SELECT tt.id FROM el_comment as tt WHERE tt.user_id = 112
> AND tt.status=1:
> >
> >  2766039
> >  2766057
> >  2244101
> >  1929350
> >  1929419
> >  1929439
> >  1490610
> >     1052
> >  2766033
> >  2421000
> >  2420878
> >   611328
> >     1019
> >     1646
> > (14 rows)
> >
> > and SELECT el_text.id FROM el_text WHERE el_text.user_id = 112
> >
> >  3758109
> >    53688
> >  1947631
> >  1915372
> >  1224421
> >  1011606
> >    13772
> >     1017
> >   463135
> >   470614
> >   575691
> >   916229
> > (12 rows)
> >
> > And put these values into query:
> >
> > planet=# EXPLAIN ANALYZE SELECT count(*) FROM el_comment
> WHERE ((parent_id IN
> (2766039,2766057,2244101,1929350,1929419,1929439,1490610,1052,
> 2766033,2421000,2420878,611328,1019,1646)) OR (text_id IN
> (3758109,53688,1947631,1915372,1224421,1011606,13772,1017,4631
> 35,470614,575691,916229))) AND status=1;
> >
> >                                                         QUERY PLAN
> >
> >  Aggregate  (cost=340.76..340.77 rows=1 width=0) (actual
> time=9.452..9.453 rows=1 loops=1)
> >    ->  Bitmap Heap Scan on el_comment  (cost=52.24..340.71
> rows=194 width=0) (actual time=5.431..9.269 rows=15 loops=1)
> >          Recheck Cond: ((parent_id = 2766039) OR (parent_id
> = 2766057) OR (parent_id = 2244101) OR (parent_id = 1929350)
> OR (parent_id = 1929419) OR (parent_id = 1929439) OR
> (parent_id = 1490610) OR (parent_id = 1052) OR (parent_id =
> 2766033) OR (parent_id = 2421000) OR (parent_id = 2420878) OR
> (parent_id = 611328) OR (parent_id = 1019) OR (parent_id =
> 1646) OR (text_id = 3758109) OR (text_id = 53688) OR (text_id
> = 1947631) OR (text_id = 1915372) OR (text_id = 1224421) OR
> (text_id = 1011606) OR (text_id = 13772) OR (text_id = 1017)
> OR (text_id = 463135) OR (text_id = 470614) OR (text_id =
> 575691) OR (text_id = 916229))
> >          Filter: (status = 1)
> >          ->  BitmapOr  (cost=52.24..52.24 rows=194 width=0)
> (actual time=4.972..4.972 rows=0 loops=1)
> >                ->  Bitmap Index Scan on
> el_comment_parent_id  (cost=0.00..2.00 rows=2 width=0)
> (actual time=0.582..0.582 rows=
> > 1 loops=1)
> >                      Index Cond: (parent_id = 2766039)
> > ....
> > 14 same rows
> > ....
> >
> >                ->  Bitmap Index Scan on el_comment_text
> (cost=0.00..2.02 rows=13 width=0) (actual time=0.983..0.983
> rows=0 loops=1)
> >                      Index Cond: (text_id = 3758109)
> > ....
> > 11 same rows
> > ....
> >
> >  Total runtime: 10.368 ms
> > (58 rows)
> >
> > Complete different result (1000x times faster). Issue look
> like planner can't/dont want try count both subquery's
> results and use bitmap scan.
> > And planner see amount of results from both subqueris small
> so bitmap scan must be look way better.
> > That is intended or bug?
> >
> > PS: i got reasonable fast results via rewrite query as
> > select count(*) from
> > (
> > select t1.id from el_comment as t1 join el_comment as t2 on
> t1.parent_id=t2.id and t2.user_id=112 and t2.status=1 where
> t1.status=1
> > union
> > select t1.id from el_comment as t1 join el_text as t2 on
> t1.text_id=t2.id and t2.user_id=112 and t2.status=1
> > ) as qqq;
> >
> > but that is just workaround and work 2-5x time slower.
> >
> >
> > SY Maxim Boguk
> >
> >
> > ---------------------------(end of
> broadcast)---------------------------
> > TIP 5: don't forget to increase your free space map settings
> >
>
> --
> Jim Nasby                                            jim@nasby.net
> EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
>

Re: Possible bug in planner (or planner not enough wise in some cases)

От
Jim Nasby
Дата:
On Sep 20, 2006, at 5:32 AM, Boguk Maxim wrote:

> Hm i have no idea how to rewrite 'OR' conditions with subqueries as
> joins
> (exept using 'UNION' as writen end of my message)
> Are you sure it is possible?
>
> WHERE
> (parent_id IN (SELECT tt.id FROM el_comment as tt WHERE
> tt.user_id=112 AND tt.status=1))
> OR
> (text_id IN (SELECT el_text.id FROM el_text WHERE
> el_text.user_id=112))
>>> somedb=# EXPLAIN ANALYZE SELECT count(*) FROM el_comment
>> WHERE ((parent_id IN (SELECT tt.id FROM el_comment as tt
>> WHERE tt.user_id = 112 AND tt.status=1)) OR (text_id IN
>> (SELECT el_text.id FROM el_text WHERE el_text.user_id =
>> 112))) AND status=1;

... FROM el_comment c, el_comment tt, el_text t
WHERE ( ( c.parent_id = tt.id AND tt.user_id = 112 AND tt.status = 1 )
    OR ( c.text_id = t.id AND  t.user_id = 112 ) ) AND c.status=1;

--
Jim Nasby                                    jimn@enterprisedb.com
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)