Обсуждение: potential performance gain by query planner optimization

Поиск
Список
Период
Сортировка

potential performance gain by query planner optimization

От
"Kneringer, Armin"
Дата:
Hi there.

I think I found a potential performance gain if the query planner would be optimized. All Tests has been performed with
8.4.1(and earlier versions) on CentOS 5.3 (x64) 

The following query will run on my database (~250 GB) for ca. 1600 seconds and the sort will result in a disk merge
deployingca. 200 GB of data to the local disk (ca. 180.000 tmp-files) 

explain SELECT DISTINCT t4.objid
FROM fscsubfile t4, cooobject t6
 NOT EXISTS (
 WHERE t6.objid = t4.objid AND
 t4.fileresporgid = 573936067464397682 AND
   NOT EXISTS (
   SELECT 1
   FROM ataggval q1_1,
   atdateval t5
   WHERE q1_1.objid = t4.objid AND
   q1_1.attrid = 281479288456451 AND
   q1_1.aggrid = 0 AND
   t5.aggrid = q1_1.aggval AND
   t5.objid = t4.objid AND
   t5.attrid = 281479288456447 ) AND
 ((t6.objclassid IN (285774255832590,285774255764301))) AND
 ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
 ORDER BY t4.objid;

                                                                                  QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique  (cost=2592137103.99..2592137104.00 rows=1 width=8)
   ->  Sort  (cost=2592137103.99..2592137104.00 rows=1 width=8)
         Sort Key: t4.objid
         ->  Nested Loop  (cost=1105592553.38..2592137103.98 rows=1 width=8)
               ->  Hash Anti Join  (cost=1105592553.38..2592137095.75 rows=1 width=8)
                     Hash Cond: ((t4.objid = q1_1.objid) AND (t4.objid = t5.objid))
                     ->  Bitmap Heap Scan on fscsubfile t4  (cost=154.42...14136.40 rows=5486 width=8)
                           Recheck Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid >
573936097512390656::bigint)AND (objid < 573936101807357952::bigint)) 
                           ->  Bitmap Index Scan on ind_fscsubfile_filerespons  (cost=0.00..153.05 rows=5486 width=0)
                                 Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid >
573936097512390656::bigint)AND (objid < 573936101807357952::bigint)) 
                     ->  Hash  (cost=11917516.57..11917516.57 rows=55006045159 width=16)
                           ->  Nested Loop  (cost=0.00..11917516.57 rows=55006045159 width=16)
                                 ->  Seq Scan on atdateval t5  (cost=0.00...294152.40 rows=1859934 width=12)
                                       Filter: (attrid = 281479288456447::bigint)
                                 ->  Index Scan using ind_ataggval on ataggval q1_1  (cost=0.00..6.20 rows=4 width=12)
                                       Index Cond: ((q1_1.attrid = 281479288456451::bigint) AND (q1_1.aggval =
t5.aggrid))
                                       Filter: (q1_1.aggrid = 0)
               ->  Index Scan using cooobjectix on cooobject t6  (cost=0.00..8.22 rows=1 width=8)
                     Index Cond: (t6.objid = t4.objid)
                     Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
(20 rows)


As the disks pace is limited on my test system I can't provide the "explain analyze" output
If I change the query as follows the query takes only 12 seconds and only needs 2 tmp files for sorting.
(Changed lines are marked with [!!!!!] as I don't know HTML-Mails will be delivered without conversion

explain SELECT DISTINCT t4.objid
FROM fscsubfile t4, cooobject t6
WHERE t6.objid = t4.objid AND
t4.fileresporgid = 573936067464397682 AND
   NOT EXISTS (
   SELECT 1
   FROM ataggval q1_1,
   atdateval t5
   WHERE q1_1.objid = t4.objid AND
   q1_1.attrid = 281479288456451 AND
   q1_1.aggrid = 0 AND
   t5.aggrid = q1_1.aggval AND
   t5.objid = q1_1.objid AND                 [!!!!!]
   t5.attrid = 281479288456447 ) AND
   ((t6.objclassid IN (285774255832590,285774255764301))) AND
   ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
 ORDER BY t4.objid;
                                                                            QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique  (cost=918320.29..971968.88 rows=1 width=8)
   ->  Nested Loop  (cost=918320.29..971968.88 rows=1 width=8)
         ->  Merge Anti Join  (cost=918320.29..971960.65 rows=1 width=8)
               Merge Cond: (t4.objid = q1_1.objid)
               ->  Index Scan using ind_fscsubfile_filerespons on fscsubfile t4  (cost=0.00..19016.05 rows=5486
width=8)
                     Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint)
AND(objid < 573936101807357952::bigint)) 
               ->  Materialize  (cost=912418.42..956599.36 rows=22689 width=8)
                     ->  Merge Join  (cost=912418.42..956372.47 rows=22689 width=8)
                           Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval))
                           ->  Sort  (cost=402024.80..406674.63 rows=1859934 width=12)
                                 Sort Key: t5.objid, t5.aggrid
                                 ->  Bitmap Heap Scan on atdateval t5  (cost=43749.07..176555.24 rows=1859934 width=12)
                                       Recheck Cond: (attrid = 281479288456447::bigint)
                                       ->  Bitmap Index Scan on ind_atdateval  (cost=0.00..43284.08 rows=1859934
width=0)
                                             Index Cond: (attrid = 281479288456447::bigint)
                           ->  Materialize  (cost=510392.25..531663.97 rows=1701738 width=12)
                                 ->  Sort  (cost=510392.25..514646.59 rows=1701738 width=12)
                                       Sort Key: q1_1.objid, q1_1.aggval
                                       ->  Bitmap Heap Scan on ataggval q1_1  (cost=44666.00..305189.47 rows=1701738
width=12)
                                             Recheck Cond: (attrid = 281479288456451::bigint)
                                             Filter: (aggrid = 0)
                                             ->  Bitmap Index Scan on ind_ataggval  (cost=0.00..44240.56 rows=1860698
width=0)
                                                   Index Cond: (attrid = 281479288456451::bigint)
         ->  Index Scan using cooobjectix on cooobject t6  (cost=0.00..8.22 rows=1 width=8)
               Index Cond: (t6.objid = t4.objid)
               Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
(26 rows)

explain analyze SELECT DISTINCT t4.objid
FROM fscsubfile t4, cooobject t6
WHERE t6.objid = t4.objid AND
t4.fileresporgid = 573936067464397682 AND
  NOT EXISTS (
  SELECT 1
  FROM ataggval q1_1,
  atdateval t5
  WHERE q1_1.objid = t4.objid AND
  q1_1.attrid = 281479288456451 AND
  q1_1.aggrid = 0 AND
  t5.aggrid = q1_1.aggval AND
  t5.objid = q1_1.objid AND                 [!!!!!]
  t5.attrid = 281479288456447 ) AND
((t6.objclassid IN (285774255832590,285774255764301))) AND
((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
ORDER BY t4.objid;
                                                                                     QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique  (cost=918320.29..971968.88 rows=1 width=8) (actual time=12079.598..12083.048 rows=64 loops=1)
   ->  Nested Loop  (cost=918320.29..971968.88 rows=1 width=8) (actual time=12079.594..12083.010 rows=64 loops=1)
         ->  Merge Anti Join  (cost=918320.29..971960.65 rows=1 width=8) (actual time=12037.524..12081.989 rows=108
loops=1)
               Merge Cond: (t4.objid = q1_1.objid)
               ->  Index Scan using ind_fscsubfile_filerespons on fscsubfile t4  (cost=0.00..19016.05 rows=5486
width=8)(actual time=0.073..83.498 rows=63436 loops=1) 
                     Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint)
AND(objid < 573936101807357952::bigint)) 
               ->  Materialize  (cost=912418.42..956599.36 rows=22689 width=8) (actual time=8866.253..11753.055
rows=1299685loops=1) 
                     ->  Merge Join  (cost=912418.42..956372.47 rows=22689 width=8) (actual time=8866.246..11413.397
rows=1299685loops=1) 
                           Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval))
                           ->  Sort  (cost=402024.80..406674.63 rows=1859934 width=12) (actual time=3133.362..3774.076
rows=1299685loops=1) 
                                 Sort Key: t5.objid, t5.aggrid
                                 Sort Method:  external merge  Disk: 47192kB
                                 ->  Bitmap Heap Scan on atdateval t5  (cost=43749.07..176555.24 rows=1859934 width=12)
(actualtime=282.454..1079.038 rows=1857906 loops=1) 
                                       Recheck Cond: (attrid = 281479288456447::bigint)
                                       ->  Bitmap Index Scan on ind_atdateval  (cost=0.00..43284.08 rows=1859934
width=0)(actual time=258.749...258.749 rows=1857906 loops=1) 
                                             Index Cond: (attrid = 281479288456447::bigint)
                           ->  Materialize  (cost=510392.25..531663.97 rows=1701738 width=12) (actual
time=5732.872..6683.784rows=1299685 loops=1) 
                                 ->  Sort  (cost=510392.25..514646.59 rows=1701738 width=12) (actual
time=5732.866..6387.188rows=1299685 loops=1) 
                                      Sort Key: q1_1.objid, q1_1.aggval
                                       Sort Method:  external merge  Disk: 39920kB
                                       ->  Bitmap Heap Scan on ataggval q1_1  (cost=44666.00..305189.47 rows=1701738
width=12)(actual time=1644.983..3634.044 rows=1857906 loops=1) 
                                             Recheck Cond: (attrid = 281479288456451::bigint)
                                             Filter: (aggrid = 0)
                                             ->  Bitmap Index Scan on ind_ataggval  (cost=0.00..44240.56 rows=1860698
width=0)(actual time=1606.325..1606.325 rows=1877336 loops=1) 
                                                   Index Cond: (attrid = 281479288456451::bigint)
         ->  Index Scan using cooobjectix on cooobject t6  (cost=0.00..8.22 rows=1 width=8) (actual time=0.009..0.009
rows=1loops=108) 
               Index Cond: (t6.objid = t4.objid)
               Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
Total runtime: 12108.663 ms
(29 rows)


Another way to optimize my query is to change it as follows:
(Once again changes are marked with [!!!!!]

explain SELECT DISTINCT t4.objid
FROM fscsubfile t4, cooobject t6
WHERE t6.objid = t4.objid AND
t4.fileresporgid = 573936067464397682 AND
   NOT EXISTS (
   SELECT 1
   FROM ataggval q1_1,
   atdateval t5
   WHERE q1_1.objid = t5.objid AND                 [!!!!!]
   q1_1.attrid = 281479288456451 AND
   q1_1.aggrid = 0 AND
   t5.aggrid = q1_1.aggval AND
   t5.objid = t4.objid AND
   t5.attrid = 281479288456447 ) AND
 ((t6.objclassid IN (285774255832590,285774255764301))) AND
 ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
 ORDER BY t4.objid;
                                                                            QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique  (cost=916978.86..969139.72 rows=1 width=8)
   ->  Nested Loop  (cost=916978.86..969139.72 rows=1 width=8)
         ->  Merge Anti Join  (cost=916978.86..969131.49 rows=1 width=8)
               Merge Cond: (t4.objid = t5.objid)
               ->  Index Scan using ind_fscsubfile_filerespons on fscsubfile t4  (cost=0.00..19016.05 rows=5486
width=8)
                     Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint)
AND(objid < 573936101807357952::bigint)) 
               ->  Materialize  (cost=912418.42..956599.36 rows=22689 width=8)
                     ->  Merge Join  (cost=912418.42..956372.47 rows=22689 width=8)
                           Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval))
                           ->  Sort  (cost=402024.80..406674.63 rows=1859934 width=12)
                                 Sort Key: t5.objid, t5.aggrid
                                 ->  Bitmap Heap Scan on atdateval t5  (cost=43749.07..176555.24 rows=1859934 width=12)
                                       Recheck Cond: (attrid = 281479288456447::bigint)
                                       ->  Bitmap Index Scan on ind_atdateval  (cost=0.00..43284.08 rows=1859934
width=0)
                                             Index Cond: (attrid = 281479288456447::bigint)
                           ->  Materialize  (cost=510392.25..531663.97 rows=1701738 width=12)
                                 ->  Sort  (cost=510392.25..514646.59 rows=1701738 width=12)
                                       Sort Key: q1_1.objid, q1_1.aggval
                                       ->  Bitmap Heap Scan on ataggval q1_1  (cost=44666.00..305189.47 rows=1701738
width=12)
                                             Recheck Cond: (attrid = 281479288456451::bigint)
                                             Filter: (aggrid = 0)
                                             ->  Bitmap Index Scan on ind_ataggval  (cost=0.00..44240.56 rows=1860698
width=0)
                                                   Index Cond: (attrid = 281479288456451::bigint)
         ->  Index Scan using cooobjectix on cooobject t6  (cost=0.00..8.22 rows=1 width=8)
               Index Cond: (t6.objid = t4.objid)
               Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
(26 rows)


explain analyze SELECT DISTINCT t4.objid
FROM fscsubfile t4, cooobject t6
WHERE t6.objid = t4.objid AND
t4.fileresporgid = 573936067464397682 AND
  NOT EXISTS (
  SELECT 1
  FROM ataggval q1_1,
  atdateval t5
  WHERE q1_1.objid = t5.objid AND                 [!!!!!]
  q1_1.attrid = 281479288456451 AND
  q1_1.aggrid = 0 AND
  t5.aggrid = q1_1.aggval AND
  t5.objid = t4.objid AND
  t5.attrid = 281479288456447 ) AND
((t6.objclassid IN (285774255832590,285774255764301))) AND
((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
ORDER BY t4.objid;
                                                                                     QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Unique  (cost=916978.86..969139.72 rows=1 width=8) (actual time=12102.964..12106.409 rows=64 loops=1)
   ->  Nested Loop  (cost=916978.86..969139.72 rows=1 width=8) (actual time=12102.959..12106.375 rows=64 loops=1)
         ->  Merge Anti Join  (cost=916978.86..969131.49 rows=1 width=8) (actual time=12060.916..12105.374 rows=108
loops=1)
               Merge Cond: (t4.objid = t5.objid)
               ->  Index Scan using ind_fscsubfile_filerespons on fscsubfile t4  (cost=0.00..19016.05 rows=5486
width=8)(actual time=0.080..81.397 rows=63436 loops=1) 
                     Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint)
AND(objid < 573936101807357952::bigint)) 
               ->  Materialize  (cost=912418.42..956599.36 rows=22689 width=8) (actual time=8874.492..11778.254
rows=1299685loops=1) 
                     ->  Merge Join  (cost=912418.42..956372.47 rows=22689 width=8) (actual time=8874.484..11437.175
rows=1299685loops=1) 
                           Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval))
                           ->  Sort  (cost=402024.80..406674.63 rows=1859934 width=12) (actual time=3117.555..3756.062
rows=1299685loops=1) 
                                 Sort Key: t5.objid, t5.aggrid
                                 Sort Method:  external merge  Disk: 39920kB
                                 ->  Bitmap Heap Scan on atdateval t5  (cost=43749.07..176555.24 rows=1859934 width=12)
(actualtime=289.475..1079.624 rows=1857906 loops=1) 
                                       Recheck Cond: (attrid = 281479288456447::bigint)
                                       ->  Bitmap Index Scan on ind_atdateval  (cost=0.00..43284.08 rows=1859934
width=0)(actual time=265.720...265.720 rows=1857906 loops=1) 
                                             Index Cond: (attrid = 281479288456447::bigint)
                           ->  Materialize  (cost=510392.25..531663.97 rows=1701738 width=12) (actual
time=5756.915..6707.864rows=1299685 loops=1) 
                                 ->  Sort  (cost=510392.25..514646.59 rows=1701738 width=12) (actual
time=5756.909..6409.819rows=1299685 loops=1) 
                                       Sort Key: q1_1.objid, q1_1.aggval
                                       Sort Method:  external merge  Disk: 39920kB
                                       ->  Bitmap Heap Scan on ataggval q1_1  (cost=44666.00..305189.47 rows=1701738
width=12)(actual time=1646.955..3628.918 rows=1857906 loops=1) 
                                             Recheck Cond: (attrid = 281479288456451::bigint)
                                             Filter: (aggrid = 0)
                                             ->  Bitmap Index Scan on ind_ataggval  (cost=0.00..44240.56 rows=1860698
width=0)(actual time=1608.233..1608.233 rows=1877336 loops=1) 
                                                   Index Cond: (attrid = 281479288456451::bigint)
         ->  Index Scan using cooobjectix on cooobject t6  (cost=0.00..8.22 rows=1 width=8) (actual time=0.008..0.009
rows=1loops=108) 
               Index Cond: (t6.objid = t4.objid)
               Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
Total runtime: 12129.613 ms
(29 rows)



As the query performs in roughly 12 seconds in both (changed) cases you might advise to change my queries :-)
(In fact we are working on this)
As the primary performance impact can also be reproduced in a small database (querytime > 1 minute) I checked this
issueon MS-SQL server and Oracle. On MSSQL server there is no difference in the execution plan if you change the query
anthe performance is well. Oralce shows a slightly difference but the performance is also well. 
As I mentioned we are looking forward to change our query but in my opinion there could be a general performance gain
ifthis issue is addressed. (especially if you don't know you run into this issue on the query performance is sufficient
enough)

greets
Armin

Re: potential performance gain by query planner optimization

От
Pavel Stehule
Дата:
Hello

2010/7/20 Kneringer, Armin <Armin.Kneringer@fabasoft.com>:
> Hi there.
>
> I think I found a potential performance gain if the query planner would be optimized. All Tests has been performed
with8.4.1 (and earlier versions) on CentOS 5.3 (x64) 
>
> The following query will run on my database (~250 GB) for ca. 1600 seconds and the sort will result in a disk merge
deployingca. 200 GB of data to the local disk (ca. 180.000 tmp-files) 

can you try show check explain with set enable_hashjoin to off;   ?

Regards

Pavel Stehule

>
> explain SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
>  NOT EXISTS (
>  WHERE t6.objid = t4.objid AND
>  t4.fileresporgid = 573936067464397682 AND
>   NOT EXISTS (
>   SELECT 1
>   FROM ataggval q1_1,
>   atdateval t5
>   WHERE q1_1.objid = t4.objid AND
>   q1_1.attrid = 281479288456451 AND
>   q1_1.aggrid = 0 AND
>   t5.aggrid = q1_1.aggval AND
>   t5.objid = t4.objid AND
>   t5.attrid = 281479288456447 ) AND
>  ((t6.objclassid IN (285774255832590,285774255764301))) AND
>  ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
>  ORDER BY t4.objid;
>
>                                                                                  QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unique  (cost=2592137103.99..2592137104.00 rows=1 width=8)
>   ->  Sort  (cost=2592137103.99..2592137104.00 rows=1 width=8)
>         Sort Key: t4.objid
>         ->  Nested Loop  (cost=1105592553.38..2592137103.98 rows=1 width=8)
>               ->  Hash Anti Join  (cost=1105592553.38..2592137095.75 rows=1 width=8)
>                     Hash Cond: ((t4.objid = q1_1.objid) AND (t4.objid = t5.objid))
>                     ->  Bitmap Heap Scan on fscsubfile t4  (cost=154.42...14136.40 rows=5486 width=8)
>                           Recheck Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid >
573936097512390656::bigint)AND (objid < 573936101807357952::bigint)) 
>                           ->  Bitmap Index Scan on ind_fscsubfile_filerespons  (cost=0.00..153.05 rows=5486 width=0)
>                                 Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid >
573936097512390656::bigint)AND (objid < 573936101807357952::bigint)) 
>                     ->  Hash  (cost=11917516.57..11917516.57 rows=55006045159 width=16)
>                           ->  Nested Loop  (cost=0.00..11917516.57 rows=55006045159 width=16)
>                                 ->  Seq Scan on atdateval t5  (cost=0.00...294152.40 rows=1859934 width=12)
>                                       Filter: (attrid = 281479288456447::bigint)
>                                 ->  Index Scan using ind_ataggval on ataggval q1_1  (cost=0.00..6.20 rows=4 width=12)
>                                       Index Cond: ((q1_1.attrid = 281479288456451::bigint) AND (q1_1.aggval =
t5.aggrid))
>                                       Filter: (q1_1.aggrid = 0)
>               ->  Index Scan using cooobjectix on cooobject t6  (cost=0.00..8.22 rows=1 width=8)
>                     Index Cond: (t6.objid = t4.objid)
>                     Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
> (20 rows)
>
>
> As the disks pace is limited on my test system I can't provide the "explain analyze" output
> If I change the query as follows the query takes only 12 seconds and only needs 2 tmp files for sorting.
> (Changed lines are marked with [!!!!!] as I don't know HTML-Mails will be delivered without conversion
>
> explain SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
> WHERE t6.objid = t4.objid AND
> t4.fileresporgid = 573936067464397682 AND
>   NOT EXISTS (
>   SELECT 1
>   FROM ataggval q1_1,
>   atdateval t5
>   WHERE q1_1.objid = t4.objid AND
>   q1_1.attrid = 281479288456451 AND
>   q1_1.aggrid = 0 AND
>   t5.aggrid = q1_1.aggval AND
>   t5.objid = q1_1.objid AND                 [!!!!!]
>   t5.attrid = 281479288456447 ) AND
>   ((t6.objclassid IN (285774255832590,285774255764301))) AND
>   ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
>  ORDER BY t4.objid;
>                                                                            QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unique  (cost=918320.29..971968.88 rows=1 width=8)
>   ->  Nested Loop  (cost=918320.29..971968.88 rows=1 width=8)
>         ->  Merge Anti Join  (cost=918320.29..971960.65 rows=1 width=8)
>               Merge Cond: (t4.objid = q1_1.objid)
>               ->  Index Scan using ind_fscsubfile_filerespons on fscsubfile t4  (cost=0.00..19016.05 rows=5486
width=8)
>                     Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid >
573936097512390656::bigint)AND (objid < 573936101807357952::bigint)) 
>               ->  Materialize  (cost=912418.42..956599.36 rows=22689 width=8)
>                     ->  Merge Join  (cost=912418.42..956372.47 rows=22689 width=8)
>                           Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval))
>                           ->  Sort  (cost=402024.80..406674.63 rows=1859934 width=12)
>                                 Sort Key: t5.objid, t5.aggrid
>                                 ->  Bitmap Heap Scan on atdateval t5  (cost=43749.07..176555.24 rows=1859934
width=12)
>                                       Recheck Cond: (attrid = 281479288456447::bigint)
>                                       ->  Bitmap Index Scan on ind_atdateval  (cost=0.00..43284.08 rows=1859934
width=0)
>                                             Index Cond: (attrid = 281479288456447::bigint)
>                           ->  Materialize  (cost=510392.25..531663.97 rows=1701738 width=12)
>                                 ->  Sort  (cost=510392.25..514646.59 rows=1701738 width=12)
>                                       Sort Key: q1_1.objid, q1_1.aggval
>                                       ->  Bitmap Heap Scan on ataggval q1_1  (cost=44666.00..305189.47 rows=1701738
width=12)
>                                             Recheck Cond: (attrid = 281479288456451::bigint)
>                                             Filter: (aggrid = 0)
>                                             ->  Bitmap Index Scan on ind_ataggval  (cost=0.00..44240.56 rows=1860698
width=0)
>                                                   Index Cond: (attrid = 281479288456451::bigint)
>         ->  Index Scan using cooobjectix on cooobject t6  (cost=0.00..8.22 rows=1 width=8)
>               Index Cond: (t6.objid = t4.objid)
>               Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
> (26 rows)
>
> explain analyze SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
> WHERE t6.objid = t4.objid AND
> t4.fileresporgid = 573936067464397682 AND
>  NOT EXISTS (
>  SELECT 1
>  FROM ataggval q1_1,
>  atdateval t5
>  WHERE q1_1.objid = t4.objid AND
>  q1_1.attrid = 281479288456451 AND
>  q1_1.aggrid = 0 AND
>  t5.aggrid = q1_1.aggval AND
>  t5.objid = q1_1.objid AND                 [!!!!!]
>  t5.attrid = 281479288456447 ) AND
> ((t6.objclassid IN (285774255832590,285774255764301))) AND
> ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
> ORDER BY t4.objid;
>                                                                                     QUERY PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unique  (cost=918320.29..971968.88 rows=1 width=8) (actual time=12079.598..12083.048 rows=64 loops=1)
>   ->  Nested Loop  (cost=918320.29..971968.88 rows=1 width=8) (actual time=12079.594..12083.010 rows=64 loops=1)
>         ->  Merge Anti Join  (cost=918320.29..971960.65 rows=1 width=8) (actual time=12037.524..12081.989 rows=108
loops=1)
>               Merge Cond: (t4.objid = q1_1.objid)
>               ->  Index Scan using ind_fscsubfile_filerespons on fscsubfile t4  (cost=0.00..19016.05 rows=5486
width=8)(actual time=0.073..83.498 rows=63436 loops=1) 
>                     Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid >
573936097512390656::bigint)AND (objid < 573936101807357952::bigint)) 
>               ->  Materialize  (cost=912418.42..956599.36 rows=22689 width=8) (actual time=8866.253..11753.055
rows=1299685loops=1) 
>                     ->  Merge Join  (cost=912418.42..956372.47 rows=22689 width=8) (actual time=8866.246..11413.397
rows=1299685loops=1) 
>                           Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval))
>                           ->  Sort  (cost=402024.80..406674.63 rows=1859934 width=12) (actual time=3133.362..3774.076
rows=1299685loops=1) 
>                                 Sort Key: t5.objid, t5.aggrid
>                                 Sort Method:  external merge  Disk: 47192kB
>                                 ->  Bitmap Heap Scan on atdateval t5  (cost=43749.07..176555.24 rows=1859934
width=12)(actual time=282.454..1079.038 rows=1857906 loops=1) 
>                                       Recheck Cond: (attrid = 281479288456447::bigint)
>                                       ->  Bitmap Index Scan on ind_atdateval  (cost=0.00..43284.08 rows=1859934
width=0)(actual time=258.749...258.749 rows=1857906 loops=1) 
>                                             Index Cond: (attrid = 281479288456447::bigint)
>                           ->  Materialize  (cost=510392.25..531663.97 rows=1701738 width=12) (actual
time=5732.872..6683.784rows=1299685 loops=1) 
>                                 ->  Sort  (cost=510392.25..514646.59 rows=1701738 width=12) (actual
time=5732.866..6387.188rows=1299685 loops=1) 
>                                      Sort Key: q1_1.objid, q1_1.aggval
>                                       Sort Method:  external merge  Disk: 39920kB
>                                       ->  Bitmap Heap Scan on ataggval q1_1  (cost=44666.00..305189.47 rows=1701738
width=12)(actual time=1644.983..3634.044 rows=1857906 loops=1) 
>                                             Recheck Cond: (attrid = 281479288456451::bigint)
>                                             Filter: (aggrid = 0)
>                                             ->  Bitmap Index Scan on ind_ataggval  (cost=0.00..44240.56 rows=1860698
width=0)(actual time=1606.325..1606.325 rows=1877336 loops=1) 
>                                                   Index Cond: (attrid = 281479288456451::bigint)
>         ->  Index Scan using cooobjectix on cooobject t6  (cost=0.00..8.22 rows=1 width=8) (actual time=0.009..0.009
rows=1loops=108) 
>               Index Cond: (t6.objid = t4.objid)
>               Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
> Total runtime: 12108.663 ms
> (29 rows)
>
>
> Another way to optimize my query is to change it as follows:
> (Once again changes are marked with [!!!!!]
>
> explain SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
> WHERE t6.objid = t4.objid AND
> t4.fileresporgid = 573936067464397682 AND
>   NOT EXISTS (
>   SELECT 1
>   FROM ataggval q1_1,
>   atdateval t5
>   WHERE q1_1.objid = t5.objid AND                 [!!!!!]
>   q1_1.attrid = 281479288456451 AND
>   q1_1.aggrid = 0 AND
>   t5.aggrid = q1_1.aggval AND
>   t5.objid = t4.objid AND
>   t5.attrid = 281479288456447 ) AND
>  ((t6.objclassid IN (285774255832590,285774255764301))) AND
>  ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
>  ORDER BY t4.objid;
>                                                                            QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unique  (cost=916978.86..969139.72 rows=1 width=8)
>   ->  Nested Loop  (cost=916978.86..969139.72 rows=1 width=8)
>         ->  Merge Anti Join  (cost=916978.86..969131.49 rows=1 width=8)
>               Merge Cond: (t4.objid = t5.objid)
>               ->  Index Scan using ind_fscsubfile_filerespons on fscsubfile t4  (cost=0.00..19016.05 rows=5486
width=8)
>                     Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid >
573936097512390656::bigint)AND (objid < 573936101807357952::bigint)) 
>               ->  Materialize  (cost=912418.42..956599.36 rows=22689 width=8)
>                     ->  Merge Join  (cost=912418.42..956372.47 rows=22689 width=8)
>                           Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval))
>                           ->  Sort  (cost=402024.80..406674.63 rows=1859934 width=12)
>                                 Sort Key: t5.objid, t5.aggrid
>                                 ->  Bitmap Heap Scan on atdateval t5  (cost=43749.07..176555.24 rows=1859934
width=12)
>                                       Recheck Cond: (attrid = 281479288456447::bigint)
>                                       ->  Bitmap Index Scan on ind_atdateval  (cost=0.00..43284.08 rows=1859934
width=0)
>                                             Index Cond: (attrid = 281479288456447::bigint)
>                           ->  Materialize  (cost=510392.25..531663.97 rows=1701738 width=12)
>                                 ->  Sort  (cost=510392.25..514646.59 rows=1701738 width=12)
>                                       Sort Key: q1_1.objid, q1_1.aggval
>                                       ->  Bitmap Heap Scan on ataggval q1_1  (cost=44666.00..305189.47 rows=1701738
width=12)
>                                             Recheck Cond: (attrid = 281479288456451::bigint)
>                                             Filter: (aggrid = 0)
>                                             ->  Bitmap Index Scan on ind_ataggval  (cost=0.00..44240.56 rows=1860698
width=0)
>                                                   Index Cond: (attrid = 281479288456451::bigint)
>         ->  Index Scan using cooobjectix on cooobject t6  (cost=0.00..8.22 rows=1 width=8)
>               Index Cond: (t6.objid = t4.objid)
>               Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
> (26 rows)
>
>
> explain analyze SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
> WHERE t6.objid = t4.objid AND
> t4.fileresporgid = 573936067464397682 AND
>  NOT EXISTS (
>  SELECT 1
>  FROM ataggval q1_1,
>  atdateval t5
>  WHERE q1_1.objid = t5.objid AND                 [!!!!!]
>  q1_1.attrid = 281479288456451 AND
>  q1_1.aggrid = 0 AND
>  t5.aggrid = q1_1.aggval AND
>  t5.objid = t4.objid AND
>  t5.attrid = 281479288456447 ) AND
> ((t6.objclassid IN (285774255832590,285774255764301))) AND
> ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
> ORDER BY t4.objid;
>                                                                                     QUERY PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Unique  (cost=916978.86..969139.72 rows=1 width=8) (actual time=12102.964..12106.409 rows=64 loops=1)
>   ->  Nested Loop  (cost=916978.86..969139.72 rows=1 width=8) (actual time=12102.959..12106.375 rows=64 loops=1)
>         ->  Merge Anti Join  (cost=916978.86..969131.49 rows=1 width=8) (actual time=12060.916..12105.374 rows=108
loops=1)
>               Merge Cond: (t4.objid = t5.objid)
>               ->  Index Scan using ind_fscsubfile_filerespons on fscsubfile t4  (cost=0.00..19016.05 rows=5486
width=8)(actual time=0.080..81.397 rows=63436 loops=1) 
>                     Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid >
573936097512390656::bigint)AND (objid < 573936101807357952::bigint)) 
>               ->  Materialize  (cost=912418.42..956599.36 rows=22689 width=8) (actual time=8874.492..11778.254
rows=1299685loops=1) 
>                     ->  Merge Join  (cost=912418.42..956372.47 rows=22689 width=8) (actual time=8874.484..11437.175
rows=1299685loops=1) 
>                           Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval))
>                           ->  Sort  (cost=402024.80..406674.63 rows=1859934 width=12) (actual time=3117.555..3756.062
rows=1299685loops=1) 
>                                 Sort Key: t5.objid, t5.aggrid
>                                 Sort Method:  external merge  Disk: 39920kB
>                                 ->  Bitmap Heap Scan on atdateval t5  (cost=43749.07..176555.24 rows=1859934
width=12)(actual time=289.475..1079.624 rows=1857906 loops=1) 
>                                       Recheck Cond: (attrid = 281479288456447::bigint)
>                                       ->  Bitmap Index Scan on ind_atdateval  (cost=0.00..43284.08 rows=1859934
width=0)(actual time=265.720...265.720 rows=1857906 loops=1) 
>                                             Index Cond: (attrid = 281479288456447::bigint)
>                           ->  Materialize  (cost=510392.25..531663.97 rows=1701738 width=12) (actual
time=5756.915..6707.864rows=1299685 loops=1) 
>                                 ->  Sort  (cost=510392.25..514646.59 rows=1701738 width=12) (actual
time=5756.909..6409.819rows=1299685 loops=1) 
>                                       Sort Key: q1_1.objid, q1_1.aggval
>                                       Sort Method:  external merge  Disk: 39920kB
>                                       ->  Bitmap Heap Scan on ataggval q1_1  (cost=44666.00..305189.47 rows=1701738
width=12)(actual time=1646.955..3628.918 rows=1857906 loops=1) 
>                                             Recheck Cond: (attrid = 281479288456451::bigint)
>                                             Filter: (aggrid = 0)
>                                             ->  Bitmap Index Scan on ind_ataggval  (cost=0.00..44240.56 rows=1860698
width=0)(actual time=1608.233..1608.233 rows=1877336 loops=1) 
>                                                   Index Cond: (attrid = 281479288456451::bigint)
>         ->  Index Scan using cooobjectix on cooobject t6  (cost=0.00..8.22 rows=1 width=8) (actual time=0.008..0.009
rows=1loops=108) 
>               Index Cond: (t6.objid = t4.objid)
>               Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
> Total runtime: 12129.613 ms
> (29 rows)
>
>
>
> As the query performs in roughly 12 seconds in both (changed) cases you might advise to change my queries :-)
> (In fact we are working on this)
> As the primary performance impact can also be reproduced in a small database (querytime > 1 minute) I checked this
issueon MS-SQL server and Oracle. On MSSQL server there is no difference in the execution plan if you change the query
anthe performance is well. Oralce shows a slightly difference but the performance is also well. 
> As I mentioned we are looking forward to change our query but in my opinion there could be a general performance gain
ifthis issue is addressed. (especially if you don't know you run into this issue on the query performance is sufficient
enough)
>
> greets
> Armin
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

Re: potential performance gain by query planner optimization

От
"Kneringer, Armin"
Дата:
Hi Pavel,

Turning hashjoin off also does the trick. (the explain output is below)
My basic intention was to check if the query planner could be optmized to automatically improve the query processing.
In this case all users (espacially those which are not be aware of ineffective query processing e.g. due their own
queries)might profit by faster query execution. 
This is just a thought (or suggestion) for further enhancement. Evt. it will be added to the project backlog.

kind regards
Armin

For reasons of completeness the eplain output with "hashjoin off":


# explain analyze SELECT DISTINCT t4.objid FROM fscsubfile t4, cooobject t6 WHERE t6.objid = t4.objid AND
t4.fileresporgid= 573936067464397682 AND NOT EXISTS (SELECT 1 FROM ataggval q1_1, atdateval t5 WHERE q1_1.objid =
t4.objidAND q1_1.attrid = 281479288456451 AND q1_1.aggrid = 0 AND t5.aggrid = q1_1.aggval AND t5.objid = t4.objid AND
t5.attrid= 281479288456447 ) AND ( (t6.objclassid IN (285774255832590,285774255764301))) AND ((t4.objid >
573936097512390656and t4.objid < 573936101807357952)) ORDER BY t4.objid; 
                                                                                  QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=13639921632.59..14512729357.51 rows=1 width=8) (actual time=14172.154..14468.435 rows=64 loops=1)
   ->  Nested Loop  (cost=13639921632.59..14512729357.51 rows=1 width=8) (actual time=14172.148..14468.364 rows=64
loops=1)
         ->  Merge Anti Join  (cost=13639921632.59..14512729349.28 rows=1 width=8) (actual time=14092.764..14108.850
rows=108loops=1) 
               Merge Cond: ((t4.objid = q1_1.objid) AND (t4.objid = t5.objid))
               ->  Sort  (cost=14477.12..14490.83 rows=5486 width=8) (actual time=100.070..109.200 rows=63436 loops=1)
                     Sort Key: t4.objid
                     Sort Method:  quicksort  Memory: 4510kB
                     ->  Bitmap Heap Scan on fscsubfile t4  (cost=154.42...14136.40 rows=5486 width=8) (actual
time=14.645..54.176rows=63436 loops=1) 
                           Recheck Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid >
573936097512390656::bigint)AND (objid < 573936101807357952::bigint)) 
                           ->  Bitmap Index Scan on ind_fscsubfile_filerespons  (cost=0.00..153.05 rows=5486 width=0)
(actualtime=11.438..11.438 rows=63436 loops=1) 
                                 Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid >
573936097512390656::bigint)AND (objid < 573936101807357952::bigint)) 
               ->  Materialize  (cost=13584864369.09..14272439933.58 rows=55006045159 width=16) (actual
time=12914.166..13699.719rows=1299867 loops=1) 
                     ->  Sort  (cost=13584864369.09..13722379481.99 rows=55006045159 width=16) (actual
time=12914.153..13411.554rows=1299867 loops=1) 
                           Sort Key: q1_1.objid, t5.objid
                           Sort Method:  external merge  Disk: 47192kB
                           ->  Nested Loop  (cost=0.00..11917516.57 rows=55006045159 width=16) (actual
time=0.621..10505.130rows=1858326 loops=1) 
                                 ->  Seq Scan on atdateval t5  (cost=0.00...294152.40 rows=1859934 width=12) (actual
time=0.593..1870.182rows=1857906 loops=1) 
                                       Filter: (attrid = 281479288456447::bigint)
                                 ->  Index Scan using ind_ataggval on ataggval q1_1  (cost=0.00..6.20 rows=4 width=12)
(actualtime=0.004..0.004 rows=1 loops=1857906) 
                                       Index Cond: ((q1_1.attrid = 281479288456451::bigint) AND (q1_1.aggval =
t5.aggrid))
                                       Filter: (q1_1.aggrid = 0)
         ->  Index Scan using cooobjectix on cooobject t6  (cost=0.00..8.22 rows=1 width=8) (actual time=3.327..3.328
rows=1loops=108) 
               Index Cond: (t6.objid = t4.objid)
               Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
 Total runtime: 14487.434 ms






-----Original Message-----
From: Pavel Stehule [mailto:pavel.stehule@gmail.com]
Sent: Dienstag, 20. Juli 2010 21:39
To: Kneringer, Armin
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] potential performance gain by query planner optimization

Hello

2010/7/20 Kneringer, Armin <Armin.Kneringer@fabasoft.com>:
> Hi there.
>
> I think I found a potential performance gain if the query planner
> would be optimized. All Tests has been performed with 8.4.1 (and
> earlier versions) on CentOS 5.3 (x64)
>
> The following query will run on my database (~250 GB) for ca. 1600
> seconds and the sort will result in a disk merge deploying ca. 200 GB
> of data to the local disk (ca. 180.000 tmp-files)

can you try show check explain with set enable_hashjoin to off;   ?

Regards

Pavel Stehule

>
> explain SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
>  NOT EXISTS (
>  WHERE t6.objid = t4.objid AND
>  t4.fileresporgid = 573936067464397682 AND
>   NOT EXISTS (
>   SELECT 1
>   FROM ataggval q1_1,
>   atdateval t5
>   WHERE q1_1.objid = t4.objid AND
>   q1_1.attrid = 281479288456451 AND
>   q1_1.aggrid = 0 AND
>   t5.aggrid = q1_1.aggval AND
>   t5.objid = t4.objid AND
>   t5.attrid = 281479288456447 ) AND
>  ((t6.objclassid IN (285774255832590,285774255764301))) AND
>  ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
>  ORDER BY t4.objid;
>
>
> QUERY PLAN
> ----------------------------------------------------------------------
> ----------------------------------------------------------------------
> ----------------------------------
> Unique  (cost=2592137103.99..2592137104.00 rows=1 width=8)
>   ->  Sort  (cost=2592137103.99..2592137104.00 rows=1 width=8)
>         Sort Key: t4.objid
>         ->  Nested Loop  (cost=1105592553.38..2592137103.98 rows=1
> width=8)
>               ->  Hash Anti Join  (cost=1105592553.38..2592137095.75
> rows=1 width=8)
>                     Hash Cond: ((t4.objid = q1_1.objid) AND (t4.objid
> = t5.objid))
>                     ->  Bitmap Heap Scan on fscsubfile t4
> (cost=154.42...14136.40 rows=5486 width=8)
>                           Recheck Cond: ((fileresporgid =
> 573936067464397682::bigint) AND (objid > 573936097512390656::bigint)
> AND (objid < 573936101807357952::bigint))
>                           ->  Bitmap Index Scan on
> ind_fscsubfile_filerespons  (cost=0.00..153.05 rows=5486 width=0)
>                                 Index Cond: ((fileresporgid =
> 573936067464397682::bigint) AND (objid > 573936097512390656::bigint)
> AND (objid < 573936101807357952::bigint))
>                     ->  Hash  (cost=11917516.57..11917516.57
> rows=55006045159 width=16)
>                           ->  Nested Loop  (cost=0.00..11917516.57
> rows=55006045159 width=16)
>                                 ->  Seq Scan on atdateval t5
> (cost=0.00...294152.40 rows=1859934 width=12)
>                                       Filter: (attrid =
> 281479288456447::bigint)
>                                 ->  Index Scan using ind_ataggval on
> ataggval q1_1  (cost=0.00..6.20 rows=4 width=12)
>                                       Index Cond: ((q1_1.attrid =
> 281479288456451::bigint) AND (q1_1.aggval = t5.aggrid))
>                                       Filter: (q1_1.aggrid = 0)
>               ->  Index Scan using cooobjectix on cooobject t6
> (cost=0.00..8.22 rows=1 width=8)
>                     Index Cond: (t6.objid = t4.objid)
>                     Filter: (t6.objclassid = ANY
> ('{285774255832590,285774255764301}'::bigint[]))
> (20 rows)
>
>
> As the disks pace is limited on my test system I can't provide the
> "explain analyze" output If I change the query as follows the query takes only 12 seconds and only needs 2 tmp files
forsorting. 
> (Changed lines are marked with [!!!!!] as I don't know HTML-Mails will
> be delivered without conversion
>
> explain SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
> WHERE t6.objid = t4.objid AND
> t4.fileresporgid = 573936067464397682 AND
>   NOT EXISTS (
>   SELECT 1
>   FROM ataggval q1_1,
>   atdateval t5
>   WHERE q1_1.objid = t4.objid AND
>   q1_1.attrid = 281479288456451 AND
>   q1_1.aggrid = 0 AND
>   t5.aggrid = q1_1.aggval AND
>   t5.objid = q1_1.objid AND                 [!!!!!]
>   t5.attrid = 281479288456447 ) AND
>   ((t6.objclassid IN (285774255832590,285774255764301))) AND
>   ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
>  ORDER BY t4.objid;
>
> QUERY PLAN
> ----------------------------------------------------------------------
> ----------------------------------------------------------------------
> ---------------------- Unique  (cost=918320.29..971968.88 rows=1
> width=8)
>   ->  Nested Loop  (cost=918320.29..971968.88 rows=1 width=8)
>         ->  Merge Anti Join  (cost=918320.29..971960.65 rows=1
> width=8)
>               Merge Cond: (t4.objid = q1_1.objid)
>               ->  Index Scan using ind_fscsubfile_filerespons on
> fscsubfile t4  (cost=0.00..19016.05 rows=5486 width=8)
>                     Index Cond: ((fileresporgid =
> 573936067464397682::bigint) AND (objid > 573936097512390656::bigint)
> AND (objid < 573936101807357952::bigint))
>               ->  Materialize  (cost=912418.42..956599.36 rows=22689
> width=8)
>                     ->  Merge Join  (cost=912418.42..956372.47
> rows=22689 width=8)
>                           Merge Cond: ((t5.objid = q1_1.objid) AND
> (t5.aggrid = q1_1.aggval))
>                           ->  Sort  (cost=402024.80..406674.63
> rows=1859934 width=12)
>                                 Sort Key: t5.objid, t5.aggrid
>                                 ->  Bitmap Heap Scan on atdateval t5
> (cost=43749.07..176555.24 rows=1859934 width=12)
>                                       Recheck Cond: (attrid =
> 281479288456447::bigint)
>                                       ->  Bitmap Index Scan on
> ind_atdateval  (cost=0.00..43284.08 rows=1859934 width=0)
>                                             Index Cond: (attrid =
> 281479288456447::bigint)
>                           ->  Materialize  (cost=510392.25..531663.97
> rows=1701738 width=12)
>                                 ->  Sort  (cost=510392.25..514646.59
> rows=1701738 width=12)
>                                       Sort Key: q1_1.objid,
> q1_1.aggval
>                                       ->  Bitmap Heap Scan on ataggval
> q1_1  (cost=44666.00..305189.47 rows=1701738 width=12)
>                                             Recheck Cond: (attrid =
> 281479288456451::bigint)
>                                             Filter: (aggrid = 0)
>                                             ->  Bitmap Index Scan on
> ind_ataggval  (cost=0.00..44240.56 rows=1860698 width=0)
>                                                   Index Cond: (attrid
> = 281479288456451::bigint)
>         ->  Index Scan using cooobjectix on cooobject t6
> (cost=0.00..8.22 rows=1 width=8)
>               Index Cond: (t6.objid = t4.objid)
>               Filter: (t6.objclassid = ANY
> ('{285774255832590,285774255764301}'::bigint[]))
> (26 rows)
>
> explain analyze SELECT DISTINCT t4.objid FROM fscsubfile t4, cooobject
> t6 WHERE t6.objid = t4.objid AND t4.fileresporgid = 573936067464397682
> AND
>  NOT EXISTS (
>  SELECT 1
>  FROM ataggval q1_1,
>  atdateval t5
>  WHERE q1_1.objid = t4.objid AND
>  q1_1.attrid = 281479288456451 AND
>  q1_1.aggrid = 0 AND
>  t5.aggrid = q1_1.aggval AND
>  t5.objid = q1_1.objid AND                 [!!!!!]
>  t5.attrid = 281479288456447 ) AND
> ((t6.objclassid IN (285774255832590,285774255764301))) AND ((t4.objid
> > 573936097512390656 and t4.objid < 573936101807357952)) ORDER BY
> t4.objid;
>
> QUERY PLAN
> ----------------------------------------------------------------------
> ----------------------------------------------------------------------
> -----------------------------------
> Unique  (cost=918320.29..971968.88 rows=1 width=8) (actual
> time=12079.598..12083.048 rows=64 loops=1)
>   ->  Nested Loop  (cost=918320.29..971968.88 rows=1 width=8) (actual
> time=12079.594..12083.010 rows=64 loops=1)
>         ->  Merge Anti Join  (cost=918320.29..971960.65 rows=1
> width=8) (actual time=12037.524..12081.989 rows=108 loops=1)
>               Merge Cond: (t4.objid = q1_1.objid)
>               ->  Index Scan using ind_fscsubfile_filerespons on
> fscsubfile t4  (cost=0.00..19016.05 rows=5486 width=8) (actual
> time=0.073..83.498 rows=63436 loops=1)
>                     Index Cond: ((fileresporgid =
> 573936067464397682::bigint) AND (objid > 573936097512390656::bigint)
> AND (objid < 573936101807357952::bigint))
>               ->  Materialize  (cost=912418.42..956599.36 rows=22689
> width=8) (actual time=8866.253..11753.055 rows=1299685 loops=1)
>                     ->  Merge Join  (cost=912418.42..956372.47
> rows=22689 width=8) (actual time=8866.246..11413.397 rows=1299685
> loops=1)
>                           Merge Cond: ((t5.objid = q1_1.objid) AND
> (t5.aggrid = q1_1.aggval))
>                           ->  Sort  (cost=402024.80..406674.63
> rows=1859934 width=12) (actual time=3133.362..3774.076 rows=1299685
> loops=1)
>                                 Sort Key: t5.objid, t5.aggrid
>                                 Sort Method:  external merge  Disk:
> 47192kB
>                                 ->  Bitmap Heap Scan on atdateval t5
> (cost=43749.07..176555.24 rows=1859934 width=12) (actual
> time=282.454..1079.038 rows=1857906 loops=1)
>                                       Recheck Cond: (attrid =
> 281479288456447::bigint)
>                                       ->  Bitmap Index Scan on
> ind_atdateval  (cost=0.00..43284.08 rows=1859934 width=0) (actual
> time=258.749...258.749 rows=1857906 loops=1)
>                                             Index Cond: (attrid =
> 281479288456447::bigint)
>                           ->  Materialize  (cost=510392.25..531663.97
> rows=1701738 width=12) (actual time=5732.872..6683.784 rows=1299685
> loops=1)
>                                 ->  Sort  (cost=510392.25..514646.59
> rows=1701738 width=12) (actual time=5732.866..6387.188 rows=1299685
> loops=1)
>                                      Sort Key: q1_1.objid, q1_1.aggval
>                                       Sort Method:  external merge
> Disk: 39920kB
>                                       ->  Bitmap Heap Scan on ataggval
> q1_1  (cost=44666.00..305189.47 rows=1701738 width=12) (actual
> time=1644.983..3634.044 rows=1857906 loops=1)
>                                             Recheck Cond: (attrid =
> 281479288456451::bigint)
>                                             Filter: (aggrid = 0)
>                                             ->  Bitmap Index Scan on
> ind_ataggval  (cost=0.00..44240.56 rows=1860698 width=0) (actual
> time=1606.325..1606.325 rows=1877336 loops=1)
>                                                   Index Cond: (attrid
> = 281479288456451::bigint)
>         ->  Index Scan using cooobjectix on cooobject t6
> (cost=0.00..8.22 rows=1 width=8) (actual time=0.009..0.009 rows=1
> loops=108)
>               Index Cond: (t6.objid = t4.objid)
>               Filter: (t6.objclassid = ANY
> ('{285774255832590,285774255764301}'::bigint[]))
> Total runtime: 12108.663 ms
> (29 rows)
>
>
> Another way to optimize my query is to change it as follows:
> (Once again changes are marked with [!!!!!]
>
> explain SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
> WHERE t6.objid = t4.objid AND
> t4.fileresporgid = 573936067464397682 AND
>   NOT EXISTS (
>   SELECT 1
>   FROM ataggval q1_1,
>   atdateval t5
>   WHERE q1_1.objid = t5.objid AND                 [!!!!!]
>   q1_1.attrid = 281479288456451 AND
>   q1_1.aggrid = 0 AND
>   t5.aggrid = q1_1.aggval AND
>   t5.objid = t4.objid AND
>   t5.attrid = 281479288456447 ) AND
>  ((t6.objclassid IN (285774255832590,285774255764301))) AND
>  ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
>  ORDER BY t4.objid;
>
> QUERY PLAN
> ----------------------------------------------------------------------
> ----------------------------------------------------------------------
> ---------------------- Unique  (cost=916978.86..969139.72 rows=1
> width=8)
>   ->  Nested Loop  (cost=916978.86..969139.72 rows=1 width=8)
>         ->  Merge Anti Join  (cost=916978.86..969131.49 rows=1
> width=8)
>               Merge Cond: (t4.objid = t5.objid)
>               ->  Index Scan using ind_fscsubfile_filerespons on
> fscsubfile t4  (cost=0.00..19016.05 rows=5486 width=8)
>                     Index Cond: ((fileresporgid =
> 573936067464397682::bigint) AND (objid > 573936097512390656::bigint)
> AND (objid < 573936101807357952::bigint))
>               ->  Materialize  (cost=912418.42..956599.36 rows=22689
> width=8)
>                     ->  Merge Join  (cost=912418.42..956372.47
> rows=22689 width=8)
>                           Merge Cond: ((t5.objid = q1_1.objid) AND
> (t5.aggrid = q1_1.aggval))
>                           ->  Sort  (cost=402024.80..406674.63
> rows=1859934 width=12)
>                                 Sort Key: t5.objid, t5.aggrid
>                                 ->  Bitmap Heap Scan on atdateval t5
> (cost=43749.07..176555.24 rows=1859934 width=12)
>                                       Recheck Cond: (attrid =
> 281479288456447::bigint)
>                                       ->  Bitmap Index Scan on
> ind_atdateval  (cost=0.00..43284.08 rows=1859934 width=0)
>                                             Index Cond: (attrid =
> 281479288456447::bigint)
>                           ->  Materialize  (cost=510392.25..531663.97
> rows=1701738 width=12)
>                                 ->  Sort  (cost=510392.25..514646.59
> rows=1701738 width=12)
>                                       Sort Key: q1_1.objid,
> q1_1.aggval
>                                       ->  Bitmap Heap Scan on ataggval
> q1_1  (cost=44666.00..305189.47 rows=1701738 width=12)
>                                             Recheck Cond: (attrid =
> 281479288456451::bigint)
>                                             Filter: (aggrid = 0)
>                                             ->  Bitmap Index Scan on
> ind_ataggval  (cost=0.00..44240.56 rows=1860698 width=0)
>                                                   Index Cond: (attrid
> = 281479288456451::bigint)
>         ->  Index Scan using cooobjectix on cooobject t6
> (cost=0.00..8.22 rows=1 width=8)
>               Index Cond: (t6.objid = t4.objid)
>               Filter: (t6.objclassid = ANY
> ('{285774255832590,285774255764301}'::bigint[]))
> (26 rows)
>
>
> explain analyze SELECT DISTINCT t4.objid FROM fscsubfile t4, cooobject
> t6 WHERE t6.objid = t4.objid AND t4.fileresporgid = 573936067464397682
> AND
>  NOT EXISTS (
>  SELECT 1
>  FROM ataggval q1_1,
>  atdateval t5
>  WHERE q1_1.objid = t5.objid AND                 [!!!!!]
>  q1_1.attrid = 281479288456451 AND
>  q1_1.aggrid = 0 AND
>  t5.aggrid = q1_1.aggval AND
>  t5.objid = t4.objid AND
>  t5.attrid = 281479288456447 ) AND
> ((t6.objclassid IN (285774255832590,285774255764301))) AND ((t4.objid
> > 573936097512390656 and t4.objid < 573936101807357952)) ORDER BY
> t4.objid;
>
> QUERY PLAN
> ----------------------------------------------------------------------
> ----------------------------------------------------------------------
> -----------------------------------
>
> Unique  (cost=916978.86..969139.72 rows=1 width=8) (actual
> time=12102.964..12106.409 rows=64 loops=1)
>   ->  Nested Loop  (cost=916978.86..969139.72 rows=1 width=8) (actual
> time=12102.959..12106.375 rows=64 loops=1)
>         ->  Merge Anti Join  (cost=916978.86..969131.49 rows=1
> width=8) (actual time=12060.916..12105.374 rows=108 loops=1)
>               Merge Cond: (t4.objid = t5.objid)
>               ->  Index Scan using ind_fscsubfile_filerespons on
> fscsubfile t4  (cost=0.00..19016.05 rows=5486 width=8) (actual
> time=0.080..81.397 rows=63436 loops=1)
>                     Index Cond: ((fileresporgid =
> 573936067464397682::bigint) AND (objid > 573936097512390656::bigint)
> AND (objid < 573936101807357952::bigint))
>               ->  Materialize  (cost=912418.42..956599.36 rows=22689
> width=8) (actual time=8874.492..11778.254 rows=1299685 loops=1)
>                     ->  Merge Join  (cost=912418.42..956372.47
> rows=22689 width=8) (actual time=8874.484..11437.175 rows=1299685
> loops=1)
>                           Merge Cond: ((t5.objid = q1_1.objid) AND
> (t5.aggrid = q1_1.aggval))
>                           ->  Sort  (cost=402024.80..406674.63
> rows=1859934 width=12) (actual time=3117.555..3756.062 rows=1299685
> loops=1)
>                                 Sort Key: t5.objid, t5.aggrid
>                                 Sort Method:  external merge  Disk:
> 39920kB
>                                 ->  Bitmap Heap Scan on atdateval t5
> (cost=43749.07..176555.24 rows=1859934 width=12) (actual
> time=289.475..1079.624 rows=1857906 loops=1)
>                                       Recheck Cond: (attrid =
> 281479288456447::bigint)
>                                       ->  Bitmap Index Scan on
> ind_atdateval  (cost=0.00..43284.08 rows=1859934 width=0) (actual
> time=265.720...265.720 rows=1857906 loops=1)
>                                             Index Cond: (attrid =
> 281479288456447::bigint)
>                           ->  Materialize  (cost=510392.25..531663.97
> rows=1701738 width=12) (actual time=5756.915..6707.864 rows=1299685
> loops=1)
>                                 ->  Sort  (cost=510392.25..514646.59
> rows=1701738 width=12) (actual time=5756.909..6409.819 rows=1299685
> loops=1)
>                                       Sort Key: q1_1.objid,
> q1_1.aggval
>                                       Sort Method:  external merge
> Disk: 39920kB
>                                       ->  Bitmap Heap Scan on ataggval
> q1_1  (cost=44666.00..305189.47 rows=1701738 width=12) (actual
> time=1646.955..3628.918 rows=1857906 loops=1)
>                                             Recheck Cond: (attrid =
> 281479288456451::bigint)
>                                             Filter: (aggrid = 0)
>                                             ->  Bitmap Index Scan on
> ind_ataggval  (cost=0.00..44240.56 rows=1860698 width=0) (actual
> time=1608.233..1608.233 rows=1877336 loops=1)
>                                                   Index Cond: (attrid
> = 281479288456451::bigint)
>         ->  Index Scan using cooobjectix on cooobject t6
> (cost=0.00..8.22 rows=1 width=8) (actual time=0.008..0.009 rows=1
> loops=108)
>               Index Cond: (t6.objid = t4.objid)
>               Filter: (t6.objclassid = ANY
> ('{285774255832590,285774255764301}'::bigint[]))
> Total runtime: 12129.613 ms
> (29 rows)
>
>
>
> As the query performs in roughly 12 seconds in both (changed) cases
> you might advise to change my queries :-) (In fact we are working on
> this) As the primary performance impact can also be reproduced in a small database (querytime > 1 minute) I checked
thisissue on MS-SQL server and Oracle. On MSSQL server there is no difference in the execution plan if you change the
queryan the performance is well. Oralce shows a slightly difference but the performance is also well. 
> As I mentioned we are looking forward to change our query but in my
> opinion there could be a general performance gain if this issue is
> addressed. (especially if you don't know you run into this issue on
> the query performance is sufficient enough)
>
> greets
> Armin
>
> --
> Sent via pgsql-performance mailing list
> (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

Re: potential performance gain by query planner optimization

От
Pavel Stehule
Дата:
2010/7/21 Kneringer, Armin <Armin.Kneringer@fabasoft.com>:
> Hi Pavel,
>
> Turning hashjoin off also does the trick. (the explain output is below)
> My basic intention was to check if the query planner could be optmized to automatically improve the query processing.
> In this case all users (espacially those which are not be aware of ineffective query processing e.g. due their own
queries)might profit by faster query execution. 
> This is just a thought (or suggestion) for further enhancement. Evt. it will be added to the project backlog.
>

You have a problem with inadequate statistics. Somewhere optimalizer
prefer hash join (available for sets less than work_mem), but try to
store to much data to hash tables and system will to use a swap :(.

Regards
Pavel Stehule

> kind regards
> Armin
>
> For reasons of completeness the eplain output with "hashjoin off":
>
>
> # explain analyze SELECT DISTINCT t4.objid FROM fscsubfile t4, cooobject t6 WHERE t6.objid = t4.objid AND
t4.fileresporgid= 573936067464397682 AND NOT EXISTS (SELECT 1 FROM ataggval q1_1, atdateval t5 WHERE q1_1.objid =
t4.objidAND q1_1.attrid = 281479288456451 AND q1_1.aggrid = 0 AND t5.aggrid = q1_1.aggval AND t5.objid = t4.objid AND
t5.attrid= 281479288456447 ) AND ( (t6.objclassid IN (285774255832590,285774255764301))) AND ((t4.objid >
573936097512390656and t4.objid < 573936101807357952)) ORDER BY t4.objid; 
>                                                                                  QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Unique  (cost=13639921632.59..14512729357.51 rows=1 width=8) (actual time=14172.154..14468.435 rows=64 loops=1)
>   ->  Nested Loop  (cost=13639921632.59..14512729357.51 rows=1 width=8) (actual time=14172.148..14468.364 rows=64
loops=1)
>         ->  Merge Anti Join  (cost=13639921632.59..14512729349.28 rows=1 width=8) (actual time=14092.764..14108.850
rows=108loops=1) 
>               Merge Cond: ((t4.objid = q1_1.objid) AND (t4.objid = t5.objid))
>               ->  Sort  (cost=14477.12..14490.83 rows=5486 width=8) (actual time=100.070..109.200 rows=63436 loops=1)
>                     Sort Key: t4.objid
>                     Sort Method:  quicksort  Memory: 4510kB
>                     ->  Bitmap Heap Scan on fscsubfile t4  (cost=154.42...14136.40 rows=5486 width=8) (actual
time=14.645..54.176rows=63436 loops=1) 
>                           Recheck Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid >
573936097512390656::bigint)AND (objid < 573936101807357952::bigint)) 
>                           ->  Bitmap Index Scan on ind_fscsubfile_filerespons  (cost=0.00..153.05 rows=5486 width=0)
(actualtime=11.438..11.438 rows=63436 loops=1) 
>                                 Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid >
573936097512390656::bigint)AND (objid < 573936101807357952::bigint)) 
>               ->  Materialize  (cost=13584864369.09..14272439933.58 rows=55006045159 width=16) (actual
time=12914.166..13699.719rows=1299867 loops=1) 
>                     ->  Sort  (cost=13584864369.09..13722379481.99 rows=55006045159 width=16) (actual
time=12914.153..13411.554rows=1299867 loops=1) 
>                           Sort Key: q1_1.objid, t5.objid
>                           Sort Method:  external merge  Disk: 47192kB
>                           ->  Nested Loop  (cost=0.00..11917516.57 rows=55006045159 width=16) (actual
time=0.621..10505.130rows=1858326 loops=1) 
>                                 ->  Seq Scan on atdateval t5  (cost=0.00...294152.40 rows=1859934 width=12) (actual
time=0.593..1870.182rows=1857906 loops=1) 
>                                       Filter: (attrid = 281479288456447::bigint)
>                                 ->  Index Scan using ind_ataggval on ataggval q1_1  (cost=0.00..6.20 rows=4 width=12)
(actualtime=0.004..0.004 rows=1 loops=1857906) 
>                                       Index Cond: ((q1_1.attrid = 281479288456451::bigint) AND (q1_1.aggval =
t5.aggrid))
>                                       Filter: (q1_1.aggrid = 0)
>         ->  Index Scan using cooobjectix on cooobject t6  (cost=0.00..8.22 rows=1 width=8) (actual time=3.327..3.328
rows=1loops=108) 
>               Index Cond: (t6.objid = t4.objid)
>               Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
>  Total runtime: 14487.434 ms
>
>
>
>
>
>
> -----Original Message-----
> From: Pavel Stehule [mailto:pavel.stehule@gmail.com]
> Sent: Dienstag, 20. Juli 2010 21:39
> To: Kneringer, Armin
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] potential performance gain by query planner optimization
>
> Hello
>
> 2010/7/20 Kneringer, Armin <Armin.Kneringer@fabasoft.com>:
>> Hi there.
>>
>> I think I found a potential performance gain if the query planner
>> would be optimized. All Tests has been performed with 8.4.1 (and
>> earlier versions) on CentOS 5.3 (x64)
>>
>> The following query will run on my database (~250 GB) for ca. 1600
>> seconds and the sort will result in a disk merge deploying ca. 200 GB
>> of data to the local disk (ca. 180.000 tmp-files)
>
> can you try show check explain with set enable_hashjoin to off;   ?
>
> Regards
>
> Pavel Stehule
>
>>
>> explain SELECT DISTINCT t4.objid
>> FROM fscsubfile t4, cooobject t6
>>  NOT EXISTS (
>>  WHERE t6.objid = t4.objid AND
>>  t4.fileresporgid = 573936067464397682 AND
>>   NOT EXISTS (
>>   SELECT 1
>>   FROM ataggval q1_1,
>>   atdateval t5
>>   WHERE q1_1.objid = t4.objid AND
>>   q1_1.attrid = 281479288456451 AND
>>   q1_1.aggrid = 0 AND
>>   t5.aggrid = q1_1.aggval AND
>>   t5.objid = t4.objid AND
>>   t5.attrid = 281479288456447 ) AND
>>  ((t6.objclassid IN (285774255832590,285774255764301))) AND
>>  ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
>>  ORDER BY t4.objid;
>>
>>
>> QUERY PLAN
>> ----------------------------------------------------------------------
>> ----------------------------------------------------------------------
>> ----------------------------------
>> Unique  (cost=2592137103.99..2592137104.00 rows=1 width=8)
>>   ->  Sort  (cost=2592137103.99..2592137104.00 rows=1 width=8)
>>         Sort Key: t4.objid
>>         ->  Nested Loop  (cost=1105592553.38..2592137103.98 rows=1
>> width=8)
>>               ->  Hash Anti Join  (cost=1105592553.38..2592137095.75
>> rows=1 width=8)
>>                     Hash Cond: ((t4.objid = q1_1.objid) AND (t4.objid
>> = t5.objid))
>>                     ->  Bitmap Heap Scan on fscsubfile t4
>> (cost=154.42...14136.40 rows=5486 width=8)
>>                           Recheck Cond: ((fileresporgid =
>> 573936067464397682::bigint) AND (objid > 573936097512390656::bigint)
>> AND (objid < 573936101807357952::bigint))
>>                           ->  Bitmap Index Scan on
>> ind_fscsubfile_filerespons  (cost=0.00..153.05 rows=5486 width=0)
>>                                 Index Cond: ((fileresporgid =
>> 573936067464397682::bigint) AND (objid > 573936097512390656::bigint)
>> AND (objid < 573936101807357952::bigint))
>>                     ->  Hash  (cost=11917516.57..11917516.57
>> rows=55006045159 width=16)
>>                           ->  Nested Loop  (cost=0.00..11917516.57
>> rows=55006045159 width=16)
>>                                 ->  Seq Scan on atdateval t5
>> (cost=0.00...294152.40 rows=1859934 width=12)
>>                                       Filter: (attrid =
>> 281479288456447::bigint)
>>                                 ->  Index Scan using ind_ataggval on
>> ataggval q1_1  (cost=0.00..6.20 rows=4 width=12)
>>                                       Index Cond: ((q1_1.attrid =
>> 281479288456451::bigint) AND (q1_1.aggval = t5.aggrid))
>>                                       Filter: (q1_1.aggrid = 0)
>>               ->  Index Scan using cooobjectix on cooobject t6
>> (cost=0.00..8.22 rows=1 width=8)
>>                     Index Cond: (t6.objid = t4.objid)
>>                     Filter: (t6.objclassid = ANY
>> ('{285774255832590,285774255764301}'::bigint[]))
>> (20 rows)
>>
>>
>> As the disks pace is limited on my test system I can't provide the
>> "explain analyze" output If I change the query as follows the query takes only 12 seconds and only needs 2 tmp files
forsorting. 
>> (Changed lines are marked with [!!!!!] as I don't know HTML-Mails will
>> be delivered without conversion
>>
>> explain SELECT DISTINCT t4.objid
>> FROM fscsubfile t4, cooobject t6
>> WHERE t6.objid = t4.objid AND
>> t4.fileresporgid = 573936067464397682 AND
>>   NOT EXISTS (
>>   SELECT 1
>>   FROM ataggval q1_1,
>>   atdateval t5
>>   WHERE q1_1.objid = t4.objid AND
>>   q1_1.attrid = 281479288456451 AND
>>   q1_1.aggrid = 0 AND
>>   t5.aggrid = q1_1.aggval AND
>>   t5.objid = q1_1.objid AND                 [!!!!!]
>>   t5.attrid = 281479288456447 ) AND
>>   ((t6.objclassid IN (285774255832590,285774255764301))) AND
>>   ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
>>  ORDER BY t4.objid;
>>
>> QUERY PLAN
>> ----------------------------------------------------------------------
>> ----------------------------------------------------------------------
>> ---------------------- Unique  (cost=918320.29..971968.88 rows=1
>> width=8)
>>   ->  Nested Loop  (cost=918320.29..971968.88 rows=1 width=8)
>>         ->  Merge Anti Join  (cost=918320.29..971960.65 rows=1
>> width=8)
>>               Merge Cond: (t4.objid = q1_1.objid)
>>               ->  Index Scan using ind_fscsubfile_filerespons on
>> fscsubfile t4  (cost=0.00..19016.05 rows=5486 width=8)
>>                     Index Cond: ((fileresporgid =
>> 573936067464397682::bigint) AND (objid > 573936097512390656::bigint)
>> AND (objid < 573936101807357952::bigint))
>>               ->  Materialize  (cost=912418.42..956599.36 rows=22689
>> width=8)
>>                     ->  Merge Join  (cost=912418.42..956372.47
>> rows=22689 width=8)
>>                           Merge Cond: ((t5.objid = q1_1.objid) AND
>> (t5.aggrid = q1_1.aggval))
>>                           ->  Sort  (cost=402024.80..406674.63
>> rows=1859934 width=12)
>>                                 Sort Key: t5.objid, t5.aggrid
>>                                 ->  Bitmap Heap Scan on atdateval t5
>> (cost=43749.07..176555.24 rows=1859934 width=12)
>>                                       Recheck Cond: (attrid =
>> 281479288456447::bigint)
>>                                       ->  Bitmap Index Scan on
>> ind_atdateval  (cost=0.00..43284.08 rows=1859934 width=0)
>>                                             Index Cond: (attrid =
>> 281479288456447::bigint)
>>                           ->  Materialize  (cost=510392.25..531663.97
>> rows=1701738 width=12)
>>                                 ->  Sort  (cost=510392.25..514646.59
>> rows=1701738 width=12)
>>                                       Sort Key: q1_1.objid,
>> q1_1.aggval
>>                                       ->  Bitmap Heap Scan on ataggval
>> q1_1  (cost=44666.00..305189.47 rows=1701738 width=12)
>>                                             Recheck Cond: (attrid =
>> 281479288456451::bigint)
>>                                             Filter: (aggrid = 0)
>>                                             ->  Bitmap Index Scan on
>> ind_ataggval  (cost=0.00..44240.56 rows=1860698 width=0)
>>                                                   Index Cond: (attrid
>> = 281479288456451::bigint)
>>         ->  Index Scan using cooobjectix on cooobject t6
>> (cost=0.00..8.22 rows=1 width=8)
>>               Index Cond: (t6.objid = t4.objid)
>>               Filter: (t6.objclassid = ANY
>> ('{285774255832590,285774255764301}'::bigint[]))
>> (26 rows)
>>
>> explain analyze SELECT DISTINCT t4.objid FROM fscsubfile t4, cooobject
>> t6 WHERE t6.objid = t4.objid AND t4.fileresporgid = 573936067464397682
>> AND
>>  NOT EXISTS (
>>  SELECT 1
>>  FROM ataggval q1_1,
>>  atdateval t5
>>  WHERE q1_1.objid = t4.objid AND
>>  q1_1.attrid = 281479288456451 AND
>>  q1_1.aggrid = 0 AND
>>  t5.aggrid = q1_1.aggval AND
>>  t5.objid = q1_1.objid AND                 [!!!!!]
>>  t5.attrid = 281479288456447 ) AND
>> ((t6.objclassid IN (285774255832590,285774255764301))) AND ((t4.objid
>> > 573936097512390656 and t4.objid < 573936101807357952)) ORDER BY
>> t4.objid;
>>
>> QUERY PLAN
>> ----------------------------------------------------------------------
>> ----------------------------------------------------------------------
>> -----------------------------------
>> Unique  (cost=918320.29..971968.88 rows=1 width=8) (actual
>> time=12079.598..12083.048 rows=64 loops=1)
>>   ->  Nested Loop  (cost=918320.29..971968.88 rows=1 width=8) (actual
>> time=12079.594..12083.010 rows=64 loops=1)
>>         ->  Merge Anti Join  (cost=918320.29..971960.65 rows=1
>> width=8) (actual time=12037.524..12081.989 rows=108 loops=1)
>>               Merge Cond: (t4.objid = q1_1.objid)
>>               ->  Index Scan using ind_fscsubfile_filerespons on
>> fscsubfile t4  (cost=0.00..19016.05 rows=5486 width=8) (actual
>> time=0.073..83.498 rows=63436 loops=1)
>>                     Index Cond: ((fileresporgid =
>> 573936067464397682::bigint) AND (objid > 573936097512390656::bigint)
>> AND (objid < 573936101807357952::bigint))
>>               ->  Materialize  (cost=912418.42..956599.36 rows=22689
>> width=8) (actual time=8866.253..11753.055 rows=1299685 loops=1)
>>                     ->  Merge Join  (cost=912418.42..956372.47
>> rows=22689 width=8) (actual time=8866.246..11413.397 rows=1299685
>> loops=1)
>>                           Merge Cond: ((t5.objid = q1_1.objid) AND
>> (t5.aggrid = q1_1.aggval))
>>                           ->  Sort  (cost=402024.80..406674.63
>> rows=1859934 width=12) (actual time=3133.362..3774.076 rows=1299685
>> loops=1)
>>                                 Sort Key: t5.objid, t5.aggrid
>>                                 Sort Method:  external merge  Disk:
>> 47192kB
>>                                 ->  Bitmap Heap Scan on atdateval t5
>> (cost=43749.07..176555.24 rows=1859934 width=12) (actual
>> time=282.454..1079.038 rows=1857906 loops=1)
>>                                       Recheck Cond: (attrid =
>> 281479288456447::bigint)
>>                                       ->  Bitmap Index Scan on
>> ind_atdateval  (cost=0.00..43284.08 rows=1859934 width=0) (actual
>> time=258.749...258.749 rows=1857906 loops=1)
>>                                             Index Cond: (attrid =
>> 281479288456447::bigint)
>>                           ->  Materialize  (cost=510392.25..531663.97
>> rows=1701738 width=12) (actual time=5732.872..6683.784 rows=1299685
>> loops=1)
>>                                 ->  Sort  (cost=510392.25..514646.59
>> rows=1701738 width=12) (actual time=5732.866..6387.188 rows=1299685
>> loops=1)
>>                                      Sort Key: q1_1.objid, q1_1.aggval
>>                                       Sort Method:  external merge
>> Disk: 39920kB
>>                                       ->  Bitmap Heap Scan on ataggval
>> q1_1  (cost=44666.00..305189.47 rows=1701738 width=12) (actual
>> time=1644.983..3634.044 rows=1857906 loops=1)
>>                                             Recheck Cond: (attrid =
>> 281479288456451::bigint)
>>                                             Filter: (aggrid = 0)
>>                                             ->  Bitmap Index Scan on
>> ind_ataggval  (cost=0.00..44240.56 rows=1860698 width=0) (actual
>> time=1606.325..1606.325 rows=1877336 loops=1)
>>                                                   Index Cond: (attrid
>> = 281479288456451::bigint)
>>         ->  Index Scan using cooobjectix on cooobject t6
>> (cost=0.00..8.22 rows=1 width=8) (actual time=0.009..0.009 rows=1
>> loops=108)
>>               Index Cond: (t6.objid = t4.objid)
>>               Filter: (t6.objclassid = ANY
>> ('{285774255832590,285774255764301}'::bigint[]))
>> Total runtime: 12108.663 ms
>> (29 rows)
>>
>>
>> Another way to optimize my query is to change it as follows:
>> (Once again changes are marked with [!!!!!]
>>
>> explain SELECT DISTINCT t4.objid
>> FROM fscsubfile t4, cooobject t6
>> WHERE t6.objid = t4.objid AND
>> t4.fileresporgid = 573936067464397682 AND
>>   NOT EXISTS (
>>   SELECT 1
>>   FROM ataggval q1_1,
>>   atdateval t5
>>   WHERE q1_1.objid = t5.objid AND                 [!!!!!]
>>   q1_1.attrid = 281479288456451 AND
>>   q1_1.aggrid = 0 AND
>>   t5.aggrid = q1_1.aggval AND
>>   t5.objid = t4.objid AND
>>   t5.attrid = 281479288456447 ) AND
>>  ((t6.objclassid IN (285774255832590,285774255764301))) AND
>>  ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
>>  ORDER BY t4.objid;
>>
>> QUERY PLAN
>> ----------------------------------------------------------------------
>> ----------------------------------------------------------------------
>> ---------------------- Unique  (cost=916978.86..969139.72 rows=1
>> width=8)
>>   ->  Nested Loop  (cost=916978.86..969139.72 rows=1 width=8)
>>         ->  Merge Anti Join  (cost=916978.86..969131.49 rows=1
>> width=8)
>>               Merge Cond: (t4.objid = t5.objid)
>>               ->  Index Scan using ind_fscsubfile_filerespons on
>> fscsubfile t4  (cost=0.00..19016.05 rows=5486 width=8)
>>                     Index Cond: ((fileresporgid =
>> 573936067464397682::bigint) AND (objid > 573936097512390656::bigint)
>> AND (objid < 573936101807357952::bigint))
>>               ->  Materialize  (cost=912418.42..956599.36 rows=22689
>> width=8)
>>                     ->  Merge Join  (cost=912418.42..956372.47
>> rows=22689 width=8)
>>                           Merge Cond: ((t5.objid = q1_1.objid) AND
>> (t5.aggrid = q1_1.aggval))
>>                           ->  Sort  (cost=402024.80..406674.63
>> rows=1859934 width=12)
>>                                 Sort Key: t5.objid, t5.aggrid
>>                                 ->  Bitmap Heap Scan on atdateval t5
>> (cost=43749.07..176555.24 rows=1859934 width=12)
>>                                       Recheck Cond: (attrid =
>> 281479288456447::bigint)
>>                                       ->  Bitmap Index Scan on
>> ind_atdateval  (cost=0.00..43284.08 rows=1859934 width=0)
>>                                             Index Cond: (attrid =
>> 281479288456447::bigint)
>>                           ->  Materialize  (cost=510392.25..531663.97
>> rows=1701738 width=12)
>>                                 ->  Sort  (cost=510392.25..514646.59
>> rows=1701738 width=12)
>>                                       Sort Key: q1_1.objid,
>> q1_1.aggval
>>                                       ->  Bitmap Heap Scan on ataggval
>> q1_1  (cost=44666.00..305189.47 rows=1701738 width=12)
>>                                             Recheck Cond: (attrid =
>> 281479288456451::bigint)
>>                                             Filter: (aggrid = 0)
>>                                             ->  Bitmap Index Scan on
>> ind_ataggval  (cost=0.00..44240.56 rows=1860698 width=0)
>>                                                   Index Cond: (attrid
>> = 281479288456451::bigint)
>>         ->  Index Scan using cooobjectix on cooobject t6
>> (cost=0.00..8.22 rows=1 width=8)
>>               Index Cond: (t6.objid = t4.objid)
>>               Filter: (t6.objclassid = ANY
>> ('{285774255832590,285774255764301}'::bigint[]))
>> (26 rows)
>>
>>
>> explain analyze SELECT DISTINCT t4.objid FROM fscsubfile t4, cooobject
>> t6 WHERE t6.objid = t4.objid AND t4.fileresporgid = 573936067464397682
>> AND
>>  NOT EXISTS (
>>  SELECT 1
>>  FROM ataggval q1_1,
>>  atdateval t5
>>  WHERE q1_1.objid = t5.objid AND                 [!!!!!]
>>  q1_1.attrid = 281479288456451 AND
>>  q1_1.aggrid = 0 AND
>>  t5.aggrid = q1_1.aggval AND
>>  t5.objid = t4.objid AND
>>  t5.attrid = 281479288456447 ) AND
>> ((t6.objclassid IN (285774255832590,285774255764301))) AND ((t4.objid
>> > 573936097512390656 and t4.objid < 573936101807357952)) ORDER BY
>> t4.objid;
>>
>> QUERY PLAN
>> ----------------------------------------------------------------------
>> ----------------------------------------------------------------------
>> -----------------------------------
>>
>> Unique  (cost=916978.86..969139.72 rows=1 width=8) (actual
>> time=12102.964..12106.409 rows=64 loops=1)
>>   ->  Nested Loop  (cost=916978.86..969139.72 rows=1 width=8) (actual
>> time=12102.959..12106.375 rows=64 loops=1)
>>         ->  Merge Anti Join  (cost=916978.86..969131.49 rows=1
>> width=8) (actual time=12060.916..12105.374 rows=108 loops=1)
>>               Merge Cond: (t4.objid = t5.objid)
>>               ->  Index Scan using ind_fscsubfile_filerespons on
>> fscsubfile t4  (cost=0.00..19016.05 rows=5486 width=8) (actual
>> time=0.080..81.397 rows=63436 loops=1)
>>                     Index Cond: ((fileresporgid =
>> 573936067464397682::bigint) AND (objid > 573936097512390656::bigint)
>> AND (objid < 573936101807357952::bigint))
>>               ->  Materialize  (cost=912418.42..956599.36 rows=22689
>> width=8) (actual time=8874.492..11778.254 rows=1299685 loops=1)
>>                     ->  Merge Join  (cost=912418.42..956372.47
>> rows=22689 width=8) (actual time=8874.484..11437.175 rows=1299685
>> loops=1)
>>                           Merge Cond: ((t5.objid = q1_1.objid) AND
>> (t5.aggrid = q1_1.aggval))
>>                           ->  Sort  (cost=402024.80..406674.63
>> rows=1859934 width=12) (actual time=3117.555..3756.062 rows=1299685
>> loops=1)
>>                                 Sort Key: t5.objid, t5.aggrid
>>                                 Sort Method:  external merge  Disk:
>> 39920kB
>>                                 ->  Bitmap Heap Scan on atdateval t5
>> (cost=43749.07..176555.24 rows=1859934 width=12) (actual
>> time=289.475..1079.624 rows=1857906 loops=1)
>>                                       Recheck Cond: (attrid =
>> 281479288456447::bigint)
>>                                       ->  Bitmap Index Scan on
>> ind_atdateval  (cost=0.00..43284.08 rows=1859934 width=0) (actual
>> time=265.720...265.720 rows=1857906 loops=1)
>>                                             Index Cond: (attrid =
>> 281479288456447::bigint)
>>                           ->  Materialize  (cost=510392.25..531663.97
>> rows=1701738 width=12) (actual time=5756.915..6707.864 rows=1299685
>> loops=1)
>>                                 ->  Sort  (cost=510392.25..514646.59
>> rows=1701738 width=12) (actual time=5756.909..6409.819 rows=1299685
>> loops=1)
>>                                       Sort Key: q1_1.objid,
>> q1_1.aggval
>>                                       Sort Method:  external merge
>> Disk: 39920kB
>>                                       ->  Bitmap Heap Scan on ataggval
>> q1_1  (cost=44666.00..305189.47 rows=1701738 width=12) (actual
>> time=1646.955..3628.918 rows=1857906 loops=1)
>>                                             Recheck Cond: (attrid =
>> 281479288456451::bigint)
>>                                             Filter: (aggrid = 0)
>>                                             ->  Bitmap Index Scan on
>> ind_ataggval  (cost=0.00..44240.56 rows=1860698 width=0) (actual
>> time=1608.233..1608.233 rows=1877336 loops=1)
>>                                                   Index Cond: (attrid
>> = 281479288456451::bigint)
>>         ->  Index Scan using cooobjectix on cooobject t6
>> (cost=0.00..8.22 rows=1 width=8) (actual time=0.008..0.009 rows=1
>> loops=108)
>>               Index Cond: (t6.objid = t4.objid)
>>               Filter: (t6.objclassid = ANY
>> ('{285774255832590,285774255764301}'::bigint[]))
>> Total runtime: 12129.613 ms
>> (29 rows)
>>
>>
>>
>> As the query performs in roughly 12 seconds in both (changed) cases
>> you might advise to change my queries :-) (In fact we are working on
>> this) As the primary performance impact can also be reproduced in a small database (querytime > 1 minute) I checked
thisissue on MS-SQL server and Oracle. On MSSQL server there is no difference in the execution plan if you change the
queryan the performance is well. Oralce shows a slightly difference but the performance is also well. 
>> As I mentioned we are looking forward to change our query but in my
>> opinion there could be a general performance gain if this issue is
>> addressed. (especially if you don't know you run into this issue on
>> the query performance is sufficient enough)
>>
>> greets
>> Armin
>>
>> --
>> Sent via pgsql-performance mailing list
>> (pgsql-performance@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance
>>
>

Re: potential performance gain by query planner optimization

От
Tom Lane
Дата:
"Kneringer, Armin" <Armin.Kneringer@fabasoft.com> writes:
> I think I found a potential performance gain if the query planner would be optimized. All Tests has been performed
with8.4.1 (and earlier versions) on CentOS 5.3 (x64) 

> The following query will run on my database (~250 GB) for ca. 1600 seconds and the sort will result in a disk merge
deployingca. 200 GB of data to the local disk (ca. 180.000 tmp-files) 

What have you got work_mem set to?  It looks like you must be using an
unreasonably large value, else the planner wouldn't have tried to use a
hash join here:

>                      ->  Hash  (cost=11917516.57..11917516.57 rows=55006045159 width=16)
>                            ->  Nested Loop  (cost=0.00..11917516.57 rows=55006045159 width=16)
>                                  ->  Seq Scan on atdateval t5  (cost=0.00...294152.40 rows=1859934 width=12)
>                                        Filter: (attrid = 281479288456447::bigint)
>                                  ->  Index Scan using ind_ataggval on ataggval q1_1  (cost=0.00..6.20 rows=4
width=12)
>                                        Index Cond: ((q1_1.attrid = 281479288456451::bigint) AND (q1_1.aggval =
t5.aggrid))
>                                        Filter: (q1_1.aggrid = 0)

Also, please try something newer than 8.4.1 --- this might be some
already-fixed bug.

            regards, tom lane