Re: Erratically behaving query needs optimization

Поиск
Список
Период
Сортировка
От Barbu Paul - Gheorghe
Тема Re: Erratically behaving query needs optimization
Дата
Msg-id CAB14va8TnWBkLYUAt6YhfwnOU+RMOvQmJzeJMsnadt2G58aitg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Erratically behaving query needs optimization  (Luís Roberto Weck <luisroberto@siscobra.com.br>)
Ответы Re: Erratically behaving query needs optimization  (Barbu Paul - Gheorghe <barbu.paul.gheorghe@gmail.com>)
Список pgsql-performance
with ORDER BY so I get the correct results (163 seconds):
https://explain.depesz.com/s/j3o1

Unique  (cost=164620.19..164650.19 rows=4 width=54) (actual
time=163953.091..163954.621 rows=2 loops=1)
  Buffers: shared hit=183080 read=103411
  ->  Sort  (cost=164620.19..164635.19 rows=5999 width=54) (actual
time=163953.081..163953.570 rows=4102 loops=1)
        Sort Key: results.attribute_id, results."timestamp" DESC
        Sort Method: quicksort  Memory: 641kB
        Buffers: shared hit=183080 read=103411
        ->  Nested Loop  (cost=132172.41..164243.74 rows=5999
width=54) (actual time=3054.965..163928.686 rows=4102 loops=1)
              Buffers: shared hit=183074 read=103411
              ->  HashAggregate  (cost=132171.98..132779.88 rows=60790
width=4) (actual time=2484.449..2581.582 rows=59269 loops=1)
                    Group Key: scheduler_operation_executions.id
                    Buffers: shared hit=87 read=76529
                    ->  Gather  (cost=44474.33..132020.01 rows=60790
width=4) (actual time=312.503..2463.254 rows=59269 loops=1)
                          Workers Planned: 2
                          Workers Launched: 2
                          Buffers: shared hit=87 read=76529
                          ->  Parallel Hash Join
(cost=43474.33..124941.01 rows=25329 width=4) (actual
time=124.733..2279.986 rows=19756 loops=3)
                                Hash Cond:
(scheduler_operation_executions.task_execution_id =
scheduler_task_executions.id)
                                Buffers: shared hit=87 read=76529
                                ->  Parallel Seq Scan on
scheduler_operation_executions  (cost=0.00..74948.21 rows=2483221
width=8) (actual time=0.126..1828.461 rows=1986887 loops=3)
                                      Buffers: shared hit=2 read=50114
                                ->  Parallel Hash
(cost=43263.67..43263.67 rows=16853 width=4) (actual
time=123.631..123.631 rows=13558 loops=3)
                                      Buckets: 65536  Batches: 1
Memory Usage: 2144kB
                                      Buffers: shared hit=1 read=26415
                                      ->  Parallel Bitmap Heap Scan on
scheduler_task_executions  (cost=757.90..43263.67 rows=16853 width=4)
(actual time=6.944..120.405 rows=13558 loops=3)
                                            Recheck Cond: (device_id = 97)
                                            Heap Blocks: exact=24124
                                            Buffers: shared hit=1 read=26415
                                            ->  Bitmap Index Scan on
scheduler_task_executions_device_id_idx  (cost=0.00..747.79 rows=40448
width=0) (actual time=13.378..13.378 rows=40675 loops=1)
                                                  Index Cond: (device_id = 97)
                                                  Buffers: shared read=114
              ->  Index Scan using index_operation_execution_id_asc on
results  (cost=0.43..0.51 rows=1 width=58) (actual time=2.720..2.720
rows=0 loops=59269)
                    Index Cond: (operation_execution_id =
scheduler_operation_executions.id)
                    Filter: ((data IS NOT NULL) AND (data <>
'<NullData/>'::text) AND (attribute_id = ANY ('{4,5}'::integer[])) AND
(object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
                    Rows Removed by Filter: 0
                    Buffers: shared hit=182987 read=26882
Planning Time: 349.908 ms
Execution Time: 163962.314 ms


With ORDER BY (on the second run, 0.6 seconds):
https://explain.depesz.com/s/QZ1Z
Unique  (cost=164620.19..164650.19 rows=4 width=54) (actual
time=621.057..621.527 rows=2 loops=1)
  Buffers: shared hit=236659 read=49826
  ->  Sort  (cost=164620.19..164635.19 rows=5999 width=54) (actual
time=621.056..621.188 rows=4102 loops=1)
        Sort Key: results.attribute_id, results."timestamp" DESC
        Sort Method: quicksort  Memory: 641kB
        Buffers: shared hit=236659 read=49826
        ->  Nested Loop  (cost=132172.41..164243.74 rows=5999
width=54) (actual time=503.577..619.250 rows=4102 loops=1)
              Buffers: shared hit=236659 read=49826
              ->  HashAggregate  (cost=132171.98..132779.88 rows=60790
width=4) (actual time=503.498..513.551 rows=59269 loops=1)
                    Group Key: scheduler_operation_executions.id
                    Buffers: shared hit=26790 read=49826
                    ->  Gather  (cost=44474.33..132020.01 rows=60790
width=4) (actual time=65.499..489.396 rows=59269 loops=1)
                          Workers Planned: 2
                          Workers Launched: 2
                          Buffers: shared hit=26790 read=49826
                          ->  Parallel Hash Join
(cost=43474.33..124941.01 rows=25329 width=4) (actual
time=22.059..441.847 rows=19756 loops=3)
                                Hash Cond:
(scheduler_operation_executions.task_execution_id =
scheduler_task_executions.id)
                                Buffers: shared hit=26790 read=49826
                                ->  Parallel Seq Scan on
scheduler_operation_executions  (cost=0.00..74948.21 rows=2483221
width=8) (actual time=0.083..229.120 rows=1986887 loops=3)
                                      Buffers: shared hit=290 read=49826
                                ->  Parallel Hash
(cost=43263.67..43263.67 rows=16853 width=4) (actual
time=20.648..20.648 rows=13558 loops=3)
                                      Buckets: 65536  Batches: 1
Memory Usage: 2144kB
                                      Buffers: shared hit=26416
                                      ->  Parallel Bitmap Heap Scan on
scheduler_task_executions  (cost=757.90..43263.67 rows=16853 width=4)
(actual time=12.833..26.689 rows=20338 loops=2)
                                            Recheck Cond: (device_id = 97)
                                            Heap Blocks: exact=26052
                                            Buffers: shared hit=26416
                                            ->  Bitmap Index Scan on
scheduler_task_executions_device_id_idx  (cost=0.00..747.79 rows=40448
width=0) (actual time=19.424..19.424 rows=40675 loops=1)
                                                  Index Cond: (device_id = 97)
                                                  Buffers: shared hit=114
              ->  Index Scan using index_operation_execution_id_asc on
results  (cost=0.43..0.51 rows=1 width=58) (actual time=0.002..0.002
rows=0 loops=59269)
                    Index Cond: (operation_execution_id =
scheduler_operation_executions.id)
                    Filter: ((data IS NOT NULL) AND (data <>
'<NullData/>'::text) AND (attribute_id = ANY ('{4,5}'::integer[])) AND
(object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
                    Rows Removed by Filter: 0
                    Buffers: shared hit=209869
Planning Time: 1.893 ms
Execution Time: 627.590 ms



Without (exactly as you wrote it, 1.1s): https://explain.depesz.com/s/qKmj
Unique  (cost=164620.19..164650.19 rows=4 width=54) (actual
time=1103.230..1103.587 rows=2 loops=1)
  Buffers: shared hit=183077 read=103411
  ->  Sort  (cost=164620.19..164635.19 rows=5999 width=54) (actual
time=1103.230..1103.359 rows=4102 loops=1)
        Sort Key: results.attribute_id
        Sort Method: quicksort  Memory: 641kB
        Buffers: shared hit=183077 read=103411
        ->  Nested Loop  (cost=132172.41..164243.74 rows=5999
width=54) (actual time=605.314..1101.687 rows=4102 loops=1)
              Buffers: shared hit=183074 read=103411
              ->  HashAggregate  (cost=132171.98..132779.88 rows=60790
width=4) (actual time=604.710..615.933 rows=59269 loops=1)
                    Group Key: scheduler_operation_executions.id
                    Buffers: shared hit=87 read=76529
                    ->  Gather  (cost=44474.33..132020.01 rows=60790
width=4) (actual time=173.528..590.757 rows=59269 loops=1)
                          Workers Planned: 2
                          Workers Launched: 2
                          Buffers: shared hit=87 read=76529
                          ->  Parallel Hash Join
(cost=43474.33..124941.01 rows=25329 width=4) (actual
time=143.420..563.646 rows=19756 loops=3)
                                Hash Cond:
(scheduler_operation_executions.task_execution_id =
scheduler_task_executions.id)
                                Buffers: shared hit=87 read=76529
                                ->  Parallel Seq Scan on
scheduler_operation_executions  (cost=0.00..74948.21 rows=2483221
width=8) (actual time=0.121..228.542 rows=1986887 loops=3)
                                      Buffers: shared hit=2 read=50114
                                ->  Parallel Hash
(cost=43263.67..43263.67 rows=16853 width=4) (actual
time=142.853..142.853 rows=13558 loops=3)
                                      Buckets: 65536  Batches: 1
Memory Usage: 2112kB
                                      Buffers: shared hit=1 read=26415
                                      ->  Parallel Bitmap Heap Scan on
scheduler_task_executions  (cost=757.90..43263.67 rows=16853 width=4)
(actual time=2.869..139.083 rows=13558 loops=3)
                                            Recheck Cond: (device_id = 97)
                                            Heap Blocks: exact=10677
                                            Buffers: shared hit=1 read=26415
                                            ->  Bitmap Index Scan on
scheduler_task_executions_device_id_idx  (cost=0.00..747.79 rows=40448
width=0) (actual time=5.347..5.347 rows=40675 loops=1)
                                                  Index Cond: (device_id = 97)
                                                  Buffers: shared read=114
              ->  Index Scan using index_operation_execution_id_asc on
results  (cost=0.43..0.51 rows=1 width=58) (actual time=0.008..0.008
rows=0 loops=59269)
                    Index Cond: (operation_execution_id =
scheduler_operation_executions.id)
                    Filter: ((data IS NOT NULL) AND (data <>
'<NullData/>'::text) AND (attribute_id = ANY ('{4,5}'::integer[])) AND
(object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
                    Rows Removed by Filter: 0
                    Buffers: shared hit=182987 read=26882
Planning Time: 23.634 ms
Execution Time: 1106.375 ms

On Thu, Aug 22, 2019 at 3:05 PM Luís Roberto Weck
<luisroberto@siscobra.com.br> wrote:
>
> Em 22/08/2019 08:51, Barbu Paul - Gheorghe escreveu:
> > That query, if I add the ORDER BY and LIMIT, returns the same results.
> >
> > The problem is the fact that it behaves the same way regarding its
> > speed as the original query with the index you suggested.
> > Sometimes it takes 800ms, sometimes it takes 6s to run, how the hell
> > can I get it to behave the same every time?
> > After I added the index you suggested, it was fine for a while, next
> > morning the run time exploded back to several seconds per query... and
> > it oscillates.
> >
> > On Wed, Aug 21, 2019 at 2:25 PM Luís Roberto Weck
> > <luisroberto@siscobra.com.br> wrote:
> >> Em 21/08/2019 04:30, Barbu Paul - Gheorghe escreveu:
> >>> I wonder how I missed that... probabily because of the "WHERE" clause
> >>> in what I already had.
> >>>
> >>> I indexed by scheduler_task_executions.device_id and the new plan is
> >>> as follows: https://explain.depesz.com/s/cQRq
> >>>
> >>> Can it be further improved?
> >>>
> >>> Limit  (cost=138511.45..138519.36 rows=2 width=54) (actual
> >>> time=598.703..618.524 rows=2 loops=1)
> >>>     Buffers: shared hit=242389 read=44098
> >>>     ->  Unique  (cost=138511.45..138527.26 rows=4 width=54) (actual
> >>> time=598.701..598.878 rows=2 loops=1)
> >>>           Buffers: shared hit=242389 read=44098
> >>>           ->  Sort  (cost=138511.45..138519.36 rows=3162 width=54)
> >>> (actual time=598.699..598.767 rows=2052 loops=1)
> >>>                 Sort Key: results.attribute_id, results."timestamp" DESC
> >>>                 Sort Method: quicksort  Memory: 641kB
> >>>                 Buffers: shared hit=242389 read=44098
> >>>                 ->  Gather  (cost=44082.11..138327.64 rows=3162
> >>> width=54) (actual time=117.548..616.456 rows=4102 loops=1)
> >>>                       Workers Planned: 2
> >>>                       Workers Launched: 2
> >>>                       Buffers: shared hit=242389 read=44098
> >>>                       ->  Nested Loop  (cost=43082.11..137011.44
> >>> rows=1318 width=54) (actual time=47.436..525.664 rows=1367 loops=3)
> >>>                             Buffers: shared hit=242389 read=44098
> >>>                             ->  Parallel Hash Join
> >>> (cost=43081.68..124545.34 rows=24085 width=4) (actual
> >>> time=33.099..469.958 rows=19756 loops=3)
> >>>                                   Hash Cond:
> >>> (scheduler_operation_executions.task_execution_id =
> >>> scheduler_task_executions.id)
> >>>                                   Buffers: shared hit=32518 read=44098
> >>>                                   ->  Parallel Seq Scan on
> >>> scheduler_operation_executions  (cost=0.00..74945.82 rows=2482982
> >>> width=8) (actual time=8.493..245.190 rows=1986887 loops=3)
> >>>                                         Buffers: shared hit=6018 read=44098
> >>>                                   ->  Parallel Hash
> >>> (cost=42881.33..42881.33 rows=16028 width=4) (actual
> >>> time=23.272..23.272 rows=13558 loops=3)
> >>>                                         Buckets: 65536  Batches: 1
> >>> Memory Usage: 2112kB
> >>>                                         Buffers: shared hit=26416
> >>>                                         ->  Parallel Bitmap Heap Scan on
> >>> scheduler_task_executions  (cost=722.55..42881.33 rows=16028 width=4)
> >>> (actual time=27.290..61.563 rows=40675 loops=1)
> >>>                                               Recheck Cond: (device_id = 97)
> >>>                                               Heap Blocks: exact=26302
> >>>                                               Buffers: shared hit=26416
> >>>                                               ->  Bitmap Index Scan on
> >>> scheduler_task_executions_device_id_idx  (cost=0.00..712.93 rows=38467
> >>> width=0) (actual time=17.087..17.087 rows=40675 loops=1)
> >>>                                                     Index Cond: (device_id = 97)
> >>>                                                     Buffers: shared hit=114
> >>>                             ->  Index Scan using
> >>> index_operation_execution_id_asc on results  (cost=0.43..0.51 rows=1
> >>> width=58) (actual time=0.003..0.003 rows=0 loops=59269)
> >>>                                   Index Cond: (operation_execution_id =
> >>> scheduler_operation_executions.id)
> >>>                                   Filter: ((data IS NOT NULL) AND (data
> >>> <> '<NullData/>'::text) AND (attribute_id = ANY ('{4,5}'::integer[]))
> >>> AND (object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
> >>>                                   Rows Removed by Filter: 0
> >>>                                   Buffers: shared hit=209871
> >>> Planning Time: 2.327 ms
> >>> Execution Time: 618.935 ms
> >>>
> >>> On Tue, Aug 20, 2019 at 5:54 PM Luís Roberto Weck
> >>> <luisroberto@siscobra.com.br> wrote:
> >>>> Em 20/08/2019 10:54, Barbu Paul - Gheorghe escreveu:
> >>>>> Hello,
> >>>>> I'm running "PostgreSQL 11.2, compiled by Visual C++ build 1914,
> >>>>> 64-bit" and I have a query that runs several times per user action
> >>>>> (9-10 times).
> >>>>> The query takes a long time to execute, specially at first, due to
> >>>>> cold caches I think, but the performance varies greatly during a run
> >>>>> of the application (while applying the said action by the user several
> >>>>> times).
> >>>>>
> >>>>> My tables are only getting bigger with time, not much DELETEs and even
> >>>>> less UPDATEs as far as I can tell.
> >>>>>
> >>>>> Problematic query:
> >>>>>
> >>>>> EXPLAIN (ANALYZE,BUFFERS)
> >>>>> SELECT DISTINCT ON (results.attribute_id) results.timestamp,
> >>>>> results.data FROM results
> >>>>>        JOIN scheduler_operation_executions ON
> >>>>> scheduler_operation_executions.id = results.operation_execution_id
> >>>>>        JOIN scheduler_task_executions ON scheduler_task_executions.id =
> >>>>> scheduler_operation_executions.task_execution_id
> >>>>> WHERE scheduler_task_executions.device_id = 97
> >>>>>        AND results.data <> '<NullData/>'
> >>>>>        AND results.data IS NOT NULL
> >>>>>        AND results.object_id = 1955
> >>>>>        AND results.attribute_id IN (4, 5) -- possibly a longer list here
> >>>>>        AND results.data_access_result = 'SUCCESS'
> >>>>> ORDER BY results.attribute_id, results.timestamp DESC
> >>>>> LIMIT 2 -- limit by the length of the attributes list
> >>>>>
> >>>>> In words: I want the latest (ORDER BY results.timestamp DESC) results
> >>>>> of a device (scheduler_task_executions.device_id = 97 - hence the
> >>>>> joins results -> scheduler_operation_executions ->
> >>>>> scheduler_task_executions)
> >>>>> for a given object and attributes with some additional constraints on
> >>>>> the data column. But I only want the latest attributes for which we
> >>>>> have results, hence the DISTINCT ON (results.attribute_id) and LIMIT.
> >>>>>
> >>>>> First run: https://explain.depesz.com/s/qh4C
> >>>>> Limit  (cost=157282.39..157290.29 rows=2 width=54) (actual
> >>>>> time=44068.166..44086.970 rows=2 loops=1)
> >>>>>      Buffers: shared hit=215928 read=85139
> >>>>>      ->  Unique  (cost=157282.39..157298.20 rows=4 width=54) (actual
> >>>>> time=44068.164..44069.301 rows=2 loops=1)
> >>>>>            Buffers: shared hit=215928 read=85139
> >>>>>            ->  Sort  (cost=157282.39..157290.29 rows=3162 width=54)
> >>>>> (actual time=44068.161..44068.464 rows=2052 loops=1)
> >>>>>                  Sort Key: results.attribute_id, results."timestamp" DESC
> >>>>>                  Sort Method: quicksort  Memory: 641kB
> >>>>>                  Buffers: shared hit=215928 read=85139
> >>>>>                  ->  Gather  (cost=62853.04..157098.57 rows=3162
> >>>>> width=54) (actual time=23518.745..44076.385 rows=4102 loops=1)
> >>>>>                        Workers Planned: 2
> >>>>>                        Workers Launched: 2
> >>>>>                        Buffers: shared hit=215928 read=85139
> >>>>>                        ->  Nested Loop  (cost=61853.04..155782.37
> >>>>> rows=1318 width=54) (actual time=23290.514..43832.223 rows=1367
> >>>>> loops=3)
> >>>>>                              Buffers: shared hit=215928 read=85139
> >>>>>                              ->  Parallel Hash Join
> >>>>> (cost=61852.61..143316.27 rows=24085 width=4) (actual
> >>>>> time=23271.275..40018.451 rows=19756 loops=3)
> >>>>>                                    Hash Cond:
> >>>>> (scheduler_operation_executions.task_execution_id =
> >>>>> scheduler_task_executions.id)
> >>>>>                                    Buffers: shared hit=6057 read=85139
> >>>>>                                    ->  Parallel Seq Scan on
> >>>>> scheduler_operation_executions  (cost=0.00..74945.82 rows=2482982
> >>>>> width=8) (actual time=7.575..15694.435 rows=1986887 loops=3)
> >>>>>                                          Buffers: shared hit=2996 read=47120
> >>>>>                                    ->  Parallel Hash
> >>>>> (cost=61652.25..61652.25 rows=16029 width=4) (actual
> >>>>> time=23253.337..23253.337 rows=13558 loops=3)
> >>>>>                                          Buckets: 65536  Batches: 1
> >>>>> Memory Usage: 2144kB
> >>>>>                                          Buffers: shared hit=2977 read=38019
> >>>>>                                          ->  Parallel Seq Scan on
> >>>>> scheduler_task_executions  (cost=0.00..61652.25 rows=16029 width=4)
> >>>>> (actual time=25.939..23222.174 rows=13558 loops=3)
> >>>>>                                                Filter: (device_id = 97)
> >>>>>                                                Rows Removed by Filter: 1308337
> >>>>>                                                Buffers: shared hit=2977 read=38019
> >>>>>                              ->  Index Scan using
> >>>>> index_operation_execution_id_asc on results  (cost=0.43..0.51 rows=1
> >>>>> width=58) (actual time=0.191..0.191 rows=0 loops=59269)
> >>>>>                                    Index Cond: (operation_execution_id =
> >>>>> scheduler_operation_executions.id)
> >>>>>                                    Filter: ((data IS NOT NULL) AND (data
> >>>>> <> '<NullData/>'::text) AND (attribute_id = ANY ('{4,5}'::integer[]))
> >>>>> AND (object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
> >>>>>                                    Rows Removed by Filter: 0
> >>>>>                                    Buffers: shared hit=209871
> >>>>> Planning Time: 29.295 ms
> >>>>> Execution Time: 44087.365 ms
> >>>>>
> >>>>>
> >>>>> Second run: https://explain.depesz.com/s/uy9f
> >>>>> Limit  (cost=157282.39..157290.29 rows=2 width=54) (actual
> >>>>> time=789.363..810.440 rows=2 loops=1)
> >>>>>      Buffers: shared hit=216312 read=84755
> >>>>>      ->  Unique  (cost=157282.39..157298.20 rows=4 width=54) (actual
> >>>>> time=789.361..789.535 rows=2 loops=1)
> >>>>>            Buffers: shared hit=216312 read=84755
> >>>>>            ->  Sort  (cost=157282.39..157290.29 rows=3162 width=54)
> >>>>> (actual time=789.361..789.418 rows=2052 loops=1)
> >>>>>                  Sort Key: results.attribute_id, results."timestamp" DESC
> >>>>>                  Sort Method: quicksort  Memory: 641kB
> >>>>>                  Buffers: shared hit=216312 read=84755
> >>>>>                  ->  Gather  (cost=62853.04..157098.57 rows=3162
> >>>>> width=54) (actual time=290.356..808.454 rows=4102 loops=1)
> >>>>>                        Workers Planned: 2
> >>>>>                        Workers Launched: 2
> >>>>>                        Buffers: shared hit=216312 read=84755
> >>>>>                        ->  Nested Loop  (cost=61853.04..155782.37
> >>>>> rows=1318 width=54) (actual time=238.313..735.472 rows=1367 loops=3)
> >>>>>                              Buffers: shared hit=216312 read=84755
> >>>>>                              ->  Parallel Hash Join
> >>>>> (cost=61852.61..143316.27 rows=24085 width=4) (actual
> >>>>> time=237.966..677.975 rows=19756 loops=3)
> >>>>>                                    Hash Cond:
> >>>>> (scheduler_operation_executions.task_execution_id =
> >>>>> scheduler_task_executions.id)
> >>>>>                                    Buffers: shared hit=6441 read=84755
> >>>>>                                    ->  Parallel Seq Scan on
> >>>>> scheduler_operation_executions  (cost=0.00..74945.82 rows=2482982
> >>>>> width=8) (actual time=0.117..234.279 rows=1986887 loops=3)
> >>>>>                                          Buffers: shared hit=3188 read=46928
> >>>>>                                    ->  Parallel Hash
> >>>>> (cost=61652.25..61652.25 rows=16029 width=4) (actual
> >>>>> time=236.631..236.631 rows=13558 loops=3)
> >>>>>                                          Buckets: 65536  Batches: 1
> >>>>> Memory Usage: 2144kB
> >>>>>                                          Buffers: shared hit=3169 read=37827
> >>>>>                                          ->  Parallel Seq Scan on
> >>>>> scheduler_task_executions  (cost=0.00..61652.25 rows=16029 width=4)
> >>>>> (actual time=0.132..232.758 rows=13558 loops=3)
> >>>>>                                                Filter: (device_id = 97)
> >>>>>                                                Rows Removed by Filter: 1308337
> >>>>>                                                Buffers: shared hit=3169 read=37827
> >>>>>                              ->  Index Scan using
> >>>>> index_operation_execution_id_asc on results  (cost=0.43..0.51 rows=1
> >>>>> width=58) (actual time=0.003..0.003 rows=0 loops=59269)
> >>>>>                                    Index Cond: (operation_execution_id =
> >>>>> scheduler_operation_executions.id)
> >>>>>                                    Filter: ((data IS NOT NULL) AND (data
> >>>>> <> '<NullData/>'::text) AND (attribute_id = ANY ('{4,5}'::integer[]))
> >>>>> AND (object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
> >>>>>                                    Rows Removed by Filter: 0
> >>>>>                                    Buffers: shared hit=209871
> >>>>> Planning Time: 1.787 ms
> >>>>> Execution Time: 810.634 ms
> >>>>>
> >>>>> You can see that the second run takes less than one second to run...
> >>>>> which is 43 seconds better than the first try, just by re-running the
> >>>>> query.
> >>>>> Other runs take maybe 1s, 3s, still a long time.
> >>>>>
> >>>>> How can I improve it to be consistently fast (is it possible to get to
> >>>>> several milliseconds?)?
> >>>>> What I don't really understand is why the nested loop has 3 loops
> >>>>> (three joined tables)?
> >>>>> And why does the first index scan indicate ~60k loops? And does it
> >>>>> really work? It doesn't seem to filter out any rows.
> >>>>>
> >>>>> Should I add an index only on (attribute_id, object_id)? And maybe
> >>>>> data_access_result?
> >>>>> Does it make sens to add it on a text column (results.data)?
> >>>>>
> >>>>> My tables:
> >>>>>
https://gist.githubusercontent.com/paulbarbu/0d36271d710349d8fb6102d9a466bb54/raw/7a6946ba7c2adec5b87ed90f343f1aff37432d21/gistfile1.txt
> >>>>>
> >>>>> As you can see from the gist the foreign keys are indexed. Other
> >>>>> indices were added to speed up other queries.
> >>>>> Other relevant information (my tables have 3+ millions of rows, not
> >>>>> very big I think?), additional info with regards to size also included
> >>>>> below.
> >>>>> This query has poor performance on two PCs (both running off of HDDs)
> >>>>> so I think it has more to do with my indices and query than Postgres
> >>>>> config & hardware, will post those if necessary.
> >>>>>
> >>>>>
> >>>>> Size info:
> >>>>> SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts,
> >>>>> relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE
> >>>>> relname IN ('results', 'scheduler_operation_executions',
> >>>>> 'scheduler_task_executions');
> >>>>> -[ RECORD 1 ]--+-------------------------------
> >>>>> relname        | results
> >>>>> relpages       | 65922
> >>>>> reltuples      | 3.17104e+06
> >>>>> relallvisible  | 65922
> >>>>> relkind        | r
> >>>>> relnatts       | 9
> >>>>> relhassubclass | f
> >>>>> reloptions     |
> >>>>> pg_table_size  | 588791808
> >>>>> -[ RECORD 2 ]--+-------------------------------
> >>>>> relname        | scheduler_operation_executions
> >>>>> relpages       | 50116
> >>>>> reltuples      | 5.95916e+06
> >>>>> relallvisible  | 50116
> >>>>> relkind        | r
> >>>>> relnatts       | 8
> >>>>> relhassubclass | f
> >>>>> reloptions     |
> >>>>> pg_table_size  | 410697728
> >>>>> -[ RECORD 3 ]--+-------------------------------
> >>>>> relname        | scheduler_task_executions
> >>>>> relpages       | 40996
> >>>>> reltuples      | 3.966e+06
> >>>>> relallvisible  | 40996
> >>>>> relkind        | r
> >>>>> relnatts       | 12
> >>>>> relhassubclass | f
> >>>>> reloptions     |
> >>>>> pg_table_size  | 335970304
> >>>>>
> >>>>> Thanks for your time!
> >>>>>
> >>>>> --
> >>>>> Barbu Paul - Gheorghe
> >>>>>
> >>>> Can you create an index on scheduler_task_executions.device_id and run
> >>>> it again?
> >> Can you try this query, please?  Although I'm not really sure it'll give
> >> you the same results.
> >>
> >>    SELECT DISTINCT ON (results.attribute_id)
> >>           results.timestamp,
> >>           results.data
> >>      FROM results
> >>     WHERE results.data <> '<NullData/>'
> >>       AND results.data IS NOT NULL
> >>       AND results.object_id = 1955
> >>       AND results.attribute_id IN (4, 5) -- possibly a longer list here
> >>       AND results.data_access_result = 'SUCCESS'
> >>       AND EXISTS (SELECT 1
> >>                     FROM scheduler_operation_executions
> >>                          JOIN scheduler_task_executions ON
> >> scheduler_task_executions.id =
> >> scheduler_operation_executions.task_execution_id
> >>                    WHERE scheduler_operation_executions.id =
> >> results.operation_execution_id
> >>                      AND scheduler_task_executions.device_id = 97)
> >>     ORDER BY results.attribute_id, results.timestamp DESC
> >>     LIMIT 2 -- limit by the length of the attributes list
> Can you provide the EXPLAIN ANALYZE plan for the query I sent you?



--

Barbu Paul - Gheorghe



В списке pgsql-performance по дате отправления:

Предыдущее
От: Luís Roberto Weck
Дата:
Сообщение: Re: Erratically behaving query needs optimization
Следующее
От: Marco Colli
Дата:
Сообщение: Extremely slow count (simple query, with index)