Re: Erratically behaving query needs optimization

Поиск
Список
Период
Сортировка
От Luís Roberto Weck
Тема Re: Erratically behaving query needs optimization
Дата
Msg-id 9736f758-d702-ee9a-2988-53762c73c3e3@siscobra.com.br
обсуждение исходный текст
Ответ на Erratically behaving query needs optimization  (Barbu Paul - Gheorghe <barbu.paul.gheorghe@gmail.com>)
Ответы Re: Erratically behaving query needs optimization  (MichaelDBA <MichaelDBA@sqlexec.com>)
Re: Erratically behaving query needs optimization  (Barbu Paul - Gheorghe <barbu.paul.gheorghe@gmail.com>)
Список pgsql-performance
Em 20/08/2019 10:54, Barbu Paul - Gheorghe escreveu:
> Hello,
> I'm running "PostgreSQL 11.2, compiled by Visual C++ build 1914,
> 64-bit" and I have a query that runs several times per user action
> (9-10 times).
> The query takes a long time to execute, specially at first, due to
> cold caches I think, but the performance varies greatly during a run
> of the application (while applying the said action by the user several
> times).
>
> My tables are only getting bigger with time, not much DELETEs and even
> less UPDATEs as far as I can tell.
>
> Problematic query:
>
> EXPLAIN (ANALYZE,BUFFERS)
> SELECT DISTINCT ON (results.attribute_id) results.timestamp,
> results.data FROM results
>      JOIN scheduler_operation_executions ON
> scheduler_operation_executions.id = results.operation_execution_id
>      JOIN scheduler_task_executions ON scheduler_task_executions.id =
> scheduler_operation_executions.task_execution_id
> WHERE scheduler_task_executions.device_id = 97
>      AND results.data <> '<NullData/>'
>      AND results.data IS NOT NULL
>      AND results.object_id = 1955
>      AND results.attribute_id IN (4, 5) -- possibly a longer list here
>      AND results.data_access_result = 'SUCCESS'
> ORDER BY results.attribute_id, results.timestamp DESC
> LIMIT 2 -- limit by the length of the attributes list
>
> In words: I want the latest (ORDER BY results.timestamp DESC) results
> of a device (scheduler_task_executions.device_id = 97 - hence the
> joins results -> scheduler_operation_executions ->
> scheduler_task_executions)
> for a given object and attributes with some additional constraints on
> the data column. But I only want the latest attributes for which we
> have results, hence the DISTINCT ON (results.attribute_id) and LIMIT.
>
> First run: https://explain.depesz.com/s/qh4C
> Limit  (cost=157282.39..157290.29 rows=2 width=54) (actual
> time=44068.166..44086.970 rows=2 loops=1)
>    Buffers: shared hit=215928 read=85139
>    ->  Unique  (cost=157282.39..157298.20 rows=4 width=54) (actual
> time=44068.164..44069.301 rows=2 loops=1)
>          Buffers: shared hit=215928 read=85139
>          ->  Sort  (cost=157282.39..157290.29 rows=3162 width=54)
> (actual time=44068.161..44068.464 rows=2052 loops=1)
>                Sort Key: results.attribute_id, results."timestamp" DESC
>                Sort Method: quicksort  Memory: 641kB
>                Buffers: shared hit=215928 read=85139
>                ->  Gather  (cost=62853.04..157098.57 rows=3162
> width=54) (actual time=23518.745..44076.385 rows=4102 loops=1)
>                      Workers Planned: 2
>                      Workers Launched: 2
>                      Buffers: shared hit=215928 read=85139
>                      ->  Nested Loop  (cost=61853.04..155782.37
> rows=1318 width=54) (actual time=23290.514..43832.223 rows=1367
> loops=3)
>                            Buffers: shared hit=215928 read=85139
>                            ->  Parallel Hash Join
> (cost=61852.61..143316.27 rows=24085 width=4) (actual
> time=23271.275..40018.451 rows=19756 loops=3)
>                                  Hash Cond:
> (scheduler_operation_executions.task_execution_id =
> scheduler_task_executions.id)
>                                  Buffers: shared hit=6057 read=85139
>                                  ->  Parallel Seq Scan on
> scheduler_operation_executions  (cost=0.00..74945.82 rows=2482982
> width=8) (actual time=7.575..15694.435 rows=1986887 loops=3)
>                                        Buffers: shared hit=2996 read=47120
>                                  ->  Parallel Hash
> (cost=61652.25..61652.25 rows=16029 width=4) (actual
> time=23253.337..23253.337 rows=13558 loops=3)
>                                        Buckets: 65536  Batches: 1
> Memory Usage: 2144kB
>                                        Buffers: shared hit=2977 read=38019
>                                        ->  Parallel Seq Scan on
> scheduler_task_executions  (cost=0.00..61652.25 rows=16029 width=4)
> (actual time=25.939..23222.174 rows=13558 loops=3)
>                                              Filter: (device_id = 97)
>                                              Rows Removed by Filter: 1308337
>                                              Buffers: shared hit=2977 read=38019
>                            ->  Index Scan using
> index_operation_execution_id_asc on results  (cost=0.43..0.51 rows=1
> width=58) (actual time=0.191..0.191 rows=0 loops=59269)
>                                  Index Cond: (operation_execution_id =
> scheduler_operation_executions.id)
>                                  Filter: ((data IS NOT NULL) AND (data
> <> '<NullData/>'::text) AND (attribute_id = ANY ('{4,5}'::integer[]))
> AND (object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
>                                  Rows Removed by Filter: 0
>                                  Buffers: shared hit=209871
> Planning Time: 29.295 ms
> Execution Time: 44087.365 ms
>
>
> Second run: https://explain.depesz.com/s/uy9f
> Limit  (cost=157282.39..157290.29 rows=2 width=54) (actual
> time=789.363..810.440 rows=2 loops=1)
>    Buffers: shared hit=216312 read=84755
>    ->  Unique  (cost=157282.39..157298.20 rows=4 width=54) (actual
> time=789.361..789.535 rows=2 loops=1)
>          Buffers: shared hit=216312 read=84755
>          ->  Sort  (cost=157282.39..157290.29 rows=3162 width=54)
> (actual time=789.361..789.418 rows=2052 loops=1)
>                Sort Key: results.attribute_id, results."timestamp" DESC
>                Sort Method: quicksort  Memory: 641kB
>                Buffers: shared hit=216312 read=84755
>                ->  Gather  (cost=62853.04..157098.57 rows=3162
> width=54) (actual time=290.356..808.454 rows=4102 loops=1)
>                      Workers Planned: 2
>                      Workers Launched: 2
>                      Buffers: shared hit=216312 read=84755
>                      ->  Nested Loop  (cost=61853.04..155782.37
> rows=1318 width=54) (actual time=238.313..735.472 rows=1367 loops=3)
>                            Buffers: shared hit=216312 read=84755
>                            ->  Parallel Hash Join
> (cost=61852.61..143316.27 rows=24085 width=4) (actual
> time=237.966..677.975 rows=19756 loops=3)
>                                  Hash Cond:
> (scheduler_operation_executions.task_execution_id =
> scheduler_task_executions.id)
>                                  Buffers: shared hit=6441 read=84755
>                                  ->  Parallel Seq Scan on
> scheduler_operation_executions  (cost=0.00..74945.82 rows=2482982
> width=8) (actual time=0.117..234.279 rows=1986887 loops=3)
>                                        Buffers: shared hit=3188 read=46928
>                                  ->  Parallel Hash
> (cost=61652.25..61652.25 rows=16029 width=4) (actual
> time=236.631..236.631 rows=13558 loops=3)
>                                        Buckets: 65536  Batches: 1
> Memory Usage: 2144kB
>                                        Buffers: shared hit=3169 read=37827
>                                        ->  Parallel Seq Scan on
> scheduler_task_executions  (cost=0.00..61652.25 rows=16029 width=4)
> (actual time=0.132..232.758 rows=13558 loops=3)
>                                              Filter: (device_id = 97)
>                                              Rows Removed by Filter: 1308337
>                                              Buffers: shared hit=3169 read=37827
>                            ->  Index Scan using
> index_operation_execution_id_asc on results  (cost=0.43..0.51 rows=1
> width=58) (actual time=0.003..0.003 rows=0 loops=59269)
>                                  Index Cond: (operation_execution_id =
> scheduler_operation_executions.id)
>                                  Filter: ((data IS NOT NULL) AND (data
> <> '<NullData/>'::text) AND (attribute_id = ANY ('{4,5}'::integer[]))
> AND (object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
>                                  Rows Removed by Filter: 0
>                                  Buffers: shared hit=209871
> Planning Time: 1.787 ms
> Execution Time: 810.634 ms
>
> You can see that the second run takes less than one second to run...
> which is 43 seconds better than the first try, just by re-running the
> query.
> Other runs take maybe 1s, 3s, still a long time.
>
> How can I improve it to be consistently fast (is it possible to get to
> several milliseconds?)?
> What I don't really understand is why the nested loop has 3 loops
> (three joined tables)?
> And why does the first index scan indicate ~60k loops? And does it
> really work? It doesn't seem to filter out any rows.
>
> Should I add an index only on (attribute_id, object_id)? And maybe
> data_access_result?
> Does it make sens to add it on a text column (results.data)?
>
> My tables:
>
https://gist.githubusercontent.com/paulbarbu/0d36271d710349d8fb6102d9a466bb54/raw/7a6946ba7c2adec5b87ed90f343f1aff37432d21/gistfile1.txt
>
> As you can see from the gist the foreign keys are indexed. Other
> indices were added to speed up other queries.
> Other relevant information (my tables have 3+ millions of rows, not
> very big I think?), additional info with regards to size also included
> below.
> This query has poor performance on two PCs (both running off of HDDs)
> so I think it has more to do with my indices and query than Postgres
> config & hardware, will post those if necessary.
>
>
> Size info:
> SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts,
> relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE
> relname IN ('results', 'scheduler_operation_executions',
> 'scheduler_task_executions');
> -[ RECORD 1 ]--+-------------------------------
> relname        | results
> relpages       | 65922
> reltuples      | 3.17104e+06
> relallvisible  | 65922
> relkind        | r
> relnatts       | 9
> relhassubclass | f
> reloptions     |
> pg_table_size  | 588791808
> -[ RECORD 2 ]--+-------------------------------
> relname        | scheduler_operation_executions
> relpages       | 50116
> reltuples      | 5.95916e+06
> relallvisible  | 50116
> relkind        | r
> relnatts       | 8
> relhassubclass | f
> reloptions     |
> pg_table_size  | 410697728
> -[ RECORD 3 ]--+-------------------------------
> relname        | scheduler_task_executions
> relpages       | 40996
> reltuples      | 3.966e+06
> relallvisible  | 40996
> relkind        | r
> relnatts       | 12
> relhassubclass | f
> reloptions     |
> pg_table_size  | 335970304
>
> Thanks for your time!
>
> --
> Barbu Paul - Gheorghe
>
Can you create an index on scheduler_task_executions.device_id and run 
it again?



В списке pgsql-performance по дате отправления:

Предыдущее
От: Barbu Paul - Gheorghe
Дата:
Сообщение: Erratically behaving query needs optimization
Следующее
От: MichaelDBA
Дата:
Сообщение: Re: Erratically behaving query needs optimization