Query takes way longer with LIMIT, and EXPLAIN takes way longer than actual query

Поиск
Список
Период
Сортировка
От Ken Tanzer
Тема Query takes way longer with LIMIT, and EXPLAIN takes way longer than actual query
Дата
Msg-id CAD3a31UJVhu-Ojsikjz12PEdvbz_ha_4U-BzPn5smKUyrECO6A@mail.gmail.com
обсуждение исходный текст
Ответы Re: Query takes way longer with LIMIT, and EXPLAIN takes way longer than actual query  (Justin Pryzby <pryzby@telsasoft.com>)
Re: Query takes way longer with LIMIT, and EXPLAIN takes way longer than actual query  (Jeff Janes <jeff.janes@gmail.com>)
Список pgsql-performance
Hi.  I've got a query that runs fine (~50ms).  When I add a "LIMIT 25" to it though, it takes way longer.  The query itself then takes about 4.5 seconds.  And when I do an explain, it takes 90+ seconds for the same query!

Explains and detailed table/view info below.  tbl_log has 1.2M records, tbl_reference has 550k.  This is 9.6.19 on CentOS 6 with PDGG packages.

I know the query itself could be re-written, but it's coming from an ORM, so I'm really focused on why the adding a limit is causing such performance degradation, and what to do about it.  Any help or insight would be appreciated. Also the discrepancy between the actual query and the explain.  Thanks!

Ken


The good query (no LIMIT):

agency=> EXPLAIN (ANALYZE,VERBOSE,BUFFERS,TIMING)  SELECT * FROM Log WHERE log_id IN (SELECT CASE WHEN to_table='client' THEN from_id WHEN from_table='client' THEN to_id END FROM reference WHERE ((from_id_field =  E'client_id'
        AND from_id =  E'34918'
        AND from_table =  E'client'
        AND to_table =  E'log'
        )
        OR  (to_id_field =  E'client_id'
        AND to_id =  E'34918'
        AND to_table =  E'client'
        AND from_table =  E'log'
        ))) ORDER BY added_at DESC;
                                                                                                                                                                                                               QUERY PLAN                                                                                                                                                            
                                                   
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------
 Sort  (cost=167065.81..168594.77 rows=611586 width=336) (actual time=43.942..46.566 rows=1432 loops=1)
   Output: tbl_log.log_id, tbl_log.log_type_code, tbl_log.subject, tbl_log.log_text, tbl_log.occurred_at, tbl_log.shift_report, tbl_log.written_by, tbl_log.added_by, tbl_log.added_at, tbl_log.changed_by, tbl_log.changed_at, tbl_log.sys_log
   Sort Key: tbl_log.added_at DESC
   Sort Method: quicksort  Memory: 999kB
   Buffers: shared hit=7026
   ->  Nested Loop  (cost=4313.36..14216.18 rows=611586 width=336) (actual time=10.837..38.177 rows=1432 loops=1)
         Output: tbl_log.log_id, tbl_log.log_type_code, tbl_log.subject, tbl_log.log_text, tbl_log.occurred_at, tbl_log.shift_report, tbl_log.written_by, tbl_log.added_by, tbl_log.added_at, tbl_log.changed_by, tbl_log.changed_at, tbl_log.sys_log
         Buffers: shared hit=7026
         ->  HashAggregate  (cost=4312.93..4325.68 rows=1275 width=136) (actual time=10.802..13.800 rows=1433 loops=1)
               Output: tbl_reference.to_table, tbl_reference.from_id, tbl_reference.from_table, tbl_reference.to_id
               Group Key: CASE WHEN (tbl_reference.to_table = 'client'::name) THEN tbl_reference.from_id WHEN (tbl_reference.from_table = 'client'::name) THEN tbl_reference.to_id ELSE NULL::integer END
               Buffers: shared hit=1288
               ->  Bitmap Heap Scan on public.tbl_reference  (cost=46.69..4309.74 rows=1276 width=136) (actual time=0.747..6.822 rows=1433 loops=1)
                     Output: tbl_reference.to_table, tbl_reference.from_id, tbl_reference.from_table, tbl_reference.to_id, CASE WHEN (tbl_reference.to_table = 'client'::name) THEN tbl_reference.from_id WHEN (tbl_reference.from_table = 'client'::name) THEN tbl_reference.to_id ELSE NULL::integer END
                     Recheck Cond: ((tbl_reference.from_id_field = 'client_id'::name) OR ((tbl_reference.to_table = 'client'::name) AND (tbl_reference.to_id = 34918)))
                     Filter: ((NOT tbl_reference.is_deleted) AND (((tbl_reference.from_id_field = 'client_id'::name) AND (tbl_reference.from_id = 34918) AND (tbl_reference.from_table = 'client'::name) AND (tbl_reference.to_table = 'log'::name)) OR ((tbl_reference.to_id_field = 'client_id'::name) AND (tbl_reference.to_id = 34918) AND (tbl_reference.to_table = 'client'::na
me) AND (tbl_reference.from_table = 'log'::name))))
                     Rows Removed by Filter: 15
                     Heap Blocks: exact=1275
                     Buffers: shared hit=1288
                     ->  BitmapOr  (cost=46.69..46.69 rows=1319 width=0) (actual time=0.453..0.454 rows=0 loops=1)
                           Buffers: shared hit=13
                           ->  Bitmap Index Scan on index_tbl_reference_from_id_field  (cost=0.00..4.43 rows=1 width=0) (actual time=0.025..0.026 rows=0 loops=1)
                                 Index Cond: (tbl_reference.from_id_field = 'client_id'::name)
                                 Buffers: shared hit=3
                           ->  Bitmap Index Scan on index_tbl_reference_to_table_id  (cost=0.00..41.61 rows=1319 width=0) (actual time=0.421..0.423 rows=1448 loops=1)
                                 Index Cond: ((tbl_reference.to_table = 'client'::name) AND (tbl_reference.to_id = 34918))
                                 Buffers: shared hit=10
         ->  Index Scan using tbl_log_pkey on public.tbl_log  (cost=0.43..7.75 rows=1 width=336) (actual time=0.007..0.009 rows=1 loops=1433)
               Output: tbl_log.log_id, tbl_log.log_type_code, tbl_log.subject, tbl_log.log_text, tbl_log.occurred_at, tbl_log.written_by, tbl_log.added_by, tbl_log.added_at, tbl_log.changed_by, tbl_log.changed_at, tbl_log.sys_log, tbl_log.shift_report
               Index Cond: (tbl_log.log_id = CASE WHEN (tbl_reference.to_table = 'client'::name) THEN tbl_reference.from_id WHEN (tbl_reference.from_table = 'client'::name) THEN tbl_reference.to_id ELSE NULL::integer END)
               Buffers: shared hit=5738
 Planning time: 0.866 ms
 Execution time: 48.915 ms
(33 rows)


The bad query (LIMIT):

agency=> EXPLAIN (ANALYZE,VERBOSE,BUFFERS,TIMING)  SELECT * FROM Log WHERE log_id IN (SELECT CASE WHEN to_table='client' THEN from_id WHEN from_table='client' THEN to_id END FROM reference WHERE ((from_id_field =  E'client_id'
        AND from_id =  E'34918'
        AND from_table =  E'client'
        AND to_table =  E'log'
        )
        OR  (to_id_field =  E'client_id'
        AND to_id =  E'34918'
        AND to_table =  E'client'
        AND from_table =  E'log'
        ))) ORDER BY added_at DESC LIMIT 25;
                                                                                                                                                                                                               QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------
 Limit  (cost=47.11..1329.32 rows=25 width=336) (actual time=47.103..97236.235 rows=25 loops=1)
   Output: tbl_log.log_id, tbl_log.log_type_code, tbl_log.subject, tbl_log.log_text, tbl_log.occurred_at, tbl_log.shift_report, tbl_log.written_by, tbl_log.added_by, tbl_log.added_at, tbl_log.changed_by, tbl_log.changed_at, tbl_log.sys_log
   Buffers: shared hit=3820
   ->  Nested Loop Semi Join  (cost=47.11..31367302.81 rows=611586 width=336) (actual time=47.098..97236.123 rows=25 loops=1)
         Output: tbl_log.log_id, tbl_log.log_type_code, tbl_log.subject, tbl_log.log_text, tbl_log.occurred_at, tbl_log.shift_report, tbl_log.written_by, tbl_log.added_by, tbl_log.added_at, tbl_log.changed_by, tbl_log.changed_at, tbl_log.sys_log
         Join Filter: (tbl_log.log_id = CASE WHEN (tbl_reference.to_table = 'client'::name) THEN tbl_reference.from_id WHEN (tbl_reference.from_table = 'client'::name) THEN tbl_reference.to_id ELSE NULL::integer END)
         Rows Removed by Join Filter: 28364477
         Buffers: shared hit=3820
         ->  Index Scan Backward using tbl_log_added_at on public.tbl_log  (cost=0.43..147665.96 rows=1223171 width=336) (actual time=0.016..123.097 rows=19794 loops=1)
               Output: tbl_log.log_id, tbl_log.log_type_code, tbl_log.subject, tbl_log.log_text, tbl_log.occurred_at, tbl_log.written_by, tbl_log.added_by, tbl_log.added_at, tbl_log.changed_by, tbl_log.changed_at, tbl_log.sys_log, tbl_log.shift_report
               Buffers: shared hit=2532
         ->  Materialize  (cost=46.69..4316.12 rows=1276 width=136) (actual time=0.002..2.351 rows=1433 loops=19794)
               Output: tbl_reference.to_table, tbl_reference.from_id, tbl_reference.from_table, tbl_reference.to_id
               Buffers: shared hit=1288
               ->  Bitmap Heap Scan on public.tbl_reference  (cost=46.69..4309.74 rows=1276 width=136) (actual time=0.508..5.594 rows=1433 loops=1)
                     Output: tbl_reference.to_table, tbl_reference.from_id, tbl_reference.from_table, tbl_reference.to_id
                     Recheck Cond: ((tbl_reference.from_id_field = 'client_id'::name) OR ((tbl_reference.to_table = 'client'::name) AND (tbl_reference.to_id = 34918)))
                     Filter: ((NOT tbl_reference.is_deleted) AND (((tbl_reference.from_id_field = 'client_id'::name) AND (tbl_reference.from_id = 34918) AND (tbl_reference.from_table = 'client'::name) AND (tbl_reference.to_table = 'log'::name)) OR ((tbl_reference.to_id_field = 'client_id'::name) AND (tbl_reference.to_id = 34918) AND (tbl_reference.to_table = 'client'::na
me) AND (tbl_reference.from_table = 'log'::name))))
                     Rows Removed by Filter: 15
                     Heap Blocks: exact=1275
                     Buffers: shared hit=1288
                     ->  BitmapOr  (cost=46.69..46.69 rows=1319 width=0) (actual time=0.313..0.315 rows=0 loops=1)
                           Buffers: shared hit=13
                           ->  Bitmap Index Scan on index_tbl_reference_from_id_field  (cost=0.00..4.43 rows=1 width=0) (actual time=0.011..0.013 rows=0 loops=1)
                                 Index Cond: (tbl_reference.from_id_field = 'client_id'::name)
                                 Buffers: shared hit=3
                           ->  Bitmap Index Scan on index_tbl_reference_to_table_id  (cost=0.00..41.61 rows=1319 width=0) (actual time=0.296..0.298 rows=1448 loops=1)
                                 Index Cond: ((tbl_reference.to_table = 'client'::name) AND (tbl_reference.to_id = 34918))
                                 Buffers: shared hit=10
 Planning time: 0.650 ms
 Execution time: 97236.582 ms
(31 rows)

Time: 97238.387 ms


The bad query, as actual query, not explain:

agency=> SELECT * FROM Log WHERE log_id IN (SELECT CASE WHEN to_table='client' THEN from_id WHEN from_table='client' THEN to_id END FROM reference WHERE ((from_id_field =  E'client_id'
        AND from_id =  E'34918'
        AND from_table =  E'client'
        AND to_table =  E'log'
        )
        OR  (to_id_field =  E'client_id'
        AND to_id =  E'34918'
        AND to_table =  E'client'
        AND from_table =  E'log'
        ))) ORDER BY added_at DESC LIMIT 25;
 log_id  | log_type_code |                                     subject                                     |
                                                                                                                                                                                                   log_text

           |     occurred_at     | shift_report | written_by | added_by |      added_at       | changed_by |     changed_at      | sys_log
---------+---------------+---------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------+---------------------+--------------+------------+----------+---------------------+------------+---------------------+---------
(actual results snipped)

Time: 4654.190 ms


Description of tables and views:

agency=> \d+ log
                                  View "public.log"
    Column     |              Type              | Modifiers | Storage  | Description
---------------+--------------------------------+-----------+----------+-------------
 log_id        | integer                        |           | plain    |
 log_type_code | character varying(10)[]        |           | extended |
 subject       | character varying(80)          |           | extended |
 log_text      | text                           |           | extended |
 occurred_at   | timestamp(0) without time zone |           | plain    |
 shift_report  | boolean                        |           | plain    |
 written_by    | integer                        |           | plain    |
 added_by      | integer                        |           | plain    |
 added_at      | timestamp(0) without time zone |           | plain    |
 changed_by    | integer                        |           | plain    |
 changed_at    | timestamp(0) without time zone |           | plain    |
 sys_log       | text                           |           | extended |
View definition:
 SELECT tbl_log.log_id,
    tbl_log.log_type_code,
    tbl_log.subject,
    tbl_log.log_text,
    tbl_log.occurred_at,
    tbl_log.shift_report,
    tbl_log.written_by,
    tbl_log.added_by,
    tbl_log.added_at,
    tbl_log.changed_by,
    tbl_log.changed_at,
    tbl_log.sys_log
   FROM tbl_log;

agency=> \d tbl_log
                                          Table "public.tbl_log"
    Column     |              Type              |                        Modifiers                        
---------------+--------------------------------+----------------------------------------------------------
 log_id        | integer                        | not null default nextval('tbl_log_log_id_seq'::regclass)
 log_type_code | character varying(10)[]        | not null
 subject       | character varying(80)          | not null
 log_text      | text                           |
 occurred_at   | timestamp(0) without time zone |
 written_by    | integer                        | not null
 added_by      | integer                        | not null
 added_at      | timestamp(0) without time zone | not null default now()
 changed_by    | integer                        | not null
 changed_at    | timestamp(0) without time zone | not null default now()
 sys_log       | text                           |
 shift_report  | boolean                        | default false
Indexes:
    "tbl_log_pkey" PRIMARY KEY, btree (log_id)
    "index_tbl_log_log_type_code" btree (log_type_code)
    "tbl_log_added_at" btree (added_at)
    "tbl_log_added_by" btree (added_by)
    "tbl_log_added_by2" btree (added_at DESC)
    "tbl_log_event_time" btree ((COALESCE(occurred_at, added_at)))
    "tbl_log_log_type_code" btree (log_type_code)
    "tbl_log_log_type_code_gin" gin (log_type_code)
    "tbl_log_occurred_at" btree (occurred_at)
    "tbl_log_subject" btree (subject)
    "tbl_log_test" btree (added_at, log_type_code)
    "tbl_log_test2" btree (log_type_code, added_at)
    "tbl_log_written_by" btree (written_by)
Foreign-key constraints:
    "tbl_log_added_by_fkey" FOREIGN KEY (added_by) REFERENCES tbl_staff(staff_id)
    "tbl_log_changed_by_fkey" FOREIGN KEY (changed_by) REFERENCES tbl_staff(staff_id)
    "tbl_log_written_by_fkey" FOREIGN KEY (written_by) REFERENCES tbl_staff(staff_id)
Triggers:
    tbl_log_changed_at_update BEFORE UPDATE ON tbl_log FOR EACH ROW EXECUTE PROCEDURE auto_changed_at_update()
    tbl_log_log_chg AFTER INSERT OR DELETE OR UPDATE ON tbl_log FOR EACH ROW EXECUTE PROCEDURE table_log()

agency=> \d+ reference
                                View "public.reference"
     Column      |              Type              | Modifiers | Storage  | Description
-----------------+--------------------------------+-----------+----------+-------------
 reference_id    | integer                        |           | plain    |
 from_table      | name                           |           | plain    |
 from_id_field   | name                           |           | plain    |
 from_id         | integer                        |           | plain    |
 to_table        | name                           |           | plain    |
 to_id_field     | name                           |           | plain    |
 to_id           | integer                        |           | plain    |
 added_at        | timestamp(0) without time zone |           | plain    |
 added_by        | integer                        |           | plain    |
 changed_at      | timestamp(0) without time zone |           | plain    |
 changed_by      | integer                        |           | plain    |
 is_deleted      | boolean                        |           | plain    |
 deleted_at      | timestamp(0) without time zone |           | plain    |
 deleted_by      | integer                        |           | plain    |
 deleted_comment | text                           |           | extended |
 sys_log         | text                           |           | extended |
View definition:
 SELECT tbl_reference.reference_id,
    tbl_reference.from_table,
    tbl_reference.from_id_field,
    tbl_reference.from_id,
    tbl_reference.to_table,
    tbl_reference.to_id_field,
    tbl_reference.to_id,
    tbl_reference.added_at,
    tbl_reference.added_by,
    tbl_reference.changed_at,
    tbl_reference.changed_by,
    tbl_reference.is_deleted,
    tbl_reference.deleted_at,
    tbl_reference.deleted_by,
    tbl_reference.deleted_comment,
    tbl_reference.sys_log
   FROM tbl_reference
  WHERE NOT tbl_reference.is_deleted;

agency=> \d+ tbl_reference
                                                                  Table "public.tbl_reference"
     Column      |              Type              |                              Modifiers                               | Storage  | Stats target | Description
-----------------+--------------------------------+----------------------------------------------------------------------+----------+--------------+-------------
 reference_id    | integer                        | not null default nextval('tbl_reference_reference_id_seq'::regclass) | plain    |              |
 from_table      | name                           | not null                                                             | plain    |              |
 from_id_field   | name                           | not null                                                             | plain    |              |
 from_id         | integer                        | not null                                                             | plain    |              |
 to_table        | name                           | not null                                                             | plain    |              |
 to_id_field     | name                           | not null                                                             | plain    |              |
 to_id           | integer                        | not null                                                             | plain    |              |
 added_at        | timestamp(0) without time zone | not null default now()                                               | plain    |              |
 added_by        | integer                        | not null                                                             | plain    |              |
 changed_at      | timestamp(0) without time zone | not null default now()                                               | plain    |              |
 changed_by      | integer                        | not null                                                             | plain    |              |
 is_deleted      | boolean                        | not null default false                                               | plain    |              |
 deleted_at      | timestamp(0) without time zone |                                                                      | plain    |              |
 deleted_by      | integer                        |                                                                      | plain    |              |
 deleted_comment | text                           |                                                                      | extended |              |
 sys_log         | text                           |                                                                      | extended |              |
Indexes:
    "tbl_reference_pkey" PRIMARY KEY, btree (reference_id)
    "unique_index_tbl_reference" UNIQUE, btree (from_table, from_id_field, from_id, to_table, to_id_field, to_id)
    "index_tbl_reference_from_id" btree (from_id)
    "index_tbl_reference_from_id_field" btree (from_id_field)
    "index_tbl_reference_from_table" btree (from_table)
    "index_tbl_reference_is_deleted" btree (is_deleted)
    "index_tbl_reference_to_id" btree (to_id)
    "index_tbl_reference_to_id_field" btree (to_id_field)
    "index_tbl_reference_to_table" btree (to_table)
    "index_tbl_reference_to_table_id" btree (to_table, to_id)
Foreign-key constraints:
    "tbl_reference_added_by_fkey" FOREIGN KEY (added_by) REFERENCES tbl_staff(staff_id)
    "tbl_reference_changed_by_fkey" FOREIGN KEY (changed_by) REFERENCES tbl_staff(staff_id)
    "tbl_reference_deleted_by_fkey" FOREIGN KEY (deleted_by) REFERENCES tbl_staff(staff_id)
Triggers:
    tbl_reference_alert_notify AFTER INSERT OR DELETE OR UPDATE ON tbl_reference FOR EACH ROW EXECUTE PROCEDURE table_alert_notify()
    tbl_reference_changed_at_update BEFORE UPDATE ON tbl_reference FOR EACH ROW EXECUTE PROCEDURE auto_changed_at_update()
    tbl_reference_log_chg AFTER INSERT OR DELETE OR UPDATE ON tbl_reference FOR EACH ROW EXECUTE PROCEDURE table_log()




--
AGENCY Software  
A Free Software data system
By and for non-profits
(253) 245-3801

learn more about AGENCY or
follow the discussion.

В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Hstore index for full text search
Следующее
От: Justin Pryzby
Дата:
Сообщение: Re: Query takes way longer with LIMIT, and EXPLAIN takes way longer than actual query