understanding why two nearly identical queries take two different planner routes, one 5s and one 2hr
| От | john gale |
|---|---|
| Тема | understanding why two nearly identical queries take two different planner routes, one 5s and one 2hr |
| Дата | |
| Msg-id | B3F6AE48-E10E-4A91-A450-241B12B6312A@smadness.com обсуждение исходный текст |
| Ответы |
Re: understanding why two nearly identical queries take
two different planner routes, one 5s and one 2hr
Re: understanding why two nearly identical queries take two different planner routes, one 5s and one 2hr |
| Список | pgsql-general |
I would be appreciative if somebody could help explain why we have two nearly identical queries taking two different
plannerroutes; one a nested index loop that takes about 5s to complete, and the other a hash join & heap scan that
takesabout 2hr. This is using Postgres 9.3.3 on OS X 10.9.4.
These two queries are different only in the value of an hstore entry and a date entry:
SELECT *, testruns.id FROM testruns JOIN test_types ON testruns.test_type_id=test_types.id WHERE
testruns.custom_data->'SpawnID'= 'SpawnID-428842195.338828' AND testruns.started_at > '2014-08-03 10:49:55.338828';
SELECT *, testruns.id FROM testruns JOIN test_types ON testruns.test_type_id=test_types.id WHERE
testruns.custom_data->'SpawnID'= 'SpawnID-428870395.258592' AND testruns.started_at > '2014-08-03 18:39:55.258592';
We have built an index for the hstore entry for SpawnID and most queries use it correctly.
Column | Type | Modifiers
------------------+-----------------------------+-------------------------------------------------------
id | integer | not null default nextval('testruns_id_seq'::regclass)
started_at | timestamp without time zone | not null
...
Indexes:
...
"index_testruns_on_custom_spawnid" btree ((custom_data -> 'SpawnID'::text))
However after realizing that these two queries differed in execution time so drastically, an explain showed that they
aretaking two very different paths:
db=# explain SELECT *, testruns.id FROM testruns JOIN test_types ON testruns.test_type_id=test_types.id WHERE
testruns.custom_data->'SpawnID'= 'SpawnID-428842195.338828' AND testruns.started_at > '2014-08-03 10:49:55.338828';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.85..2696.12 rows=88 width=1466)
-> Index Scan using index_testruns_on_custom_spawnid on testruns (cost=0.57..2378.40 rows=88 width=1399)
Index Cond: ((custom_data -> 'SpawnID'::text) = 'SpawnID-428842195.338828'::text)
Filter: (started_at > '2014-08-03 10:49:55.338828'::timestamp without time zone)
-> Index Scan using test_types_pkey on test_types (cost=0.29..3.60 rows=1 width=67)
Index Cond: (id = testruns.test_type_id)
(6 rows)
Time: 22.363 ms
db=# explain SELECT *, testruns.id FROM testruns JOIN test_types ON testruns.test_type_id=test_types.id WHERE
testruns.custom_data->'SpawnID'= 'SpawnID-428870395.258592' AND testruns.started_at > '2014-08-03 18:39:55.258592';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=292249.24..348608.93 rows=28273 width=1466)
Hash Cond: (testruns.test_type_id = test_types.id)
-> Bitmap Heap Scan on testruns (cost=291510.31..347269.21 rows=28273 width=1399)
Recheck Cond: (((custom_data -> 'SpawnID'::text) = 'SpawnID-428870395.258592'::text) AND (started_at >
'2014-08-0318:39:55.258592'::timestamp without time zone))
-> BitmapAnd (cost=291510.31..291510.31 rows=28273 width=0)
-> Bitmap Index Scan on index_testruns_on_custom_spawnid (cost=0.00..41383.84 rows=500170 width=0)
Index Cond: ((custom_data -> 'SpawnID'::text) = 'SpawnID-428870395.258592'::text)
-> Bitmap Index Scan on index_testruns_on_started_at_2 (cost=0.00..250112.08 rows=3188736 width=0)
Index Cond: (started_at > '2014-08-03 18:39:55.258592'::timestamp without time zone)
-> Hash (cost=563.41..563.41 rows=14041 width=67)
-> Seq Scan on test_types (cost=0.00..563.41 rows=14041 width=67)
(11 rows)
Time: 1.231 ms
Mainly, I don't understand why two queries constructed the same but with subtle value differences would take two
differentpaths through the analyzer. We haven't fiddled with the default planner options, aside from random_page_cost
andeffective_cache_size to match our hardware. Re-analyzing the table does not seem to change the cost. The index
sizeis somewhat large (16G for a 114G table), and I haven't tried rebuilding it yet.
thanks,
~ john
В списке pgsql-general по дате отправления: