Execution plan does not use index

Поиск
Список
Период
Сортировка
От Peter Coppens
Тема Execution plan does not use index
Дата
Msg-id CAEkf4WBnCDcqQXfw4z-TpTYucBE=AEHEQjLd3QvOtMrPrrK6fA@mail.gmail.com
обсуждение исходный текст
Ответы Re: Execution plan does not use index  (Michael Lewis <mlewis@entrata.com>)
Список pgsql-general
Hello,

Consider the following PostgreSQL 9.6.18 tables  
- measurement_value: time series table  with a unique key on (device_id,timestamp) columns and a number of columns with measurements. Table contains a large number of rows (>150million)
- device table: with device properties (short_id joins to device_id in measurement_value table)
- device_configuration table: extra device properties, including a timezone

This query works fine and quick (returns ±320K rows in 3.2 seconds when explain is removed)

explain
  select d.short_id,mv.timestamp,mv.I64_01
  from device d, device_configuration dc, measurement_value mv, pg_timezone_names tzn
  where mv.device_id=d.short_id and dc.device_id = d.id and dc.latest=true and dc.timezone=tzn.name and
        mv.timestamp > '2020-11-06'::timestamp and mv.timestamp < '2020-11-07'::timestamp


==>
Hash Join  (cost=23.63..2156655.51 rows=1454320 width=20)
  Hash Cond: (dc.timezone = pg_timezone_names.name)
  ->  Nested Loop  (cost=1.13..2089933.38 rows=409070 width=34)
        ->  Nested Loop  (cost=0.56..7497.34 rows=615 width=18)
              ->  Index Scan using device_short_id_key on device d  (cost=0.28..2423.90 rows=683 width=20)
              ->  Index Scan using device_configuration_device_latest_idx on device_configuration dc  (cost=0.28..7.42 rows=1 width=30)
                    Index Cond: ((device_id = d.id) AND (latest = true))
                    Filter: latest
        ->  Index Scan using measurement_values_pkey on measurement_value mv  (cost=0.57..3375.60 rows=1047 width=20)
              Index Cond: ((device_id = d.short_id) AND ("timestamp" > '2020-11-06 00:00:00'::timestamp without time zone) AND ("timestamp" < '2020-11-07
00:00:00'::timestamp without time zone))
  ->  Hash  (cost=10.00..10.00 rows=1000 width=32)
        ->  Function Scan on pg_timezone_names  (cost=0.00..10.00 rows=1000 width=32)



When adding the timezone offset to the where clause, the query becomes extremely slow due to a full table scan of the measurement_value table (±320K rows in 11 minutes)

(See  ->  Seq Scan on measurement_value mv  (cost=0.00..6308233.77 rows=169617977 width=20))

explain 
  select d.short_id,mv.timestamp,mv.I64_01
  from device d, device_configuration dc, measurement_value mv, pg_timezone_names tzn
  where mv.device_id=d.short_id and dc.device_id = d.id and dc.latest=true and dc.timezone=tzn.name and
        mv.timestamp > '2020-11-06'::timestamp - tzn.utc_offset and mv.timestamp < '2020-11-07'::timestamp - tzn.utc_offset


==>
Hash Join  (cost=683.93..18226567.32 rows=60331762 width=20)
  Hash Cond: (mv.device_id = d.short_id)
  Join Filter: ((mv."timestamp" > ('2020-11-06 00:00:00'::timestamp without time zone - pg_timezone_names.utc_offset)) AND (mv."timestamp" < ('2020-11-07 00:00:00'::timestamp without time zone - pg_timezone_names.utc_offset)))
  ->  Seq Scan on measurement_value mv  (cost=0.00..6308233.77 rows=169617977 width=20)
  ->  Hash  (cost=656.61..656.61 rows=2186 width=20)
        ->  Hash Join  (cost=77.87..656.61 rows=2186 width=20)
              Hash Cond: (dc.timezone = pg_timezone_names.name)
              ->  Hash Join  (cost=55.37..533.83 rows=615 width=18)
                    Hash Cond: (dc.device_id = d.id)
                    ->  Seq Scan on device_configuration dc  (cost=0.00..470.01 rows=615 width=30)
                          Filter: latest
                    ->  Hash  (cost=46.83..46.83 rows=683 width=20)
                          ->  Seq Scan on device d  (cost=0.00..46.83 rows=683 width=20)
              ->  Hash  (cost=10.00..10.00 rows=1000 width=48)
                    ->  Function Scan on pg_timezone_names  (cost=0.00..10.00 rows=1000 width=48)



Adding the tzn.utc_offset results in the fact that the execution plan no longer considers to use the index on the measurement_value table. Is there any way the SQL can be rewritten so that the index is used? Or any other solution so that the query with the timezone offset returns in a comparable time?

Many thanks!

Peter


В списке pgsql-general по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: New "function tables" in V13 documentation
Следующее
От: Adrian Klaver
Дата:
Сообщение: Re: Foreign Data Wrapper Handler