Partition table query performance

Поиск
Список
Период
Сортировка
От Greg Jaman
Тема Partition table query performance
Дата
Msg-id b72893ad0811252007h33cf477dx54d5468705facaaa@mail.gmail.com
обсуждение исходный текст
Ответы Re: Partition table query performance
Список pgsql-performance

I have a problem with partitioning and I'm wondering if anyone can provide some insight.   I'm trying to find the max value of a column across multiple partitions.  The query against the partition set is quite slow while queries against child partitions is very fast!


I setup a basic Range Partition table definition.
   A parent table:  Data {  dataID, sensorID, value, ts }
   child tables   Data_YYYY_WEEKNO { dataID, sensorID, value, ts}  inherited from Data
          Each child tables has a primary key index on dataID and a composite index on (sensorID, ts).
          Each child has check constraints for the week range identified in the table name (non overlapping)

I want to perform a simple operation:  select the max ts (timestamp) giving a sensorID.  Given my indexs on the table, this should be a simple and fast operation.


DB=# EXPLAIN ANALYZE  select max(ts) from "Data" where valid=true and "sensorID"=8293 ;
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=334862.92..334862.93 rows=1 width=8) (actual time=85183.381..85183.383 rows=1 loops=1)
   ->  Append  (cost=2.30..329397.68 rows=2186096 width=8) (actual time=1.263..76592.755 rows=2205408 loops=1)
         ->  Bitmap Heap Scan on "Data"  (cost=2.30..8.84 rows=3 width=8) (actual time=0.027..0.027 rows=0 loops=1)
               Recheck Cond: ("sensorID" = 8293)
               Filter: valid
               ->  Bitmap Index Scan on "def_data_sensorID_ts"  (cost=0.00..2.30 rows=6 width=0) (actual time=0.021..0.021 rows=0 loops=1)
                     Index Cond: ("sensorID" = 8293)
         ->  Index Scan using "Data_2008_01_sensorID_ts_index" on "Data_2008_01" "Data"  (cost=0.00..4.27 rows=1 width=8) (actual time=0.014..0.014 rows=0 loops=1)
               Index Cond: ("sensorID" = 8293)
               Filter: valid
         ->  Bitmap Heap Scan on "Data_2008_02" "Data"  (cost=3.01..121.08 rows=98 width=8) (actual time=0.017..0.017 rows=0 loops=1)
               Recheck Cond: ("sensorID" = 8293)
               Filter: valid
               ->  Bitmap Index Scan on "Data_2008_02_sensorID_ts_index"  (cost=0.00..2.99 rows=98 width=0) (actual time=0.011..0.011 rows=0 loops=1)
                     Index Cond: ("sensorID" = 8293)
.
.
. (omitted a list of all partitions with same as data above)
.
 Total runtime: 85188.694 ms




When I query against a specific partition:


DB=# EXPLAIN ANALYZE  select max(ts) from "Data_2008_48" where valid=true and "sensorID"=8293 ;
                                                                                   QUERY PLAN                                                                                   
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.10..0.11 rows=1 width=0) (actual time=3.830..3.832 rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..0.10 rows=1 width=8) (actual time=3.817..3.819 rows=1 loops=1)
           ->  Index Scan Backward using "Data_2008_48_sensorID_ts_index" on "Data_2008_48"  (cost=0.00..15304.55 rows=148959 width=8) (actual time=3.813..3.813 rows=1 loops=1)
                 Index Cond: ("sensorID" = 8293)
                 Filter: ((ts IS NOT NULL) AND valid)
 Total runtime: 0.225 ms


The query plan against the child partition makes sense - Uses the index to find the max value.  The query plan for the partitions uses a combination of bitmap heap scans and index scans. 
Why would the query plan choose to use a bitmap heap scan after bitmap index scan or is that the best choice?  (what is it doing?) and what can I do to speed up this query?

As a sanity check I did a union query of all partitions to find the max(ts). My manual union query executed in 13ms vs the query against the parent table that was 85,188ms!!!.  



Greg Jaman
 



В списке pgsql-performance по дате отправления:

Предыдущее
От: Mark Kirkwood
Дата:
Сообщение: Re: Monitoring buffercache...
Следующее
От: Richard Huxton
Дата:
Сообщение: Re: Increasing pattern index query speed