Обсуждение: Very slow query in PostgreSQL 9.3.3

От:
Дата:

<span style="font-family:Verdana; color:#000000; font-size:10pt;"><div>PostgreSQL 9.3.3 RHEL 6.4<br /><br />Total db
Servermemory 64GB<br /><br /><br /># -----------------------------<br /># PostgreSQL configuration file<br />#
-----------------------------<br/>max_connections = 100<br />shared_buffers = 16GB<br />work_mem =
32MB                        <br />maintenance_work_mem = 1GB<br />seq_page_cost = 1.0                   <br
/>random_page_cost= 2.0                <br />cpu_tuple_cost = 0.03                  <br />#cpu_index_tuple_cost =
0.005          <br />#cpu_operator_cost = 0.0025             <br />effective_cache_size = 48MB<br
/>default_statistics_target= 100       <br />constraint_exclusion = partition <br /><br />Partition table Setup<br
/>---------------------<br/><br />CREATE TABLE measurement (<br />    id              bigint not null,<br />   
city_id        bigint not null,<br />    logdate         date not null,<br />    peaktemp        bigint,<br />   
unitsales      bigint,<br />    type            bigint,<br />    uuid            uuid,<br />    geom           
geometry<br/>);<br /><br /><br />CREATE TABLE measurement_y2006m02 (<br />    CHECK ( logdate >= DATE '2006-02-01'
ANDlogdate < DATE '2006-03-01' )<br />) INHERITS (measurement);<br />CREATE TABLE measurement_y2006m03 (<br />   
CHECK( logdate >= DATE '2006-03-01' AND logdate < DATE '2006-04-01' )<br />) INHERITS (measurement);<br />...<br
/>CREATETABLE measurement_y2007m11 (<br />    CHECK ( logdate >= DATE '2007-11-01' AND logdate < DATE
'2007-12-01')<br />) INHERITS (measurement);<br />CREATE TABLE measurement_y2007m12 (<br />    CHECK ( logdate >=
DATE'2007-12-01' AND logdate < DATE '2008-01-01' )<br />) INHERITS (measurement);<br />CREATE TABLE
measurement_y2008m01(<br />    CHECK ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )<br />)
INHERITS(measurement);<br /><br />Partition measurement_y2007m12 contains 38,261,732 rows<br /><br />Indexes on
partitionmeasurement_y2007m12:<br />    "pkey_measurement_y2007m12" PRIMARY KEY, btree (id), tablespace
"measurement_y2007"<br/>    "idx_measurement_uuid_y2003m12" btree (uuid), tablespace "measurement_y2007"<br />   
"idx_measurement_type_y2003m12"btree (type), tablespace "measurement_y2007"<br />    "idx_measurement_city_y2003m12"
btree(city_id), tablespace "measurement_y2007"<br />    "idx_measurement_logdate_y2003m12" btree (logdate), tablespace
"measurement_y2007"<br/>    "sidx_measurement_geom_y2003m12" gist (geom), tablespace "measurement_y2007"<br /><br
/><b>***Problem Query *** </b><br /><br />explain (analyze on, buffers on) Select * from measurement this_ <br
/>                                 where this_.logdate between '2007-12-19 23:38:41.22'::timestamp and '2007-12-20
08:01:04.22'::timestamp<br/>                                    and this_.city_id=25183 order by this_.logdate asc,
this_.peaktempasc, this_.unitsales asc limit 10000;<br /><br
/>                                                                                             QUERY
PLAN                                                                                              <br
/>-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------<br
/> Limit (cost=33849.98..33855.15 rows=2068 width=618) (actual time=51710.803..51714.266 rows=10000 loops=1)<br />  
Buffers:shared hit=25614 read=39417<br />   ->  Sort  (cost=33849.98..33855.15 rows=2068 width=618) (actual
time=51710.799..51712.924rows=10000 loops=1)<br />         Sort Key: this_.logdate, this_.unitsales<br />         Sort
Method:top-N heapsort  Memory: 15938kB<br />         Buffers: shared hit=25614 read=39417<br />         ->  Append 
(cost=0.00..33736.09rows=2068 width=618) (actual time=50.210..50793.589 rows=312046 loops=1)<br />              
Buffers:shared hit=25608 read=39417<br />               ->  Seq Scan on measurement this_  (cost=0.00..0.00 rows=1
width=840)(actual time=0.002..0.002 rows=0 loops=1)<br />                     Filter: ((logdate >= '2007-12-19
23:38:41.22'::timestampwithout time zone) AND (logdate <= '2007-12-20 08:01:04.22'::timestamp without time zone) AND
(city_id= 25183))<br />               ->  Index Scan using idx_measurement_city_y2007m12 on measurement_y2007m12
this__1 (cost=0.56..33736.09 rows=2067 width=618) (actual time=50.206..50731.637 rows=312046 loops=1)<br
/>                    Index Cond: (city_id = 25183)<br />                     Filter:  ((logdate >= '2007-12-19
23:38:41.22'::timestampwithout time zone) AND (logdate <= '2007-12-20 08:01:04.22'::timestamp without time zone))<br
/>                    Buffers: shared hit=25608 read=39417<br /><br /> Total runtime: <b>51717.639 ms</b>   <--- ***
unacceptable***<br /><br />(15 rows)  <br /><br />Total Rows meeting query criteria<br
/>---------------------------------<br/><br />Select count(*) from measurement this_ where this_.logdate between
'2007-12-1923:38:41.22'::timestamp and '2007-12-20 08:01:04.22'::timestamp and this_.city_id=25183;<br /><br />count<br
/>------<br/>312046</div><div><br />Total Rows in the partition table referenced<br
/>------------------------------------------<br/><br />Select count(*) from measurement_y2007m12;<br /><br />  count<br
/>---------<br/>38261732</div><div><br /></div><div><b>Does anyone know how to speed up this query? I removed the order
byclause and that significantly reduced the run time to approx. 2000-3000 ms. This query is being recorded repeatedly
<br/>in our logs and executes very slowly for our UI users from 12000 ms thru 68000 ms<br /><br />Any suggestions would
beappreciated.</b><br /><br />thanks<br mce_bogus="1" /></div></span> 
От:
Pavel Stehule
Дата:




2014-03-13 20:26 GMT+01:00 <>:
PostgreSQL 9.3.3 RHEL 6.4

Total db Server memory 64GB


# -----------------------------
# PostgreSQL configuration file
# -----------------------------
max_connections = 100
shared_buffers = 16GB
work_mem = 32MB                        
maintenance_work_mem = 1GB
seq_page_cost = 1.0                  
random_page_cost = 2.0               
cpu_tuple_cost = 0.03                 
#cpu_index_tuple_cost = 0.005          
#cpu_operator_cost = 0.0025            
effective_cache_size = 48MB
default_statistics_target = 100      
constraint_exclusion = partition

Partition table Setup
---------------------

CREATE TABLE measurement (
    id              bigint not null,
    city_id         bigint not null,
    logdate         date not null,
    peaktemp        bigint,
    unitsales       bigint,
    type            bigint,
    uuid            uuid,
    geom            geometry
);


CREATE TABLE measurement_y2006m02 (
    CHECK ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )
) INHERITS (measurement);
CREATE TABLE measurement_y2006m03 (
    CHECK ( logdate >= DATE '2006-03-01' AND logdate < DATE '2006-04-01' )
) INHERITS (measurement);
...
CREATE TABLE measurement_y2007m11 (
    CHECK ( logdate >= DATE '2007-11-01' AND logdate < DATE '2007-12-01' )
) INHERITS (measurement);
CREATE TABLE measurement_y2007m12 (
    CHECK ( logdate >= DATE '2007-12-01' AND logdate < DATE '2008-01-01' )
) INHERITS (measurement);
CREATE TABLE measurement_y2008m01 (
    CHECK ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )
) INHERITS (measurement);

Partition measurement_y2007m12 contains 38,261,732 rows

Indexes on partition measurement_y2007m12:
    "pkey_measurement_y2007m12" PRIMARY KEY, btree (id), tablespace "measurement_y2007"
    "idx_measurement_uuid_y2003m12" btree (uuid), tablespace "measurement_y2007"
    "idx_measurement_type_y2003m12" btree (type), tablespace "measurement_y2007"
    "idx_measurement_city_y2003m12" btree (city_id), tablespace "measurement_y2007"
    "idx_measurement_logdate_y2003m12" btree (logdate), tablespace "measurement_y2007"
    "sidx_measurement_geom_y2003m12" gist (geom), tablespace "measurement_y2007"

*** Problem Query ***

explain (analyze on, buffers on) Select * from measurement this_
                                  where this_.logdate between '2007-12-19 23:38:41.22'::timestamp and '2007-12-20 08:01:04.22'::timestamp
                                    and this_.city_id=25183 order by this_.logdate asc, this_.peaktemp asc, this_.unitsales asc limit 10000;

                                                                                              QUERY PLAN                                                                                              
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=33849.98..33855.15 rows=2068 width=618) (actual time=51710.803..51714.266 rows=10000 loops=1)
   Buffers: shared hit=25614 read=39417
   ->  Sort  (cost=33849.98..33855.15 rows=2068 width=618) (actual time=51710.799..51712.924 rows=10000 loops=1)
         Sort Key: this_.logdate, this_.unitsales
         Sort Method: top-N heapsort  Memory: 15938kB
         Buffers: shared hit=25614 read=39417
         ->  Append  (cost=0.00..33736.09 rows=2068 width=618) (actual time=50.210..50793.589 rows=312046 loops=1)
               Buffers: shared hit=25608 read=39417
               ->  Seq Scan on measurement this_  (cost=0.00..0.00 rows=1 width=840) (actual time=0.002..0.002 rows=0 loops=1)
                     Filter: ((logdate >= '2007-12-19 23:38:41.22'::timestamp without time zone) AND (logdate <= '2007-12-20 08:01:04.22'::timestamp without time zone) AND (city_id = 25183))
               ->  Index Scan using idx_measurement_city_y2007m12 on measurement_y2007m12 this__1  (cost=0.56..33736.09 rows=2067 width=618) (actual time=50.206..50731.637 rows=312046 loops=1)
                     Index Cond: (city_id = 25183)
                     Filter:  ((logdate >= '2007-12-19 23:38:41.22'::timestamp without time zone) AND (logdate <= '2007-12-20 08:01:04.22'::timestamp without time zone))
                     Buffers: shared hit=25608 read=39417

 Total runtime: 51717.639 ms   <--- *** unacceptable ***

(15 rows) 

Total Rows meeting query criteria
---------------------------------

Select count(*) from measurement this_ where this_.logdate between '2007-12-19 23:38:41.22'::timestamp and '2007-12-20 08:01:04.22'::timestamp and this_.city_id=25183;

count
------
312046

Total Rows in the partition table referenced
------------------------------------------

Select count(*) from measurement_y2007m12;

  count
---------
38261732

Does anyone know how to speed up this query? I removed the order by clause and that significantly reduced the run time to approx. 2000-3000 ms. This query is being recorded repeatedly
in our logs and executes very slowly for our UI users from 12000 ms thru 68000 ms

Any suggestions would be appreciated.


sort (ORDER BY clause) enforce a reading of complete partitions. And it is slow - it is strange so reading 300K rows needs a 5K sec. Probably your IO is overloaded.

Regards

Pavel Stehule
 

thanks

От:
Sergey Konoplev
Дата:

On Thu, Mar 13, 2014 at 12:26 PM,  <> wrote:
> *** Problem Query ***
>
> explain (analyze on, buffers on) Select * from measurement this_
>                                   where this_.logdate between '2007-12-19
> 23:38:41.22'::timestamp and '2007-12-20 08:01:04.22'::timestamp
>                                     and this_.city_id=25183 order by
> this_.logdate asc, this_.peaktemp asc, this_.unitsales asc limit 10000;
>
[...]
>  Total runtime: 51717.639 ms   <--- *** unacceptable ***

Try to create a multi-column index on the partition by (city_id,
logdate). Then run the original query and the query without peaktemp
and nitsales on the order by. Compare the results, and if the first
one will not be satisfying try to add these two columns to the end of
the column list of your multi-column index on the order as they appear
in your query. It should do the trick. If it wont, please, show the
plans.

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979



От:
Ian Lawrence Barwick
Дата:

2014-03-14 4:26 GMT+09:00  <>:
> PostgreSQL 9.3.3 RHEL 6.4
>
> Total db Server memory 64GB

(...)
> effective_cache_size = 48MB

I'm not sure if this will help directly, but is the value for
'effective_cache_size' intentional? 48 *GB* would be a more likely
setting.

Regards

Ian Barwick