Re: GROUP BY with reasonable timings in PLAN but unreasonable execution time

Поиск
Список
Период
Сортировка
От Clem Dickey
Тема Re: GROUP BY with reasonable timings in PLAN but unreasonable execution time
Дата
Msg-id iv30d7$2881$1@news.hub.org
обсуждение исходный текст
Ответ на GROUP BY with reasonable timings in PLAN but unreasonable execution time  (Clem Dickey <dickeycl@us.ibm.com>)
Ответы Re: GROUP BY with reasonable timings in PLAN but unreasonable execution time
Список pgsql-performance
On 07/05/2011 07:26 PM, Clem Dickey wrote:

Updates after belatedly reading the "slow queries" guidelines:

Version: PostgreSQL 8.4.8 on x86_64-redhat-linux-gnu, compiled by GCC
gcc (GCC) 4.4.5 20101112 (Red Hat 4.4.5-2), 64-bit

The query has always been slow; the table for this test case is never
updated. I don't run VACUUM but do run ANALYZE.

Originally all database config parameters were the default. Since
yesterday I have changed
  shared_buffers = 224MB
  effective_cache_size = 1024MB
but seen no change in behavior.

> Column | Type | Modifiers
> --------+---------+-----------
> y | integer | not null
> x | integer | not null
> k | integer | not null
> j | integer | not null
> z | integer | not null
> Indexes:
> "t_pkey" PRIMARY KEY, btree (j, k, x, y, z)
>
> The table population, in pseudocode, is this:
> for x in 0..9
> for y in 0..9999
> for z in 0..29
> INSERT INTO t VALUES(y,x,0,0,z)

> The query is:
>
> SELECT *
> FROM (
> SELECT * FROM t GROUP BY j,k,x,z,y
> ) AS f
> NATURAL JOIN t;

The EXPLAIN ANALYZE output is http://explain.depesz.com/s/KGk

Notes on the analysis:
1. I see that the planner estimates that GROUP BY will reduce 300K rows
to 30K, a bit odd because every row which the planner could examine is
in a unique group.
2. The JOIN is expected to produce one row. I'm not sure how the planner
came up with that estimate.

> By contrast, placing an intermediate result in a table "u" provides a
> result in about 3 seconds:

=> EXPLAIN ANALYZE CREATE TABLE u AS SELECT * FROM t GROUP BY
j,k,x,z,y;EXPLAIN ANALYZE SELECT * FROM u NATURAL JOIN t;DROP TABLE u;
                                                       QUERY PLAN

----------------------------------------------------------------------------------------------------------------------
  Group  (cost=44508.90..49008.90 rows=30000 width=20) (actual
time=1305.381..2028.385 rows=300000 loops=1)
    ->  Sort  (cost=44508.90..45258.90 rows=300000 width=20) (actual
time=1305.374..1673.843 rows=300000 loops=1)
          Sort Key: j, k, x, z, y
          Sort Method:  external merge  Disk: 8792kB
          ->  Seq Scan on t  (cost=0.00..4911.00 rows=300000 width=20)
(actual time=0.008..62.935 rows=300000 loops=1)
  Total runtime: 2873.590 ms
(6 rows)

                                                            QUERY PLAN


---------------------------------------------------------------------------------------------------------------------------------
  Merge Join  (cost=46229.86..72644.38 rows=1 width=20) (actual
time=1420.527..2383.507 rows=300000 loops=1)
    Merge Cond: ((t.j = u.j) AND (t.k = u.k) AND (t.x = u.x) AND (t.y =
u.y) AND (t.z = u.z))
    ->  Index Scan using t_pkey on t  (cost=0.00..14877.18 rows=300000
width=20) (actual time=0.013..118.244 rows=300000 loops=1)
    ->  Materialize  (cost=46229.86..50123.52 rows=311493 width=20)
(actual time=1420.498..1789.864 rows=300000 loops=1)
          ->  Sort  (cost=46229.86..47008.59 rows=311493 width=20)
(actual time=1420.493..1692.988 rows=300000 loops=1)
                Sort Key: u.j, u.k, u.x, u.y, u.z
                Sort Method:  external merge  Disk: 8784kB
                ->  Seq Scan on u  (cost=0.00..5025.93 rows=311493
width=20) (actual time=0.018..78.850 rows=300000 loops=1)
  Total runtime: 2424.870 ms
(9 rows)

(Adding an "ANALYZE" on the temporary table improves the JOIN estimated
fow count from 1 to about 299500, but does not change the plan.)

Clem Dickey

В списке pgsql-performance по дате отправления:

Предыдущее
От: Craig Ringer
Дата:
Сообщение: Re: 100% CPU Utilization when we run queries.
Следующее
От: Robert Klemme
Дата:
Сообщение: Re: 100% CPU Utilization when we run queries.