select max(field) from table much faster with a group by clause?

Поиск
Список
Период
Сортировка
От Palle Girgensohn
Тема select max(field) from table much faster with a group by clause?
Дата
Msg-id D26563C7B9CCEB0A338FF08E@rambutan.pingpong.net
обсуждение исходный текст
Ответы Re: select max(field) from table much faster with a group by clause?  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: select max(field) from table much faster with a group by clause?  (hubert depesz lubaczewski <depesz@depesz.com>)
Список pgsql-performance
Hi,

I have a table "login" with approx 600,000 tuples, a person table with
approx 100000 tuples.

When running
 select max("when") from login where userid='userid'

it takes a second or two, but when adding "group by userid" the planner
decides on using another plan, and it gets *much* faster. See example below.

Number of tuples per user varies from zero to a couple of thousands. It
seems to slower when there are no tuples as all, but it is always slow.

This is only for max() and min(). For count(), the plan is the same, it
always uses "Aggregate".

Any ideas about this? Do we need to add "group by userid" to our code base
to optimize, or is there another way? Updating postgresql to 8.2 is a long
term option, but I'd like a short term option as well...

Regards,
Palle


pp=# select version();
                                             version
-------------------------------------------------------------------------------------------------
 PostgreSQL 8.1.8 on amd64-portbld-freebsd6.1, compiled by GCC cc (GCC)
3.4.4 [FreeBSD] 20050518
(1 row)

Time: 0,530 ms
pp=# explain analyze SELECT max("when") FROM login WHERE userid='girgen' ;
 QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=323.80..323.81 rows=1 width=0) (actual
time=3478.781..3478.785 rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..323.80 rows=1 width=8) (actual
time=3478.768..3478.768 rows=0 loops=1)
           ->  Index Scan Backward using login_when_idx on "login"
(cost=0.00..131461.90 rows=406 width=8) (actual time=3478.759..3478.759
rows=0 loops=1)
                 Filter: (("when" IS NOT NULL) AND (userid =
'sarah.gilliam1'::text))
 Total runtime: 3478.868 ms
(6 rows)

Time: 3480,442 ms
pp=# explain analyze SELECT max("when") FROM login WHERE userid='girgen'
group by userid;
                                                             QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.00..648.44 rows=1 width=25) (actual
time=0.191..0.191 rows=0 loops=1)
   ->  Index Scan using login_userid_idx on "login"  (cost=0.00..646.40
rows=406 width=25) (actual time=0.183..0.183 rows=0 loops=1)
         Index Cond: (userid = 'sarah.gilliam1'::text)
 Total runtime: 0.243 ms
(4 rows)

Time: 0,938 ms
pp=# \d login
                  Table "public.login"
 Column |           Type           |     Modifiers
--------+--------------------------+--------------------
 userid | text                     |
 kursid | integer                  |
 when   | timestamp with time zone |
 mode   | text                     | default 'pm'::text
Indexes:
    "login_kurs_user_idx" btree (kursid, userid)
    "login_userid_idx" btree (userid)
    "login_when_idx" btree ("when")
Foreign-key constraints:
    "pp_fk1" FOREIGN KEY (userid) REFERENCES person(userid) ON UPDATE
CASCADE ON DELETE CASCADE
    "pp_fk2" FOREIGN KEY (kursid) REFERENCES course(id) ON UPDATE CASCADE
ON DELETE CASCADE



В списке pgsql-performance по дате отправления:

Предыдущее
От: Adam Tauno Williams
Дата:
Сообщение: Re: Hardware for PostgreSQL
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [Fwd: Re: Outer joins and Seq scans]