Re: max() versus order/limit (WAS: High update

Поиск
Список
Период
Сортировка
От Luke Lonergan
Тема Re: max() versus order/limit (WAS: High update
Дата
Msg-id C3E62232E3BCF24CBA20D72BFDCB6BF8017437CE@MI8NYCMAIL08.Mi8.com
обсуждение исходный текст
Ответы Re: max() versus order/limit (WAS: High update  (Alvaro Herrera <alvherre@commandprompt.com>)
Re: max() versus order/limit (WAS: High update  ("Joshua D. Drake" <jd@commandprompt.com>)
Список pgsql-performance
Adam,

This optimization would require teaching the planner to use an index for
MAX/MIN when available.  It seems like an OK thing to do to me.

- Luke

> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Adam Rich
> Sent: Sunday, January 14, 2007 8:52 PM
> To: 'Joshua D. Drake'; 'Tom Lane'
> Cc: 'Craig A. James'; 'PostgreSQL Performance'
> Subject: Re: [PERFORM] max() versus order/limit (WAS: High
> update activity, PostgreSQL vs BigDBMS)
>
>
> Did anybody get a chance to look at this?  Is it expected behavior?
> Everyone seemed so incredulous, I hoped maybe this exposed a
> bug that would be fixed in a near release.
>
>
> -----Original Message-----
> From: Adam Rich [mailto:adam.r@sbcglobal.net]
> Sent: Sunday, January 07, 2007 11:53 PM
> To: 'Joshua D. Drake'; 'Tom Lane'
> Cc: 'Craig A. James'; 'PostgreSQL Performance'
> Subject: RE: [PERFORM] High update activity, PostgreSQL vs BigDBMS
>
>
>
> Here's another, more drastic example... Here the order by / limit
> version
> runs in less than 1/7000 the time of the MAX() version.
>
>
> select max(item_id)
> from events e, receipts r, receipt_items ri
> where e.event_id=r.event_id and r.receipt_id=ri.receipt_id
>
> Aggregate  (cost=10850.84..10850.85 rows=1 width=4) (actual
> time=816.382..816.383 rows=1 loops=1)
>   ->  Hash Join  (cost=2072.12..10503.30 rows=139019 width=4) (actual
> time=155.177..675.870 rows=147383 loops=1)
>         Hash Cond: (ri.receipt_id = r.receipt_id)
>         ->  Seq Scan on receipt_items ri  (cost=0.00..4097.56
> rows=168196 width=8) (actual time=0.009..176.894 rows=168196 loops=1)
>         ->  Hash  (cost=2010.69..2010.69 rows=24571 width=4) (actual
> time=155.146..155.146 rows=24571 loops=1)
>               ->  Hash Join  (cost=506.84..2010.69 rows=24571 width=4)
> (actual time=34.803..126.452 rows=24571 loops=1)
>                     Hash Cond: (r.event_id = e.event_id)
>                     ->  Seq Scan on receipts r  (cost=0.00..663.58
> rows=29728 width=8) (actual time=0.006..30.870 rows=29728 loops=1)
>                     ->  Hash  (cost=469.73..469.73 rows=14843 width=4)
> (actual time=34.780..34.780 rows=14843 loops=1)
>                           ->  Seq Scan on events e  (cost=0.00..469.73
> rows=14843 width=4) (actual time=0.007..17.603 rows=14843 loops=1)
> Total runtime: 816.645 ms
>
> select item_id
> from events e, receipts r, receipt_items ri
> where e.event_id=r.event_id and r.receipt_id=ri.receipt_id
> order by item_id desc limit 1
>
>
> Limit  (cost=0.00..0.16 rows=1 width=4) (actual
> time=0.047..0.048 rows=1
> loops=1)
>   ->  Nested Loop  (cost=0.00..22131.43 rows=139019 width=4) (actual
> time=0.044..0.044 rows=1 loops=1)
>         ->  Nested Loop  (cost=0.00..12987.42 rows=168196 width=8)
> (actual time=0.032..0.032 rows=1 loops=1)
>               ->  Index Scan Backward using receipt_items_pkey on
> receipt_items ri  (cost=0.00..6885.50 rows=168196 width=8) (actual
> time=0.016..0.016 rows=1 loops=1)
>               ->  Index Scan using receipts_pkey on receipts r
> (cost=0.00..0.02 rows=1 width=8) (actual time=0.010..0.010 rows=1
> loops=1)
>                     Index Cond: (r.receipt_id = ri.receipt_id)
>         ->  Index Scan using events_pkey on events e  (cost=0.00..0.04
> rows=1 width=4) (actual time=0.009..0.009 rows=1 loops=1)
>               Index Cond: (e.event_id = r.event_id)
> Total runtime: 0.112 ms
>
>
>
>
>
> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Joshua D.
> Drake
> Sent: Sunday, January 07, 2007 9:10 PM
> To: Adam Rich
> Cc: 'Craig A. James'; 'Guy Rouillier'; 'PostgreSQL Performance'
> Subject: Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS
>
>
> On Sun, 2007-01-07 at 20:26 -0600, Adam Rich wrote:
> > I'm using 8.2 and using order by & limit is still faster than MAX()
> > even though MAX() now seems to rewrite to an almost identical plan
> > internally.
>
>
> Gonna need you to back that up :) Can we get an explain analyze?
>
>
> > Count(*) still seems to use a full table scan rather than an index
> scan.
> >
>
> There is a TODO out there to help this. Don't know if it will
> get done.
>
> Joshua D. Drake
>
> --
>
>       === The PostgreSQL Company: Command Prompt, Inc. ===
> Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
> Providing the most comprehensive  PostgreSQL solutions since 1997
>              http://www.commandprompt.com/
>
> Donate to the PostgreSQL Project:
> http://www.postgresql.org/about/donate
>
>
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>                http://archives.postgresql.org
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
>                 http://www.postgresql.org/about/donate
>
>


В списке pgsql-performance по дате отправления:

Предыдущее
От: "Adam Rich"
Дата:
Сообщение: Re: max() versus order/limit (WAS: High update activity, PostgreSQL vs BigDBMS)
Следующее
От: Rolf Østvik (HA/EXA)
Дата:
Сообщение: Re: Problem with grouping, uses Sort and GroupAggregate, HashAggregate is better(?)