Re: Limits in subqueries...

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Limits in subqueries...
Дата
Msg-id 9000.961690991@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Limits in subqueries...  (Martijn van Oosterhout <kleptog@cupid.suninternet.com>)
Список pgsql-general
Martijn van Oosterhout <kleptog@cupid.suninternet.com> writes:
> I want, for each customer, to get the serial of the the
> highest timestamp for that customer, and for duplicate
> timestamps, the highest serial.

If you will take a non-SQL-standard solution, this will work:

select distinct on (custid) custid, serial from test
order by custid, stamp desc, serial desc;

(In pre-7.0, omit parens around distinct-on value.)  Basically this
works by ordering the data in the right way and then taking just the
first row out of each group with the same custid.

It's a lot harder if you want to be SQL-compliant ...

> The way I thought how to do that would be:

> select custid, serial
> from test b
> where serial =
>    (select serial
>     from test a
>     where a.custid = b.custid
>     order by serial desc limit 1);

> but thats not allowed.

Nope.  Subselects can't have order-by clauses; SQL thinks you shouldn't
do anything as grotty as depending on the order in which tuples are
processed.  The spec lets you sort the final output of a SELECT, but not
any intermediate steps.

> The best I could do is:

> select custid, max(serial)
> from test
> where (custid,stamp) in
>    (select custid, max(stamp)
>     from test
>     group by custid)
> group by custid;

> Which is ugly and only works in this case because
> max does what I want. It has a horrible plan though:

> Aggregate  (cost=1.36 rows=11 width=8)
>   ->  Group  (cost=1.36 rows=11 width=8)
>         ->  Sort  (cost=1.36 rows=11 width=8)
>               ->  Seq Scan on test  (cost=1.36 rows=11 width=8)
>                     SubPlan
>                       ->  Aggregate  (cost=1.36 rows=11 width=12)
>                             ->  Group  (cost=1.36 rows=11 width=12)
>                                   ->  Sort  (cost=1.36 rows=11 width=12)
>                                         ->  Seq Scan on test  (cost=1.36 rows=11 width=12)

> In the more general case, this won't work.

It'd work as well as your other solution, seems to me, since max()
should be able to do the same calculation that "order by" would.
What's the more general case you are visualizing?

> Is there a better way of doing this? Does postgres 7.0 do
> better in this case?

7.0 is smart enough to materialize the result of the subselect (ie,
do the subselect just once and cache the results).  That'd make
a pretty considerable difference in runtime, but it's still a lot
slower than the DISTINCT ON solution.

            regards, tom lane

В списке pgsql-general по дате отправления:

Предыдущее
От: "Stuart Grimshaw"
Дата:
Сообщение: PHP Install with Postgres support.
Следующее
От: "J.R. Belding"
Дата:
Сообщение: Re: PHP Install with Postgres support.