Обсуждение: Unique - first
I have a table (x,y,z) - I'd like to take the rows with unique x values - but, when more than one row have the same x value, I want the one with the minimal z value. How can I do that? I can imagine doing it with window functions, but also that regular SQL should be able to do it too.
Robert James wrote on 27.10.2013 14:04:
> I have a table (x,y,z) - I'd like to take the rows with unique x
> values - but, when more than one row have the same x value, I want the
> one with the minimal z value.
>
> How can I do that? I can imagine doing it with window functions, but
> also that regular SQL should be able to do it too.
>
>
Window functions *are* "regular" SQL ;)
select x,y,z
from (
select x,y,z,
min(y) over (partition by x) as min_y
from the_table
) t
where y = min_y;
Instead of min() you could also use row_number() or dense_rank() to find the minimum value.
A solution without window functions could be something like:
select t1.x, t1.y, t1.z
from table t1
join (select t2.x, min(t2.y) as min_y
from the_table t2
group by t2.x
) mt on mt.x = t1.x and mt.min_y = t1.y;
But I'm pretty sure the solution with the window function will perform better.
Thomas Kellerer <spam_eater@gmx.net> writes:
> Robert James wrote on 27.10.2013 14:04:
>> I have a table (x,y,z) - I'd like to take the rows with unique x
>> values - but, when more than one row have the same x value, I want the
>> one with the minimal z value.
>>
>> How can I do that? I can imagine doing it with window functions, but
>> also that regular SQL should be able to do it too.
> Window functions *are* "regular" SQL ;)
Indeed. The only other easy way I know of involves SELECT DISTINCT ON
(see the "weather reports" example in the SELECT reference page); but
that is most definitely not standard SQL, it is a Postgres-ism.
> A solution without window functions could be something like:
> select t1.x, t1.y, t1.z
> from table t1
> join (select t2.x, min(t2.y) as min_y
> from the_table t2
> group by t2.x
> ) mt on mt.x = t1.x and mt.min_y = t1.y;
Note that this doesn't work unless x and y form a primary key, else you
get multiple join rows (or no join rows, if one is NULL). In any case,
it's unlikely to be fast.
I think I've seen some even more esoteric solutions that use only
SQL-92-era features, but lack of caffeine prevents me from recalling them.
In any case, there's a good reason why we invented SELECT DISTINCT ON:
this is just not easy to do in minimal SQL.
regards, tom lane
On 27/10/2013 13:04, Robert James wrote: > I have a table (x,y,z) - I'd like to take the rows with unique x > values - but, when more than one row have the same x value, I want the > one with the minimal z value. > > How can I do that? I can imagine doing it with window functions, but > also that regular SQL should be able to do it too. My personal favourite approach, assuming you can rely on Postgres 9.0 or higher, is to define a first() aggregate as shown here: http://wiki.postgresql.org/wiki/First/last_%28aggregate%29 Once created, this can be used with the order_by_clause of the aggregate expression as shown here: http://www.postgresql.org/docs/current/static/sql-expressions.html#SYNTAX-AGGREGATES (That's why it requires 9.0, since earlier versions had no such clause). So in your case, you could run SELECT x, first(y order by z) as y_with_lowest_z FROM xyzzy GROUP BY x I find this a lot easier to understand than window functions; I've no idea how its performance compares. -- Rowan Collins [IMSoP]
On Sun, Oct 27, 2013 at 2:04 PM, Robert James <srobertjames@gmail.com> wrote:
I have a table (x,y,z) - I'd like to take the rows with unique x
values - but, when more than one row have the same x value, I want the
one with the minimal z value.
You can use distinct on (which is a Postgresql extension to the SQL standard):
select distinct on(x) x, y, z
from the_table
order by x, z
Regards
Marcin Mańk