Обсуждение: Single Table Select With Aggregate Function

Поиск
Список
Период
Сортировка

Single Table Select With Aggregate Function

От
Rich Shepard
Дата:
   I'm probably not seeing the obvious so I keep making the same mistake. The
table holds water chemistry data from multiple streams, sites within each
stream, sampling dates, and many chemical constituents.

   What I need to do are three things:

   1.) Find the date and site for the maximum value of a specified constituent
on a named stream.

   2.) Find the values of that same constituent at other sites on the named
stream on that same date.

   3.) Find the date of the maximum value of a constituent for all sites on
the named stream.

   Thinking the last the easiest to do, I submitted this query:

select max(quant), site, sampdate from chemistry where stream = 'SheepCrk'
and param = 'TDS' group by site, sampdate;

but this gives me the value of each site and date, not the maximum for all
dates at a specific site. Postgres tells me that both site and sampdate must
be in the 'group by' clause.

   Obviously the syntax is more complex than I thought it would be and I
would like a pointer to the correct way to write these queries.

Rich



Re: Single Table Select With Aggregate Function

От
Tom Lane
Дата:
Rich Shepard <rshepard@appl-ecosys.com> writes:
>    What I need to do are three things:

>    1.) Find the date and site for the maximum value of a specified constituent
> on a named stream.

>    2.) Find the values of that same constituent at other sites on the named
> stream on that same date.

>    3.) Find the date of the maximum value of a constituent for all sites on
> the named stream.

You can do that type of thing using subqueries, eg

    select ... from mytab
      where col = (select max(col) from mytab where ...)

or if you don't mind a nonstandard construct, consider SELECT DISTINCT ON.
There's an example of how to use that in the SELECT reference page
(look for the weather-reports example).  In any case you're going to
have to consider what you want to have happen when there are multiple
occurrences of the maximum value.

            regards, tom lane

Re: Single Table Select With Aggregate Function

От
Ondrej Ivanič
Дата:
Hi,

On 4 January 2012 10:26, Rich Shepard <rshepard@appl-ecosys.com> wrote:
> select max(quant), site, sampdate from chemistry where stream = 'SheepCrk'
> and param = 'TDS' group by site, sampdate;
>
> but this gives me the value of each site and date, not the maximum for all
> dates at a specific site. Postgres tells me that both site and sampdate must
> be in the 'group by' clause.
>
>  Obviously the syntax is more complex than I thought it would be and I
> would like a pointer to the correct way to write these queries.

window functions might be helpful:
http://www.postgresql.org/docs/9.1/static/tutorial-window.html

p4=> select * from public.chemistry order by sampdate, site, quant desc;
 quant | site |  sampdate
-------+------+------------
   100 | 1    | 2012-01-03
    80 | 1    | 2012-01-03
    30 | 1    | 2012-01-03
   400 | 2    | 2012-01-03
   100 | 2    | 2012-01-03
    30 | 2    | 2012-01-03
   100 | 3    | 2012-01-03
    80 | 3    | 2012-01-03
    30 | 3    | 2012-01-03
    80 | 4    | 2012-01-03
(10 rows)

p4=> select distinct first_value(quant) over (partition by sampdate,
site order by quant desc), sampdate, site from public.chemistry order
by site;
 first_value |  sampdate  | site
-------------+------------+------
         100 | 2012-01-03 | 1
         400 | 2012-01-03 | 2
         100 | 2012-01-03 | 3
          80 | 2012-01-03 | 4
(4 rows)

--
Ondrej

Re: Single Table Select With Aggregate Function

От
"David Johnston"
Дата:
-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Rich Shepard
Sent: Tuesday, January 03, 2012 6:27 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] Single Table Select With Aggregate Function

   I'm probably not seeing the obvious so I keep making the same mistake.
The table holds water chemistry data from multiple streams, sites within
each stream, sampling dates, and many chemical constituents.

   What I need to do are three things:

   1.) Find the date and site for the maximum value of a specified
constituent on a named stream.

   2.) Find the values of that same constituent at other sites on the named
stream on that same date.

   3.) Find the date of the maximum value of a constituent for all sites on
the named stream.

   Thinking the last the easiest to do, I submitted this query:

select max(quant), site, sampdate from chemistry where stream = 'SheepCrk'
and param = 'TDS' group by site, sampdate;

but this gives me the value of each site and date, not the maximum for all
dates at a specific site. Postgres tells me that both site and sampdate must
be in the 'group by' clause.

   Obviously the syntax is more complex than I thought it would be and I
would like a pointer to the correct way to write these queries.

Rich


---------------------------------------------------------------

-- Base Query; most recent date for the maximum quant on a given site/stream
(not tested)
WITH
get_max_quant_for_each_site AS (
    SELCET max(quant) AS site_maximum, site, stream, param FROM
chemistry WHERE stream = 'SheepCrk' AND param = 'TDS' GROUP BY site, stream,
param
)
, locate_all_entries_with_the_max AS (
    SELECT site, sampledate, quant, stream
        , ROW_NUMBER() OVER (PARTITION BY site, stream, param ORDER
BY sampledate DESC) AS sample_row_number
    FROM (SELECT site, quant, sampledate, stream FROM chemistry)
    JOIN get_max_quant_for_each_site USING (site, stream, param)
    WHERE quant = site_maximum
)
, limit_to_one_per_site AS (
    SELECT site, sampledate, quant, stream, param
    FROM locate_all_entries_with_the_max
    WHERE sample_row_number = 1
)
SELECT * FROM limit_to_one_per_site;

-- The other queries should be able to manipulate the results of this query
to directly answer your question or use these results JOINed against the
chemistry table to get at what you need
-- My naming was done before deciding that you want more than just "site" in
joining/output but also "stream" and "param"
-- You can convert the "WITH" clauses into sub-queries to possibly improve
performance...though WITH is probably easier to follow along with during
initial development

David J.


Re: Single Table Select With Aggregate Function

От
Rich Shepard
Дата:
On Tue, 3 Jan 2012, Tom Lane wrote:

> You can do that type of thing using subqueries, eg
>
>     select ... from mytab
>       where col = (select max(col) from mytab where ...)

   Thanks, Tom. That's what I thought I needed.

> or if you don't mind a nonstandard construct, consider SELECT DISTINCT ON.
> There's an example of how to use that in the SELECT reference page
> (look for the weather-reports example).  In any case you're going to
> have to consider what you want to have happen when there are multiple
> occurrences of the maximum value.

   I'll look at that, too.

Much appreciated,

Rich

Re: Single Table Select With Aggregate Function

От
Rich Shepard
Дата:
On Wed, 4 Jan 2012, Ondrej Ivanič wrote:

> window functions might be helpful:
> http://www.postgresql.org/docs/9.1/static/tutorial-window.html

   Thanks. I'll carefully read this.

Much appreciated,

Rich