Re: select from grouped data

Поиск
Список
Период
Сортировка
От Stephen Frost
Тема Re: select from grouped data
Дата
Msg-id 20210825204256.GX17906@tamriel.snowman.net
обсуждение исходный текст
Ответ на select from grouped data  (Roger Mason <rmason@mun.ca>)
Ответы Re: select from grouped data  (James Keener <jim@jimkeener.com>)
Список pgsql-novice
Greetings,

* Roger Mason (rmason@mun.ca) wrote:
> I have written this function to process some multiline text data into a
> table with both text & numeric columns:
>
> CREATE OR REPLACE FUNCTION get_final_energy ()
>   RETURNS TABLE (
>     jid text,
>     "timestamp" text,
>     scf integer,
>     energy double precision
>   )
>   AS $$
>   WITH a AS (
>     SELECT
>       jid,
>       regexp_split_to_table(totenergy_out, '\n') AS teo
>     FROM
>       results
> ),
> b AS (
>   SELECT
>     results.jid,
>     results. "timestamp",
>     cast(
>       CASE WHEN split_part(a.teo, ' ', 2) = '' THEN
>         '0'
>       ELSE
>         split_part(a.teo, ' ', 2)
>       END AS integer) AS scf,
>     cast(
>       CASE WHEN split_part(a.teo, ' ', 3) = '' THEN
>         '0.0'
>       ELSE
>         split_part(a.teo, ' ', 3)
>       END AS double precision) AS energy
>   FROM
>     results,
>     a
>   WHERE
>     results.jid = a.jid
>   GROUP BY
>     results.jid,
>     results. "timestamp",
>     a.teo
>   --HAVING
>     -- scf = max(scf)
>     ORDER BY
>          timestamp ASC,
>          scf DESC
> )
> SELECT
>   *
> FROM
>   b;
>
> $$
> LANGUAGE sql;
>
> The output looks like:
>
>    jid     |    timestamp    | scf |     energy
> ------------+-----------------+-----+----------------
>  1250_1     | 20210805-114634 |  18 | -1316.43700819
>  1250_1     | 20210805-114634 |  17 | -1316.43700825
>  1250_1     | 20210805-114634 |  16 |  -1316.4370097
>  1250_1     | 20210805-114634 |  15 | -1316.43700991
>  1250_1     | 20210805-114634 |  14 | -1316.43699775
>  1250_1     | 20210805-114634 |  13 | -1316.43699117
>  1250_1     | 20210805-114634 |  12 | -1316.43750771
>  1250_1     | 20210805-114634 |  11 | -1316.43805358
>  1250_1     | 20210805-114634 |  10 | -1316.43857192
>  1250_1     | 20210805-114634 |   9 | -1316.43070942
>  1251_1     | 20210806-062539 |  18 | -1316.43700819
>  1251_1     | 20210806-062539 |  17 | -1316.43700826
> ....
>
> What I want is to get (for each group) the energy corresponding to the
> maximum value of scf.

* David G. Johnston (david.g.johnston@gmail.com) wrote:
> >
> >
> > The output looks like:
> >
> >    jid     |    timestamp    | scf |     energy
> > ------------+-----------------+-----+----------------
> >  1250_1     | 20210805-114634 |  18 | -1316.43700819
> >  1250_1     | 20210805-114634 |  17 | -1316.43700825
> >  1250_1     | 20210805-114634 |  16 |  -1316.4370097
> >  1250_1     | 20210805-114634 |  15 | -1316.43700991
> >  1250_1     | 20210805-114634 |  14 | -1316.43699775
> >  1250_1     | 20210805-114634 |  13 | -1316.43699117
> >  1250_1     | 20210805-114634 |  12 | -1316.43750771
> >  1250_1     | 20210805-114634 |  11 | -1316.43805358
> >  1250_1     | 20210805-114634 |  10 | -1316.43857192
> >  1250_1     | 20210805-114634 |   9 | -1316.43070942
> >  1251_1     | 20210806-062539 |  18 | -1316.43700819
> >  1251_1     | 20210806-062539 |  17 | -1316.43700826
> > ....
> >
> > What I want is to get (for each group) the energy corresponding to the
> > maximum value of scf.
> >
> >
> SELECT DISTINCT ON (jid) jid, timestamp, scf, energy [...] ORDER BY jid,
> scf DESC

While this works, it's generally better to use a LATERAL join as that's
part of the SQL standard while DISTINCT ON isn't.  Using a LATERAL join
also would allow you to have multiple rows (top-N) if you wanted.  You'd
do that using:

WITH jids AS (SELECT jid FROM results GROUP BY jid)
SELECT jids.jid, t.ts, t.scf, t.energy
FROM jids CROSS JOIN LATERAL
 (SELECT ts, scf, energy FROM results
  WHERE results.jid = jids.jid
  ORDER BY scf DESC
  LIMIT 1) AS t
;

A couple of notes on this: don't name a column "timestamp" and when it's
a timestamp, use the 'timestamptz' data type, not text.  Your jids sure
look like they should just be numbers instead of text too.  If you
already have a distinct set of jids somewhere (like in another table),
you could use that as the source table instead of the CTE that I'm using
above.

Thanks,

Stephen

Вложения

В списке pgsql-novice по дате отправления:

Предыдущее
От: Roger Mason
Дата:
Сообщение: Re: select from grouped data
Следующее
От: James Keener
Дата:
Сообщение: Re: select from grouped data