Re: select from grouped data
От | Stephen Frost |
---|---|
Тема | Re: select from grouped data |
Дата | |
Msg-id | 20210825204256.GX17906@tamriel.snowman.net обсуждение исходный текст |
Ответ на | select from grouped data (Roger Mason <rmason@mun.ca>) |
Ответы |
Re: select from grouped data
(James Keener <jim@jimkeener.com>)
|
Список | pgsql-novice |
Greetings, * Roger Mason (rmason@mun.ca) wrote: > I have written this function to process some multiline text data into a > table with both text & numeric columns: > > CREATE OR REPLACE FUNCTION get_final_energy () > RETURNS TABLE ( > jid text, > "timestamp" text, > scf integer, > energy double precision > ) > AS $$ > WITH a AS ( > SELECT > jid, > regexp_split_to_table(totenergy_out, '\n') AS teo > FROM > results > ), > b AS ( > SELECT > results.jid, > results. "timestamp", > cast( > CASE WHEN split_part(a.teo, ' ', 2) = '' THEN > '0' > ELSE > split_part(a.teo, ' ', 2) > END AS integer) AS scf, > cast( > CASE WHEN split_part(a.teo, ' ', 3) = '' THEN > '0.0' > ELSE > split_part(a.teo, ' ', 3) > END AS double precision) AS energy > FROM > results, > a > WHERE > results.jid = a.jid > GROUP BY > results.jid, > results. "timestamp", > a.teo > --HAVING > -- scf = max(scf) > ORDER BY > timestamp ASC, > scf DESC > ) > SELECT > * > FROM > b; > > $$ > LANGUAGE sql; > > The output looks like: > > jid | timestamp | scf | energy > ------------+-----------------+-----+---------------- > 1250_1 | 20210805-114634 | 18 | -1316.43700819 > 1250_1 | 20210805-114634 | 17 | -1316.43700825 > 1250_1 | 20210805-114634 | 16 | -1316.4370097 > 1250_1 | 20210805-114634 | 15 | -1316.43700991 > 1250_1 | 20210805-114634 | 14 | -1316.43699775 > 1250_1 | 20210805-114634 | 13 | -1316.43699117 > 1250_1 | 20210805-114634 | 12 | -1316.43750771 > 1250_1 | 20210805-114634 | 11 | -1316.43805358 > 1250_1 | 20210805-114634 | 10 | -1316.43857192 > 1250_1 | 20210805-114634 | 9 | -1316.43070942 > 1251_1 | 20210806-062539 | 18 | -1316.43700819 > 1251_1 | 20210806-062539 | 17 | -1316.43700826 > .... > > What I want is to get (for each group) the energy corresponding to the > maximum value of scf. * David G. Johnston (david.g.johnston@gmail.com) wrote: > > > > > > The output looks like: > > > > jid | timestamp | scf | energy > > ------------+-----------------+-----+---------------- > > 1250_1 | 20210805-114634 | 18 | -1316.43700819 > > 1250_1 | 20210805-114634 | 17 | -1316.43700825 > > 1250_1 | 20210805-114634 | 16 | -1316.4370097 > > 1250_1 | 20210805-114634 | 15 | -1316.43700991 > > 1250_1 | 20210805-114634 | 14 | -1316.43699775 > > 1250_1 | 20210805-114634 | 13 | -1316.43699117 > > 1250_1 | 20210805-114634 | 12 | -1316.43750771 > > 1250_1 | 20210805-114634 | 11 | -1316.43805358 > > 1250_1 | 20210805-114634 | 10 | -1316.43857192 > > 1250_1 | 20210805-114634 | 9 | -1316.43070942 > > 1251_1 | 20210806-062539 | 18 | -1316.43700819 > > 1251_1 | 20210806-062539 | 17 | -1316.43700826 > > .... > > > > What I want is to get (for each group) the energy corresponding to the > > maximum value of scf. > > > > > SELECT DISTINCT ON (jid) jid, timestamp, scf, energy [...] ORDER BY jid, > scf DESC While this works, it's generally better to use a LATERAL join as that's part of the SQL standard while DISTINCT ON isn't. Using a LATERAL join also would allow you to have multiple rows (top-N) if you wanted. You'd do that using: WITH jids AS (SELECT jid FROM results GROUP BY jid) SELECT jids.jid, t.ts, t.scf, t.energy FROM jids CROSS JOIN LATERAL (SELECT ts, scf, energy FROM results WHERE results.jid = jids.jid ORDER BY scf DESC LIMIT 1) AS t ; A couple of notes on this: don't name a column "timestamp" and when it's a timestamp, use the 'timestamptz' data type, not text. Your jids sure look like they should just be numbers instead of text too. If you already have a distinct set of jids somewhere (like in another table), you could use that as the source table instead of the CTE that I'm using above. Thanks, Stephen
Вложения
В списке pgsql-novice по дате отправления: