Re: select from grouped data

Поиск
Список
Период
Сортировка
От James Keener
Тема Re: select from grouped data
Дата
Msg-id 77511C44-068E-4047-A955-9C6C08BC1942@jimkeener.com
обсуждение исходный текст
Ответ на Re: select from grouped data  (Stephen Frost <sfrost@snowman.net>)
Ответы Re: select from grouped data  (Stephen Frost <sfrost@snowman.net>)
Список pgsql-novice
I don't know that "generally better" is the correct description. There are a lot of things not covered by the SQL standard and it would be crazy not to make use of them, e.g. insert on conflict and distinct on, especially when they make the query more readable and more correct. Worrying about sql portability is almost always time not well spent and an invitation to do things not as well as they can be.

Jim

On August 25, 2021 4:42:56 PM EDT, Stephen Frost <sfrost@snowman.net> wrote:
Greetings,

* Roger Mason (rmason@mun.ca) wrote:
I have written this function to process some multiline text data into a
table with both text & numeric columns:

CREATE OR REPLACE FUNCTION get_final_energy ()
RETURNS TABLE (
jid text,
"timestamp" text,
scf integer,
energy double precision
)
AS $$
WITH a AS (
SELECT
jid,
regexp_split_to_table(totenergy_out, '\n') AS teo
FROM
results
),
b AS (
SELECT
results.jid,
results. "timestamp",
cast(
CASE WHEN split_part(a.teo, ' ', 2) = '' THEN
'0'
ELSE
split_part(a.teo, ' ', 2)
END AS integer) AS scf,
cast(
CASE WHEN split_part(a.teo, ' ', 3) = '' THEN
'0.0'
ELSE
split_part(a.teo, ' ', 3)
END AS double precision) AS energy
FROM
results,
a
WHERE
results.jid = a.jid
GROUP BY
results.jid,
results. "timestamp",
a.teo
--HAVING
-- scf = max(scf)
ORDER BY
timestamp ASC,
scf DESC
)
SELECT
*
FROM
b;

$$
LANGUAGE sql;

The output looks like:

jid | timestamp | scf | energy
------------+-----------------+-----+----------------
1250_1 | 20210805-114634 | 18 | -1316.43700819
1250_1 | 20210805-114634 | 17 | -1316.43700825
1250_1 | 20210805-114634 | 16 | -1316.4370097
1250_1 | 20210805-114634 | 15 | -1316.43700991
1250_1 | 20210805-114634 | 14 | -1316.43699775
1250_1 | 20210805-114634 | 13 | -1316.43699117
1250_1 | 20210805-114634 | 12 | -1316.43750771
1250_1 | 20210805-114634 | 11 | -1316.43805358
1250_1 | 20210805-114634 | 10 | -1316.43857192
1250_1 | 20210805-114634 | 9 | -1316.43070942
1251_1 | 20210806-062539 | 18 | -1316.43700819
1251_1 | 20210806-062539 | 17 | -1316.43700826
....

What I want is to get (for each group) the energy corresponding to the
maximum value of scf.

* David G. Johnston (david.g.johnston@gmail.com) wrote:


The output looks like:

jid | timestamp | scf | energy
------------+-----------------+-----+----------------
1250_1 | 20210805-114634 | 18 | -1316.43700819
1250_1 | 20210805-114634 | 17 | -1316.43700825
1250_1 | 20210805-114634 | 16 | -1316.4370097
1250_1 | 20210805-114634 | 15 | -1316.43700991
1250_1 | 20210805-114634 | 14 | -1316.43699775
1250_1 | 20210805-114634 | 13 | -1316.43699117
1250_1 | 20210805-114634 | 12 | -1316.43750771
1250_1 | 20210805-114634 | 11 | -1316.43805358
1250_1 | 20210805-114634 | 10 | -1316.43857192
1250_1 | 20210805-114634 | 9 | -1316.43070942
1251_1 | 20210806-062539 | 18 | -1316.43700819
1251_1 | 20210806-062539 | 17 | -1316.43700826
....

What I want is to get (for each group) the energy corresponding to the
maximum value of scf.


SELECT DISTINCT ON (jid) jid, timestamp, scf, energy [...] ORDER BY jid,
scf DESC

While this works, it's generally better to use a LATERAL join as that's
part of the SQL standard while DISTINCT ON isn't. Using a LATERAL join
also would allow you to have multiple rows (top-N) if you wanted. You'd
do that using:

WITH jids AS (SELECT jid FROM results GROUP BY jid)
SELECT jids.jid, t.ts, t.scf, t.energy
FROM jids CROSS JOIN LATERAL
(SELECT ts, scf, energy FROM results
WHERE results.jid = jids.jid
ORDER BY scf DESC
LIMIT 1) AS t
;

A couple of notes on this: don't name a column "timestamp" and when it's
a timestamp, use the 'timestamptz' data type, not text. Your jids sure
look like they should just be numbers instead of text too. If you
already have a distinct set of jids somewhere (like in another table),
you could use that as the source table instead of the CTE that I'm using
above.

Thanks,

Stephen

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

В списке pgsql-novice по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: select from grouped data
Следующее
От: Stephen Frost
Дата:
Сообщение: Re: select from grouped data