Обсуждение: enumerate groups given a certain value
Hello,
Here is a sql problem, which I thought simple at first, but for which I
ended up with a solution I find surprisingly complicated.
I really think I could have achieved a much easier way of handling this,
but I do not manage to find the trick allowing a very simple and
efficient query to solve the problem.
Let's present it with a sample case. The initial table is the following
one :
--------------------
drop table if exists test_gen ;
create table test_gen as
select *
from (select chr((round(random()* 25) +65)::int) as id , random()* 100 as valfrom generate_series(1,200) as
gorder by id
) as foo
select * from test_gen;
-------------------
What I want to do is to enumerate lines for each group of id, following
the order of val.
For example :
id val gen
A 2.65105138532817 1
A 38.9289360493422 2
A 74.6089164167643 3
B 2.01512188650668 1
B 11.4642047323287 2
B 31.2643219716847 3
B 65.8427979797125 4
C 0.759994331747293 1
C 11.8905796203762 2
C 13.7388648930937 3
C 49.1934351157397 4
C 83.1861903425306 5
D 45.8268967922777 1
D 57.1161589119583 2
E 9.72125697880983 1
E 61.324825277552 2
E 70.3348958399147 3
F 0.49891234234237 1
Here is the solution I ended up with :
---------------------------
-- first count number of ids per group
drop table test_gen2 ;
create table test_gen2 as
select t1.*, t2.nb
from test_gen as t1,(SELECT id, count(*) as nb FROM test_gen GROUP BY id) as t2
WHERE t1.id =t2.id
ORDER BY t1.id;
create sequence seq_test_gen start with 1;
create sequence seq_test_gen2 start with 1;
-- get the table with the order set (gen is our order)
select*
from(select foo1.*, nextval('seq_test_gen') as serialfrom ( select * from test_gen2 order
by id, val ) as foo1) as t1,(select foo.*, nextval('seq_test_gen2') as serialfrom ( select
gb1.*, generate_series(1, gb1.nb) as gen from ( select id, nb from
test_gen2 group by id, nb ) as gb1 order by gb1.id, gen ) as foo) as t2
wheret1.serial = t2.serial;
-----------------------------------
The problem seems to be as easy as : <sort my two sets and put them side
to side>. But I could not find a better way to do that than putting a
serial on left and right side and do a join on this serial.
I also tried to find a solution using a modulo but could not manage to
get it work.
Anybody for a ray of light on a different approach ? This look like a
recurrent problem, isn't there an experienced sql programmer here who
tackled this issued a couple of time ?
Thanks for any help,
Vincent
Hello, Picavet.
> Anybody for a ray of light on a different approach ? This look like a
> recurrent problem, isn't there an experienced sql programmer here who
> tackled this issued a couple of time ?
Actually, I'm not very experienced in SQL. But from my point of view
this problem could be solved much more easily using plpgsql rather
than plain SQL.
Your initial query have been a little bit modified for convenience:
CREATE TABLE test_gen AS
SELECT *
FROM(SELECT chr((round(random()* 25) +65)::int) AS id, random()* 100 AS val, 0::INTEGER AS genFROM
generate_series(1,200)as gORDER BY id
) foo;
CREATE OR REPLACE FUNCTION enum_groups (varchar) RETURNS SETOF test_gen AS
$body$
DECLAREr chip.test_gen%ROWTYPE;_id VARCHAR;i INTEGER := 0;q TEXT;
BEGINq := 'SELECT * FROM ' || $1 || ' ORDER BY id ASC, val ASC;';FOR r IN EXECUTE q LOOP IF ((_id IS NULL) OR (_id =
r.id))THEN i := i + 1; ELSE i := 1; END IF; _id := r.id; r.gen := i; RETURN NEXT r;END LOOP;RETURN;
END;
$body$
LANGUAGE 'plpgsql';
Now you can SELECT * FROM enum_groups('test_gen') and you'll get following:
id val gen
B 2,35326588153839 1
B 11,4269167650491 2
B 11,9314394891262 3
B 27,9016905929893 4
B 28,548994101584 5
B 48,8151242025197 6
B 50,215089507401 7
B 59,613792411983 8
B 61,2281930632889 9
B 80,49540463835 10
C 5,86635880172253 1
C 11,5974457468838 2
C 15,8136531710625 3
C 29,8465201631188 4
C 52,9871591832489 5
C 57,3461000341922 6
C 63,3344274014235 7
...
HTH
PS. Sorry, I forget to reply all first time.
--
Best regards. Yuri.