Обсуждение: group by range of values
Hello, Here's my table: db=# select * from tester order by birth_year; birth_year | salary ------------+-------- 1946 | 78000 1949 | 61000 1951 | 58000 1953 | 56000 1958 | 52000 1962 | 50000 1965 | 45000 1967 | 60000 1968 | 57000 1970 | 47000 1972 | 32000 1973 | 42000 (12 rows) How can I display the average salary grouped by decade of birth year? That is, is it possible to display the average salary of those born in the 1940's, the average salary of those born in the 1950's, average salary of those born in the 1960's, and those born in the 1970's, all in one result table? Something like: decade | average(salary) -------+----------------- 1940 | 69500 1950 | 53333.33 1960 | 53000 1970 | 40333.33 Thanks in advance, C
Carol Cheung wrote:
> Hello,
>
> Here's my table:
>
> db=# select * from tester order by birth_year;
> birth_year | salary
> ------------+--------
> 1946 | 78000
> 1949 | 61000
> 1951 | 58000
> 1953 | 56000
> 1958 | 52000
> 1962 | 50000
> 1965 | 45000
> 1967 | 60000
> 1968 | 57000
> 1970 | 47000
> 1972 | 32000
> 1973 | 42000
> (12 rows)
>
> How can I display the average salary grouped by decade of birth year?
> That is, is it possible to display the average salary of those born in
> the 1940's, the average salary of those born in the 1950's, average
> salary of those born in the 1960's, and those born in the 1970's, all in
> one result table?
> Something like:
>
> decade | average(salary)
> -------+-----------------
> 1940 | 69500
> 1950 | 53333.33
> 1960 | 53000
> 1970 | 40333.33
>
Assuming birth_year is an integer (if it's not, then just change the
query to cast it to one before the division), one possible approach
might be:
select birth_year / 10 || '0' as decade, avg(salary::numeric) as average_salary from tester group by
decade order by decade asc;
-Jon
--
Senior Systems Developer
Media Matters for America
http://mediamatters.org/
Hello you can use CASE like SELECT CASE WHEN birth_year BETWEEN 1940 AND 1949 THEN 1940 WHEN birth_year BETWEEN 1950 AND 1959 THEN 1950 WHEN birth_yearBETWEEN 1960 AND 1969 THEN 1960 WHEN birth_year BETWEEN 1970 AND 1979 THEN 1979 END, AVG(salary) FROM tester GROUPBY 1 ORDER BY 1; Regards Pavel Stehule 2007/7/27, Carol Cheung <cacheung@consumercontact.com>: > Hello, > > Here's my table: > > db=# select * from tester order by birth_year; > birth_year | salary > ------------+-------- > 1946 | 78000 > 1949 | 61000 > 1951 | 58000 > 1953 | 56000 > 1958 | 52000 > 1962 | 50000 > 1965 | 45000 > 1967 | 60000 > 1968 | 57000 > 1970 | 47000 > 1972 | 32000 > 1973 | 42000 > (12 rows) > > How can I display the average salary grouped by decade of birth year? > That is, is it possible to display the average salary of those born in > the 1940's, the average salary of those born in the 1950's, average > salary of those born in the 1960's, and those born in the 1970's, all in > one result table? > Something like: > > decade | average(salary) > -------+----------------- > 1940 | 69500 > 1950 | 53333.33 > 1960 | 53000 > 1970 | 40333.33 > > Thanks in advance, > C > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly >
On 7/27/07, Carol Cheung <cacheung@consumercontact.com> wrote: > Something like: > > decade | average(salary) > -------+----------------- > 1940 | 69500 > 1950 | 53333.33 > 1960 | 53000 > 1970 | 40333.33 CREATE TABLE tester ( birth_year integer, salary numeric(10,2) ); SELECT SUBSTRING(TO_CHAR(BIRTH_YEAR,'FM9999') FROM 1 FOR 3)||'0' AS DECADE , AVG(SALARY) AS AVG_SALARY FROM TESTER GROUP BY SUBSTRING(TO_CHAR(BIRTH_YEAR,'FM9999') FROM 1 FOR 3)||'0' ORDER BY DECADE;
On 7/27/07, I wrote: > On 7/27/07, Carol Cheung <cacheung@consumercontact.com> wrote: > > Something like: > > > > decade | average(salary) > > -------+----------------- > > 1940 | 69500 > > 1950 | 53333.33 > > 1960 | 53000 > > 1970 | 40333.33 > > CREATE TABLE tester ( > birth_year integer, > salary numeric(10,2) > ); > > SELECT > SUBSTRING(TO_CHAR(BIRTH_YEAR,'FM9999') FROM 1 FOR 3)||'0' > AS DECADE > , AVG(SALARY) AS AVG_SALARY > FROM TESTER > GROUP BY > SUBSTRING(TO_CHAR(BIRTH_YEAR,'FM9999') FROM 1 FOR 3)||'0' > ORDER BY DECADE; A bit simpler, if year is int: SELECT BIRTH_YEAR/10*10 AS DECADE , AVG(SALARY) AS AVG_SALARY FROM TESTER GROUP BY BIRTH_YEAR/10*10 ORDER BY DECADE;
2007/7/27, Carol Cheung <cacheung@consumercontact.com>:
> db=# select * from tester order by birth_year;
> birth_year | salary
> ------------+--------
> 1946 | 78000
> 1949 | 61000
What is the data type of the birth_year column? I'd suggest using
date if you can, as what it is is a date with year precision. You
can't specify such a precision, but you can decide that all
birth_year's will have month and year of January 1 (enforced by a
CHECK constraint, if you wish), or you could just choose to ignore
the month and year part in your calculations.
> How can I display the average salary grouped by decade of birth year?
> That is, is it possible to display the average salary of those born in
> the 1940's, the average salary of those born in the 1950's, average
> salary of those born in the 1960's, and those born in the 1970's,
> all in
> one result table?
> Something like:
>
> decade | average(salary)
> -------+-----------------
> 1940 | 69500
> 1950 | 53333.33
> 1960 | 53000
> 1970 | 40333.33
Here's an example:
CREATE TABLE salaries (birth_year DATE PRIMARY KEY, salary NUMERIC
NOT NULL);
INSERT INTO salaries (birth_year, salary) VALUES ('1946-01-01',78000), ('1949-01-01',61000), ('1951-01-01',58000)
,('1953-01-01',56000), ('1958-01-01',52000), ('1962-01-01',50000) , ('1965-01-01',45000), ('1967-01-01',60000),
('1968-01-01',57000) , ('1970-01-01',47000), ('1972-01-01',32000), ('1973-01-01',42000);
SELECT birth_decade, AVG(salary)
FROM ( SELECT birth_year , date_trunc('decade', birth_year)::date as birth_decade , salary FROM
salaries)as salaries_with_decades
GROUP BY birth_decade
ORDER BY birth_decade;
birth_decade | avg
--------------+--------------------
1940-01-01 | 69500.000000000000
1950-01-01 | 55333.333333333333
1960-01-01 | 53000.000000000000
1970-01-01 | 40333.333333333333
(4 rows)
If birth_year is an integer column, here's another way to do it,
taking advantage of the fact that integer division truncates.
CREATE TABLE salaries (birth_year INTEGER PRIMARY KEY, salary NUMERIC
NOT NULL);
INSERT INTO salaries (birth_year, salary) VALUES (1946,78000), (1949,61000), (1951,58000), (1953,56000),
(1958,52000) , (1962,50000), (1965,45000), (1967,60000), (1968,57000),
(1970,47000) , (1972,32000), (1973,42000);
SELECT birth_decade, AVG(salary)
FROM ( SELECT birth_year , birth_year / 10 * 10 as birth_decade , salary FROM salaries) as
salaries_with_decades
GROUP BY birth_decade
ORDER BY birth_decade;
birth_decade | avg
--------------+-------------------- 1940 | 69500.000000000000 1950 | 55333.333333333333 1960 |
53000.000000000000 1970 | 40333.333333333333
(4 rows)
Hope this gives you some options.
Michael Glaesemann
grzm seespotcode net