Обсуждение: cumulative count
Hi list,
This is my first post to pgsql, so hopefully I'm not asking something
that has been answered a thousand time before. I've looked online, and
through the archives, but I haven't found anything that answers my
question specifically:
Say I have a table like this:
date | user
------------------+---------------------
20050201 | Bill
20050210 | Steve
20050224 | Sally
20050311 | Martha
20050316 | Ryan
20050322 | Phil
20050330 | William
20050415 | Mary
20050428 | Susan
20050503 | Jim
and I want to run a query that returns a *count* of the number of users
*each month*, ordered by year and *month*, with an additional column
that is a *running total of the count*, as in:
year | month | count | run_count
-------------------+----------------+----------------+-----------------
2005 | 02 | 3 | 3
2005 | 03 | 4 | 7
2005 | 04 | 2 | 9
2005 | 05 | 1 | 10
I can get almost everything I want with:
SELECT a.year, a.month, a.count, a.month_name, sum(b.count) AS total
FROM (SELECT EXTRACT(year from added_date) AS year,
EXTRACT(month FROM added_date) AS month,
TO_CHAR(added_date, 'Month') AS month_name,
COUNT(*) AS count FROM users_table GROUP BY 1, 2, 3) AS a,
(SELECT EXTRACT(year FROM added_date) AS year,
EXTRACT(month FROM added_date) AS month,
TO_CHAR(added_date, 'Month') AS month_name,
COUNT(*) AS count
FROM users_table GROUP BY 1, 2, 3) AS b
WHERE a.year >= b.year AND a.month >= b.month
GROUP BY 1, 2, 3, 4
ORDER BY a.year, a.month asc;
but I can't quite figure out the running total of the count. The above
example works right up to the end of the first year, then the values no
longer make sense. My guess is it's something to do with my WHERE
clause, but I can't think of a better way to do things.
Any ideas?
Cheers,
Carson
In article <49381902.7080209@gmail.com>,
Carson Farmer <carson.farmer@gmail.com> writes:
> date | user
> ------------------+---------------------
> 20050201 | Bill
> 20050210 | Steve
> 20050224 | Sally
> 20050311 | Martha
> 20050316 | Ryan
> 20050322 | Phil
> 20050330 | William
> 20050415 | Mary
> 20050428 | Susan
> 20050503 | Jim
> and I want to run a query that returns a *count* of the number of
> users *each month*, ordered by year and *month*, with an additional
> column that is a *running total of the count*, as in:
> year | month | count | run_count
> -------------------+----------------+----------------+-----------------
> 2005 | 02 | 3 | 3
> 2005 | 03 | 4 | 7
> 2005 | 04 | 2 | 9
> 2005 | 05 | 1 | 10
> I can get almost everything I want with:
> SELECT a.year, a.month, a.count, a.month_name, sum(b.count) AS total
> FROM (SELECT EXTRACT(year from added_date) AS year,
> EXTRACT(month FROM added_date) AS month,
> TO_CHAR(added_date, 'Month') AS month_name,
> COUNT(*) AS count FROM users_table GROUP BY 1, 2, 3) AS a,
> (SELECT EXTRACT(year FROM added_date) AS year,
> EXTRACT(month FROM added_date) AS month,
> TO_CHAR(added_date, 'Month') AS month_name,
> COUNT(*) AS count
> FROM users_table GROUP BY 1, 2, 3) AS b
> WHERE a.year >= b.year AND a.month >= b.month
> GROUP BY 1, 2, 3, 4
> ORDER BY a.year, a.month asc;
> but I can't quite figure out the running total of the count. The above
> example works right up to the end of the first year, then the values
> no longer make sense. My guess is it's something to do with my WHERE
> clause, but I can't think of a better way to do things.
Yes, your WHERE condition is the problem. It should be
WHERE a.year > b.year OR a.year = b.year AND a.month > b.month.
You could simplify the date logic by doing the year/month split later, e.g.
CREATE TEMP TABLE tmp AS
SELECT date_trunc('month', date) AS dt, count(*) AS count
FROM users_table
GROUP BY dt;
SELECT extract(YEAR FROM t1.dt) AS year,
extract(MONTH FROM t1.dt) AS month,
t1.count,
sum(t2.count) AS run_count
FROM tmp t1
LEFT JOIN tmp t2 ON t2.dt <= t1.dt
GROUP BY year, month, t1.count
ORDER BY year, month;
(AFAIK the WITH clause in PostgreSQL 8.4 would let you get rid of
the temp table.)
On Thu, Dec 04, 2008 at 07:32:59PM +0100, Harald Fuchs wrote:
> In article <49381902.7080209@gmail.com>,
> Carson Farmer <carson.farmer@gmail.com> writes:
>
> > date | user
> > ------------------+---------------------
> > 20050201 | Bill
> > 20050210 | Steve
> > 20050224 | Sally
> > 20050311 | Martha
> > 20050316 | Ryan
> > 20050322 | Phil
> > 20050330 | William
> > 20050415 | Mary
> > 20050428 | Susan
> > 20050503 | Jim
>
> > and I want to run a query that returns a *count* of the number of
> > users *each month*, ordered by year and *month*, with an additional
> > column that is a *running total of the count*, as in:
>
> > year | month | count | run_count
> > -------------------+----------------+----------------+-----------------
> > 2005 | 02 | 3 | 3
> > 2005 | 03 | 4 | 7
> > 2005 | 04 | 2 | 9
> > 2005 | 05 | 1 | 10
>
> > I can get almost everything I want with:
>
> > SELECT a.year, a.month, a.count, a.month_name, sum(b.count) AS total
> > FROM (SELECT EXTRACT(year from added_date) AS year,
> > EXTRACT(month FROM added_date) AS month,
> > TO_CHAR(added_date, 'Month') AS month_name,
> > COUNT(*) AS count FROM users_table GROUP BY 1, 2, 3) AS a,
> > (SELECT EXTRACT(year FROM added_date) AS year,
> > EXTRACT(month FROM added_date) AS month,
> > TO_CHAR(added_date, 'Month') AS month_name,
> > COUNT(*) AS count
> > FROM users_table GROUP BY 1, 2, 3) AS b
> > WHERE a.year >= b.year AND a.month >= b.month
> > GROUP BY 1, 2, 3, 4
> > ORDER BY a.year, a.month asc;
>
> > but I can't quite figure out the running total of the count. The above
> > example works right up to the end of the first year, then the values
> > no longer make sense. My guess is it's something to do with my WHERE
> > clause, but I can't think of a better way to do things.
>
> Yes, your WHERE condition is the problem. It should be
> WHERE a.year > b.year OR a.year = b.year AND a.month > b.month.
>
> You could simplify the date logic by doing the year/month split later, e.g.
>
> CREATE TEMP TABLE tmp AS
>
> SELECT extract(YEAR FROM t1.dt) AS year,
> extract(MONTH FROM t1.dt) AS month,
> t1.count,
> sum(t2.count) AS run_count
> FROM tmp t1
> LEFT JOIN tmp t2 ON t2.dt <= t1.dt
> GROUP BY year, month, t1.count
> ORDER BY year, month;
What about:
SELECT extract(YEAR FROM t1.dt) AS year,
extract(MONTH FROM t1.dt) AS month,
t1.count,
sum(t2.count) AS run_count
FROM (
SELECT date_trunc('month', date) AS dt, count(*) AS count
FROM users_table
GROUP BY dt
) AS t1
LEFT JOIN tmp t2 ON t2.dt <= t1.dt
GROUP BY year, month, t1.count
ORDER BY year, month;
Regards,
Gerhard
Вложения
2008/12/4 Carson Farmer <carson.farmer@gmail.com>: > Hi list, > > This is my first post to pgsql, so hopefully I'm not asking something that > has been answered a thousand time before. I've looked online, and through > the archives, but I haven't found anything that answers my question > specifically: > > Say I have a table like this: > > date | user > ------------------+--------------------- > 20050201 | Bill > 20050210 | Steve > 20050224 | Sally > 20050311 | Martha > 20050316 | Ryan > 20050322 | Phil > 20050330 | William > 20050415 | Mary > 20050428 | Susan > 20050503 | Jim > > and I want to run a query that returns a *count* of the number of users > *each month*, ordered by year and *month*, with an additional column that is > a *running total of the count*, as in: > > year | month | count | run_count > -------------------+----------------+----------------+----------------- > 2005 | 02 | 3 | 3 2005 > | 03 | 4 | 7 > 2005 | 04 | 2 | 9 > 2005 | 05 | 1 | 10 > > I can get almost everything I want with: > > SELECT a.year, a.month, a.count, a.month_name, sum(b.count) AS total > FROM (SELECT EXTRACT(year from added_date) AS year, > EXTRACT(month FROM added_date) AS month, > TO_CHAR(added_date, 'Month') AS month_name, > COUNT(*) AS count FROM users_table GROUP BY 1, 2, 3) AS a, > (SELECT EXTRACT(year FROM added_date) AS year, > EXTRACT(month FROM added_date) AS month, > TO_CHAR(added_date, 'Month') AS month_name, > COUNT(*) AS count > FROM users_table GROUP BY 1, 2, 3) AS b > WHERE a.year >= b.year AND a.month >= b.month > GROUP BY 1, 2, 3, 4 > ORDER BY a.year, a.month asc; > > but I can't quite figure out the running total of the count. The above > example works right up to the end of the first year, then the values no > longer make sense. My guess is it's something to do with my WHERE clause, > but I can't think of a better way to do things. > > Any ideas? > hate selfjoins. It is really slow for any bigger datasets. Write SRF function (stored function that returns table). Regards Pavel Stehule > Cheers, > > Carson > > > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
On Thu, Dec 04, 2008 at 05:53:06PM +0000, Carson Farmer wrote: > Hi list, > > This is my first post to pgsql, so hopefully I'm not asking something > that has been answered a thousand time before. I've looked online, and > through the archives, but I haven't found anything that answers my > question specifically: > > Say I have a table like this: > > date | user > ------------------+--------------------- > 20050201 | Bill > 20050210 | Steve > 20050224 | Sally > 20050311 | Martha > 20050316 | Ryan > 20050322 | Phil > 20050330 | William > 20050415 | Mary > 20050428 | Susan > 20050503 | Jim > > and I want to run a query that returns a *count* of the number of users > *each month*, ordered by year and *month*, with an additional column > that is a *running total of the count*, as in: In 8.4, you'll have direct SQL support for this using OLAP a.k.a. windowing functions, so don't build too many of these dodgy hacks into your application. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Hi, I have a question concerning psql. I found that psql has a defined command '-t' and that it turns off printing of column names and result row count footers, etc. what I look for, is a command, which would turn off result row count footer, but would print column names. is there an easy way to do this? regards, Matt ---------------------------------------------------- Wirus Filipiński znowu atakuje http://klik.wp.pl/?adr=http%3A%2F%2Fprorocznia.pl%2Ff.html%3Fi%3D37160O-937882398O0&sid=575
> I have a question concerning psql. I found that psql has a defined > command '-t' and that it turns off printing of column names > and result > row count footers, etc. > > what I look for, is a command, which would turn off result row count > footer, but would print column names. > > is there an easy way to do this? Start psql with the -P footer switch. 8.3.3 works this way. Allan The material contained in this email may be confidential, privileged or copyrighted. If you are not the intended recipient,use, disclosure or copying of this information is prohibited. If you have received this document in error, pleaseadvise the sender and delete the document. Neither OneSteel nor the sender accept responsibility for any viruses containedin this email or any attachments.
>> I have a question concerning psql. I found that psql has a defined >> command '-t' and that it turns off printing of column names >> and result >> row count footers, etc. >> >> what I look for, is a command, which would turn off result row count >> footer, but would print column names. >> >> is there an easy way to do this? >Start psql with the -P footer switch. 8.3.3 works this way. > >Allan Works:) Thx a lot! Regards Matt