Re: grouping subsets
| От | Joshua Tolley | 
|---|---|
| Тема | Re: grouping subsets | 
| Дата | |
| Msg-id | 4c52f6b3.02b28f0a.02e2.ffff9a9e@mx.google.com обсуждение исходный текст | 
| Ответ на | Re: grouping subsets (Tim Landscheidt <tim@tim-landscheidt.de>) | 
| Список | pgsql-sql | 
On Thu, Jul 22, 2010 at 11:31:23AM +0000, Tim Landscheidt wrote:
> Richard Huxton <dev@archonet.com> wrote:
>
> >>> What I want to get is the values grouped by "subset", where a subset is a set of rows with identical column until
thecolum changes. 
> >>> Is there a way to get
>
> >>> | 2 | B |
> >>> | 4 | C |
> >>> | 4 | B |
> >>> | 3 | D |
>
> >>> by SQL only?
>
> >> I think, the problem is that there are 2 identical groups. I think, you
> >> can write a pl/pgsql-proc, selecting all ordered by the date-field and
> >> walking through the result to do the grouping, checking if the 2nd
> >> column is different from the previous.
>
> >> With plain SQL it's maybe possible too, but i don't know how ...
>
> > It should be do-able in 8.4 onwards, look into windowing
> > functions. In particular the lag() function:
>
> > SELECT
> >     mycode,
> >     mydate,
> >     lag(mycode) OVER (ORDER BY mydate) AS prev_code
> > FROM
> >     mytable
> > ORDER BY mydate;
>
> > It should be possible to use that as a subquery with an
> > outer query that compares mycode=prev_code to get a run
> > length.
>
> Hmmm. Can the outer query be done without using "WITH
> RECURSIVE"?
How about this:
select   a, b, c, d, sum
from (   select       a, b, c, d, new_partition,       sum(e) over (partition by partition_num)    from (       select
        a, b, c, d, e,           case when               lag(a, 1, null) over (order by d) is null or
lag(a,1, null) over (order by d) != a or               lag(b, 1, null) over (order by d) is null or
lag(b,1, null) over (order by d) != b or               lag(c, 1, null) over (order by d) is null or
lag(c,1, null) over (order by d) != c                   then nextval('a')               else currval('a')           end
aspartition_num,           case when               lag(a, 1, null) over (order by d) is null or               lag(a, 1,
null)over (order by d) != a or               lag(b, 1, null) over (order by d) is null or               lag(b, 1, null)
over(order by d) != b or               lag(c, 1, null) over (order by d) is null or               lag(c, 1, null) over
(orderby d) != c                   then 'T'::boolean               else 'f'::boolean           end as new_partition
 from foo   ) bar 
) baz
where   new_partition;
Here's my test table:
5432 josh@postgres# select * from foo;a | b | c |           d            | e
---+---+---+------------------------+---1 | 9 | 1 | 2007-01-01 00:00:05-07 | 89 | 2 | 1 | 2007-01-01 00:00:10-07 | 49 |
2| 1 | 2007-01-01 00:00:15-07 | 69 | 2 | 1 | 2007-01-01 00:00:20-07 | 26 | 5 | 7 | 2007-01-01 00:00:25-07 | 34 | 9 | 0
|2007-01-01 00:00:30-07 | 04 | 9 | 0 | 2007-01-01 00:00:35-07 | 75 | 2 | 7 | 2007-01-01 00:01:25-07 | 75 | 2 | 7 |
2007-01-0100:01:30-07 | 75 | 2 | 7 | 2007-01-01 00:01:35-07 | 95 | 2 | 7 | 2007-01-01 00:01:40-07 | 25 | 2 | 7 |
2007-01-0100:01:45-07 | 55 | 2 | 7 | 2007-01-01 00:01:50-07 | 85 | 2 | 7 | 2007-01-01 00:01:55-07 | 55 | 2 | 7 |
2007-01-0100:02:00-07 | 97 | 8 | 8 | 2007-01-01 00:02:05-07 | 77 | 8 | 8 | 2007-01-01 00:02:10-07 | 89 | 3 | 0 |
2007-01-0100:02:15-07 | 09 | 3 | 0 | 2007-01-01 00:02:20-07 | 89 | 2 | 1 | 2007-01-01 00:02:25-07 | 39 | 2 | 1 |
2007-01-0100:02:30-07 | 3 
(21 rows)
...and these results...a | b | c |           d            | sum
---+---+---+------------------------+-----1 | 9 | 1 | 2007-01-01 00:00:05-07 |   89 | 2 | 1 | 2007-01-01 00:00:10-07 |
126| 5 | 7 | 2007-01-01 00:00:25-07 |   34 | 9 | 0 | 2007-01-01 00:00:30-07 |   75 | 2 | 7 | 2007-01-01 00:01:25-07 |
527| 8 | 8 | 2007-01-01 00:02:05-07 |  159 | 3 | 0 | 2007-01-01 00:02:15-07 |   89 | 2 | 1 | 2007-01-01 00:02:25-07 |
6
(8 rows)
--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com
		
	В списке pgsql-sql по дате отправления: