Re: grouping subsets

Поиск
Список
Период
Сортировка
От Oliveiros d'Azevedo Cristina
Тема Re: grouping subsets
Дата
Msg-id 5BE4369386904E178DD4B537C759B2B0@marktestcr.marktest.pt
обсуждение исходный текст
Ответ на grouping subsets  (Rainer Stengele <rainer.stengele@diplan.de>)
Ответы Re: grouping subsets  (Rainer Stengele <rainer.stengele@diplan.de>)
Список pgsql-sql
I See.

And the analysis you need to do, the sum of the rows with the same keys 
(until they change) will have to be done over all
table?
Or just over some predefined interval ?

Best,
Oliveiros
----- Original Message ----- 
From: "Rainer Stengele" <rainer.stengele@diplan.de>
Newsgroups: gmane.comp.db.postgresql.sql
To: "Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>
Cc: <pgsql-sql@postgresql.org>
Sent: Friday, July 30, 2010 10:35 AM
Subject: Re: grouping subsets


> the table may include up to maybe 30 entries per day, average maybe 10-15
> After a year this makes about 10.000 entries - maximum, average about 5000 
> entries.
>
> For the problem described I have to use a Microsoft SQL database and would 
> like to use pure SQL.
> As I use postgres on my Linux servers I found this newsgroup and thought I 
> ask here.
>
> Thanks!
> Rainer
>
> Am 29.07.2010 15:31, schrieb Oliveiros d'Azevedo Cristina:
>> Fine.
>>
>> Please advice me,
>>
>> How long can
>> your table be? Thousands? Millions of records?
>>
>> Do you really need it in pure SQL
>> ?
>>
>> It seems to me that it might be possible, I'm just affraid that the query 
>> would become too complex and thus slow...
>>
>> Best,
>> Oliveiros
>>
>> ----- Original Message ----- From: "Rainer Stengele" 
>> <rainer.stengele@diplan.de>
>> To: "Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>
>> Cc: <pgsql-sql@postgresql.org>
>> Sent: Thursday, July 29, 2010 1:10 PM
>> Subject: Re: grouping subsets
>>
>>
>>> No. This is by accident.
>>> We have to assume that the combinations do change anytime, and many 
>>> times per day.
>>>
>>> So
>>>
>>> "Or is it possible to have the same combination on one day with several 
>>> sets?"
>>>
>>> YES!
>>>
>>> Rainer
>>>
>>>
>>>
>>> Am 29.07.2010 13:47, schrieb Oliveiros d'Azevedo Cristina:
>>>> Yes. This is somewhat more complicated because it has more constraints.
>>>> I've noticed that a given combination doesn't appear with holes on a 
>>>> certain day.
>>>>
>>>> For ex, on a daily basis, we have every three key combinations 
>>>> together.
>>>>
>>>> We dont have things like
>>>> 2010-7-01  1726 3212 1428
>>>> 2010-7-01  1726 3212 1428
>>>> ...                318   1846 1012
>>>> 2010-7-01  1726 3212 1428
>>>>
>>>> Can I assume that, for a certain day , the records for the same three 
>>>> combination are all together? There is just one set per day for a given 
>>>> combination?
>>>>
>>>> Or is it possible to have the same combination on one day with several 
>>>> sets?
>>>>
>>>> Best,
>>>> Oliveiros
>>>>
>>>>
>>>> ----- Original Message ----- From: "Rainer Stengele" 
>>>> <rainer.stengele@diplan.de>
>>>> Newsgroups: gmane.comp.db.postgresql.sql
>>>> To: "Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>
>>>> Cc: <>
>>>> Sent: Thursday, July 29, 2010 10:41 AM
>>>> Subject: Re: grouping subsets
>>>>
>>>>
>>>>> Howdy Cristina,
>>>>>
>>>>> unfortunately things are more complicated. I have inserted an excerpt 
>>>>> of the real data here:
>>>>>
>>>>> ================================================================================
>>>>> TableID         MasterID dtBegin                           dtEnd idR 
>>>>> idL      idB     consumption
>>>>> 4057312           295530 2010-07-01 00:59:21.077           2010-07-01 
>>>>> 01:32:59.670           1726     3212     1428     279
>>>>> 4061043           295574 2010-07-01 01:59:31.137           2010-07-01 
>>>>> 02:32:09.373           1726     3212     1428     183
>>>>> 4083397           295838 2010-07-01 07:57:51.327           2010-07-01 
>>>>> 08:28:28.117           318      1846     1012     30
>>>>> 4090858           295920 2010-07-01 09:52:33.777           2010-07-01 
>>>>> 10:31:34.393           318      1846     1012     487
>>>>> 4094589           295961 2010-07-01 10:47:59.370           2010-07-01 
>>>>> 11:32:20.903           318      1846     1012     472
>>>>> 4098330           296013 2010-07-01 11:58:53.890           2010-07-01 
>>>>> 12:31:35.730           318      1846     1012     195
>>>>> 4102069           296058 2010-07-01 12:36:19.170           2010-07-01 
>>>>> 13:32:13.950           318      1846     1012     338
>>>>> 4105809           296102 2010-07-01 13:58:53.170           2010-07-01 
>>>>> 14:02:57.710           318      1846     1012     105
>>>>> 4109555           296150 2010-07-01 14:59:11.663           2010-07-01 
>>>>> 15:32:33.810           318      1846     1012     187
>>>>> 4113305           296194 2010-07-01 15:59:01.797           2010-07-01 
>>>>> 16:02:27.260           318      1846     1012     108
>>>>> 4117048           296238 2010-07-01 16:20:47.997           2010-07-01 
>>>>> 17:32:49.367           318      1846     1012     179
>>>>> 4120791           296282 2010-07-01 17:58:27.657           2010-07-01 
>>>>> 18:29:01.733           318      1846     1012     256
>>>>> 4128291           296370 2010-07-01 19:54:17.687           2010-07-01 
>>>>> 20:32:53.850           318      1846     1012     239
>>>>> 4132044           296413 2010-07-01 20:31:37.653           2010-07-01 
>>>>> 21:29:13.497           318      1846     1012     39
>>>>> 4135797           296458 2010-07-01 21:59:13.983           2010-07-01 
>>>>> 22:32:46.503           318      1846     1012     157
>>>>> 4139572           296506 2010-07-01 22:58:49.530           2010-07-01 
>>>>> 23:32:22.543           318      1846     1012     218
>>>>> 4142941           296554 2010-07-01 23:59:13.857           2010-07-02 
>>>>> 00:32:30.390           318      1846     1012     248
>>>>> 4146289           296598 2010-07-02 00:58:55.763           2010-07-02 
>>>>> 01:32:41.983           318      1846     1012     204
>>>>> 4149616           296642 2010-07-02 01:46:57.357           2010-07-02 
>>>>> 02:32:56.983           318      1846     1012     42
>>>>> 4152952           296686 2010-07-02 02:55:19.653           2010-07-02 
>>>>> 03:32:28.013           318      1846     1012     135
>>>>> 4156289           296730 2010-07-02 03:43:52.777           2010-07-02 
>>>>> 04:32:55.250           318      1846     1012     743
>>>>> 4159624           296774 2010-07-02 04:43:15.310           2010-07-02 
>>>>> 05:32:44.547           318      1846     1012     277
>>>>> 4162961           296817 2010-07-02 05:58:59.483           2010-07-02 
>>>>> 06:32:37.340           318      1846     1012     121
>>>>> 4166303           296862 2010-07-02 06:58:50.733           2010-07-02 
>>>>> 07:32:39.113           318      1846     1012     239
>>>>> 4172981           296950 2010-07-02 07:28:55.293           2010-07-02 
>>>>> 09:33:01.200           318      1846     1012     512
>>>>> 4176322           296993 2010-07-02 09:59:04.607           2010-07-02 
>>>>> 10:33:01.903           318      1846     1012     139
>>>>> 4179667           297038 2010-07-02 10:55:27.760           2010-07-02 
>>>>> 11:32:56.560           318      1846     1012     722
>>>>> 4183012           297082 2010-07-02 11:59:33.650           2010-07-02 
>>>>> 12:32:14.700            318      1846     1012     163
>>>>> 4186351           297126 2010-07-02 12:23:45.997           2010-07-02 
>>>>> 13:32:59.500            318      1846     1012     284
>>>>> 4189689           297169 2010-07-02 13:44:21.253           2010-07-02 
>>>>> 14:18:05.080            318      1846     1012     254
>>>>> 4196371           297258 2010-07-02 16:16:19.123           2010-07-02 
>>>>> 16:32:53.437           1706     3541     1511     161
>>>>> 4199720           297301 2010-07-02 16:59:35.127           2010-07-02 
>>>>> 17:32:57.950           1706     3541     1511     250
>>>>> 4203068           297346 2010-07-02 17:59:34.027           2010-07-02 
>>>>> 18:32:54.337           1706     3541     1511     302
>>>>> 4206413           297389 2010-07-02 18:59:28.730           2010-07-02 
>>>>> 19:32:37.950           1706     3541     1511     276
>>>>> 4209758           297434 2010-07-02 19:54:00.243           2010-07-02 
>>>>> 20:32:57.433           1706     3541     1511     209
>>>>> 4213102           297473 2010-07-02 20:49:10.963           2010-07-02 
>>>>> 21:30:44.540           1706     3541     1511     76
>>>>> 4216447           297511 2010-07-02 21:59:34.810           2010-07-02 
>>>>> 22:33:00.603           1706     3541     1511     287
>>>>> 4219818           297569 2010-07-02 22:56:52.750           2010-07-02 
>>>>> 23:59:31.607           1706     3541     1511     1877
>>>>> 4219819           297570 2010-07-02 23:59:21.577           2010-07-03 
>>>>> 00:54:40.153           1706     3541     1511     1798
>>>>> 4219821           297572 2010-07-03 00:48:03.310           2010-07-03 
>>>>> 01:59:37.920           1706     3541     1511     1125
>>>>> 4219823           297574 2010-07-03 01:51:01.057           2010-07-03 
>>>>> 02:59:45.433           1706     3541     1511     1629
>>>>> 4219820           297571 2010-07-03 02:59:29.393           2010-07-03 
>>>>> 03:59:54.920           1706     3541     1511     2462
>>>>> 4219822           297573 2010-07-03 03:59:18.663           2010-07-03 
>>>>> 04:01:48.810           1706     3541     1511     70
>>>>> 4225738           297656 2010-07-03 06:13:34.980           2010-07-03 
>>>>> 06:28:09.697           1726     3212     1428     46
>>>>> 4228694           297695 2010-07-03 06:59:15.560           2010-07-03 
>>>>> 07:32:45.653           1726     3212     1428     251
>>>>> 4231649           297733 2010-07-03 07:59:11.937           2010-07-03 
>>>>> 08:32:57.217           1726     3212     1428     284
>>>>> 4234604           297771 2010-07-03 08:57:00.357           2010-07-03 
>>>>> 09:32:47.903           1726     3212     1428     227
>>>>> 4237559           297809 2010-07-03 09:59:19.813           2010-07-03 
>>>>> 10:33:02.063           1726     3212     1428     285
>>>>> 4261156           298596 2010-07-04 22:59:09.863           2010-07-04 
>>>>> 23:33:45.530           1726     3212     1428     1286
>>>>> 4264114           298646 2010-07-04 23:59:16.967           2010-07-05 
>>>>> 00:33:08.107           1726     3212     1428     297
>>>>> 4267067           298690 2010-07-05 00:59:15.187           2010-07-05 
>>>>> 01:32:48.300           1726     3212     1428     333
>>>>> 4270023           298734 2010-07-05 01:59:02.497           2010-07-05 
>>>>> 02:32:48.780           1726     3212     1428     270
>>>>> 4272977           298778 2010-07-05 02:41:43.737           2010-07-05 
>>>>> 03:32:56.043           1726     3212     1428     317
>>>>> 4275927           298822 2010-07-05 03:59:17.027           2010-07-05 
>>>>> 04:33:14.947           1726     3212     1428     1623
>>>>> ================================================================================
>>>>>
>>>>> Description:
>>>>> 1. Column: some ID
>>>>> 2. Column: reference to another table
>>>>> 3. and 4. column: timestamp from/to of the item
>>>>> 5. Column: ID R
>>>>> 6. Column: ID L
>>>>> 7. Column: ID B
>>>>> 8. Column: Sum of components
>>>>>
>>>>> Requirement:
>>>>> Sum over all components (from column 8) for each combination of ID R, 
>>>>> ID L, ID B, but (!)
>>>>> rows with same keys (R,L,B) should be summed up only until the keys 
>>>>> change.
>>>>> Do not sum up the components for identical keys, if there are other 
>>>>> keys between them.
>>>>>
>>>>> Example result:
>>>>>
>>>>> idR       idL      idB                 SUM
>>>>> 1726     3212     1428                 462
>>>>> 318     1846     1012                 ...
>>>>> 1706     3541     1511                 ...
>>>>> 1726     3212     1428                 ...
>>>>>
>>>>>
>>>>> Note that the first and last entry here has the same keys
>>>>>
>>>>> Maybe you find a similar monster SQL solving such a requirement.
>>>>> Thanks for considering!
>>>>>
>>>>> Rainer
>>>>>
>>>>>
>>>>>
>>>>> Am 27.07.2010 12:37, schrieb Oliveiros d'Azevedo Cristina:
>>>>>> Howdy, Rainer.
>>>>>>
>>>>>> It's been a while, so I don't know if you are still interested in 
>>>>>> this problem or if you, in the meantime, found yourself a solution,
>>>>>> but I've tried this on a local copy of the example you provided   and 
>>>>>> it seems to work.
>>>>>>
>>>>>> The problem is that I suspect that if you have several thousands of 
>>>>>> records on your table it will become slow...
>>>>>>
>>>>>> Best,
>>>>>> Oliveiros
>>>>>>
>>>>>> SELECT SUM(tudo.parcela),tudo.a
>>>>>> FROM
>>>>>> (
>>>>>> SELECT fo.parcela,fo.a,fo.b,fo.c,MIN(th.c) as d
>>>>>> FROM
>>>>>> (
>>>>>> SELECT se.a as parcela,se.b as a,se.c as b, MAX(pr.c) as c
>>>>>> FROM
>>>>>> yourTable se
>>>>>> LEFT JOIN
>>>>>> (
>>>>>> SELECT a.*
>>>>>> FROM yourTable a
>>>>>> JOIN yourTable b
>>>>>> ON (b.b <> a.b)
>>>>>> AND ((age(a.c,b.c) = '1 day'::interval)
>>>>>>
>>>>>> )
>>>>>> ) pr
>>>>>> ON pr.b = se.b
>>>>>> AND    se.c >= pr.c
>>>>>> GROUP BY se.a,se.b,se.c
>>>>>> ) fo
>>>>>> LEFT JOIN
>>>>>> (
>>>>>> SELECT a.*
>>>>>> FROM yourTable a
>>>>>> JOIN yourTable b
>>>>>> ON (b.b <> a.b)
>>>>>> AND ((age(a.c,b.c) = '-1 day'::interval)
>>>>>> )
>>>>>> ) th
>>>>>> ON fo.a = th.b
>>>>>> AND fo.b <= th.c
>>>>>> GROUP BY fo.parcela,fo.a,fo.b,fo.c
>>>>>> ) tudo
>>>>>> GROUP BY tudo.a,tudo.c,tudo.d
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> To: <pgsql-sql@postgresql.org>
>>>>>> Sent: Thursday, July 22, 2010 9:09 AM
>>>>>> Subject: [SQL] grouping subsets
>>>>>>
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> having a table similar to
>>>>>>>
>>>>>>> | 1 | B | [2010-07-15 Do] |
>>>>>>> | 1 | B | [2010-07-16 Fr] |
>>>>>>> |---+---+-----------------|
>>>>>>> | 2 | C | [2010-07-17 Sa] |
>>>>>>> | 2 | C | [2010-07-18 So] |
>>>>>>> |---+---+-----------------|
>>>>>>> | 1 | B | [2010-07-19 Mo] |
>>>>>>> | 1 | B | [2010-07-20 Di] |
>>>>>>> | 1 | B | [2010-07-21 Mi] |
>>>>>>> | 1 | B | [2010-07-22 Do] |
>>>>>>> |---+---+-----------------|
>>>>>>> | 3 | D | [2010-07-23 Fr] |
>>>>>>>
>>>>>>> a simple group by gives me:
>>>>>>>
>>>>>>> | 6 | B |
>>>>>>> | 4 | C |
>>>>>>> | 3 | D |
>>>>>>>
>>>>>>>
>>>>>>> What I want to get is the values grouped by "subset", where a subset 
>>>>>>> is a set of rows with identical column until the colum changes.
>>>>>>> Is there a way to get
>>>>>>>
>>>>>>> | 2 | B |
>>>>>>> | 4 | C |
>>>>>>> | 4 | B |
>>>>>>> | 3 | D |
>>>>>>>
>>>>>>> by SQL only?
>>>>>>>
>>>>>>> - Rainer
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -- 
>>>>>>> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
>>>>>>> To make changes to your subscription:
>>>>>>> http://www.postgresql.org/mailpref/pgsql-sql
>>>>>>
>>>>>>
>>>>
>>
>> 



В списке pgsql-sql по дате отправления:

Предыдущее
От: Rainer Stengele
Дата:
Сообщение: Re: grouping subsets
Следующее
От: Joshua Tolley
Дата:
Сообщение: Re: grouping subsets