Re: grouping subsets

Поиск
Список
Период
Сортировка
От Rainer Stengele
Тема Re: grouping subsets
Дата
Msg-id 4C529CF6.5040408@diplan.de
обсуждение исходный текст
Ответ на Re: grouping subsets  ("Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>)
Список pgsql-sql
the table may include up to maybe 30 entries per day, average maybe 10-15
After a year this makes about 10.000 entries - maximum, average about 5000 entries.

For the problem described I have to use a Microsoft SQL database and would like to use pure SQL.
As I use postgres on my Linux servers I found this newsgroup and thought I ask here.

Thanks!
Rainer

Am 29.07.2010 15:31, schrieb Oliveiros d'Azevedo Cristina:
> Fine.
> 
> Please advice me,
> 
> How long can
> your table be? Thousands? Millions of records?
> 
> Do you really need it in pure SQL
> ?
> 
> It seems to me that it might be possible, I'm just affraid that the query would become too complex and thus slow...
> 
> Best,
> Oliveiros
> 
> ----- Original Message ----- From: "Rainer Stengele" <rainer.stengele@diplan.de>
> To: "Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>
> Cc: <pgsql-sql@postgresql.org>
> Sent: Thursday, July 29, 2010 1:10 PM
> Subject: Re: grouping subsets
> 
> 
>> No. This is by accident.
>> We have to assume that the combinations do change anytime, and many times per day.
>>
>> So
>>
>> "Or is it possible to have the same combination on one day with several sets?"
>>
>> YES!
>>
>> Rainer
>>
>>
>>
>> Am 29.07.2010 13:47, schrieb Oliveiros d'Azevedo Cristina:
>>> Yes. This is somewhat more complicated because it has more constraints.
>>> I've noticed that a given combination doesn't appear with holes on a certain day.
>>>
>>> For ex, on a daily basis, we have every three key combinations together.
>>>
>>> We dont have things like
>>> 2010-7-01  1726 3212 1428
>>> 2010-7-01  1726 3212 1428
>>> ...                318   1846 1012
>>> 2010-7-01  1726 3212 1428
>>>
>>> Can I assume that, for a certain day , the records for the same three combination are all together? There is just
oneset per day for a given combination?
 
>>>
>>> Or is it possible to have the same combination on one day with several sets?
>>>
>>> Best,
>>> Oliveiros
>>>
>>>
>>> ----- Original Message ----- From: "Rainer Stengele" <rainer.stengele@diplan.de>
>>> Newsgroups: gmane.comp.db.postgresql.sql
>>> To: "Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>
>>> Cc: <>
>>> Sent: Thursday, July 29, 2010 10:41 AM
>>> Subject: Re: grouping subsets
>>>
>>>
>>>> Howdy Cristina,
>>>>
>>>> unfortunately things are more complicated. I have inserted an excerpt of the real data here:
>>>>
>>>> ================================================================================
>>>> TableID         MasterID dtBegin                           dtEnd idR idL      idB     consumption
>>>> 4057312           295530 2010-07-01 00:59:21.077           2010-07-01 01:32:59.670           1726     3212
1428    279
 
>>>> 4061043           295574 2010-07-01 01:59:31.137           2010-07-01 02:32:09.373           1726     3212
1428    183
 
>>>> 4083397           295838 2010-07-01 07:57:51.327           2010-07-01 08:28:28.117           318      1846
1012    30
 
>>>> 4090858           295920 2010-07-01 09:52:33.777           2010-07-01 10:31:34.393           318      1846
1012    487
 
>>>> 4094589           295961 2010-07-01 10:47:59.370           2010-07-01 11:32:20.903           318      1846
1012    472
 
>>>> 4098330           296013 2010-07-01 11:58:53.890           2010-07-01 12:31:35.730           318      1846
1012    195
 
>>>> 4102069           296058 2010-07-01 12:36:19.170           2010-07-01 13:32:13.950           318      1846
1012    338
 
>>>> 4105809           296102 2010-07-01 13:58:53.170           2010-07-01 14:02:57.710           318      1846
1012    105
 
>>>> 4109555           296150 2010-07-01 14:59:11.663           2010-07-01 15:32:33.810           318      1846
1012    187
 
>>>> 4113305           296194 2010-07-01 15:59:01.797           2010-07-01 16:02:27.260           318      1846
1012    108
 
>>>> 4117048           296238 2010-07-01 16:20:47.997           2010-07-01 17:32:49.367           318      1846
1012    179
 
>>>> 4120791           296282 2010-07-01 17:58:27.657           2010-07-01 18:29:01.733           318      1846
1012    256
 
>>>> 4128291           296370 2010-07-01 19:54:17.687           2010-07-01 20:32:53.850           318      1846
1012    239
 
>>>> 4132044           296413 2010-07-01 20:31:37.653           2010-07-01 21:29:13.497           318      1846
1012    39
 
>>>> 4135797           296458 2010-07-01 21:59:13.983           2010-07-01 22:32:46.503           318      1846
1012    157
 
>>>> 4139572           296506 2010-07-01 22:58:49.530           2010-07-01 23:32:22.543           318      1846
1012    218
 
>>>> 4142941           296554 2010-07-01 23:59:13.857           2010-07-02 00:32:30.390           318      1846
1012    248
 
>>>> 4146289           296598 2010-07-02 00:58:55.763           2010-07-02 01:32:41.983           318      1846
1012    204
 
>>>> 4149616           296642 2010-07-02 01:46:57.357           2010-07-02 02:32:56.983           318      1846
1012    42
 
>>>> 4152952           296686 2010-07-02 02:55:19.653           2010-07-02 03:32:28.013           318      1846
1012    135
 
>>>> 4156289           296730 2010-07-02 03:43:52.777           2010-07-02 04:32:55.250           318      1846
1012    743
 
>>>> 4159624           296774 2010-07-02 04:43:15.310           2010-07-02 05:32:44.547           318      1846
1012    277
 
>>>> 4162961           296817 2010-07-02 05:58:59.483           2010-07-02 06:32:37.340           318      1846
1012    121
 
>>>> 4166303           296862 2010-07-02 06:58:50.733           2010-07-02 07:32:39.113           318      1846
1012    239
 
>>>> 4172981           296950 2010-07-02 07:28:55.293           2010-07-02 09:33:01.200           318      1846
1012    512
 
>>>> 4176322           296993 2010-07-02 09:59:04.607           2010-07-02 10:33:01.903           318      1846
1012    139
 
>>>> 4179667           297038 2010-07-02 10:55:27.760           2010-07-02 11:32:56.560           318      1846
1012    722
 
>>>> 4183012           297082 2010-07-02 11:59:33.650           2010-07-02 12:32:14.700            318      1846
1012    163
 
>>>> 4186351           297126 2010-07-02 12:23:45.997           2010-07-02 13:32:59.500            318      1846
1012    284
 
>>>> 4189689           297169 2010-07-02 13:44:21.253           2010-07-02 14:18:05.080            318      1846
1012    254
 
>>>> 4196371           297258 2010-07-02 16:16:19.123           2010-07-02 16:32:53.437           1706     3541
1511    161
 
>>>> 4199720           297301 2010-07-02 16:59:35.127           2010-07-02 17:32:57.950           1706     3541
1511    250
 
>>>> 4203068           297346 2010-07-02 17:59:34.027           2010-07-02 18:32:54.337           1706     3541
1511    302
 
>>>> 4206413           297389 2010-07-02 18:59:28.730           2010-07-02 19:32:37.950           1706     3541
1511    276
 
>>>> 4209758           297434 2010-07-02 19:54:00.243           2010-07-02 20:32:57.433           1706     3541
1511    209
 
>>>> 4213102           297473 2010-07-02 20:49:10.963           2010-07-02 21:30:44.540           1706     3541
1511    76
 
>>>> 4216447           297511 2010-07-02 21:59:34.810           2010-07-02 22:33:00.603           1706     3541
1511    287
 
>>>> 4219818           297569 2010-07-02 22:56:52.750           2010-07-02 23:59:31.607           1706     3541
1511    1877
 
>>>> 4219819           297570 2010-07-02 23:59:21.577           2010-07-03 00:54:40.153           1706     3541
1511    1798
 
>>>> 4219821           297572 2010-07-03 00:48:03.310           2010-07-03 01:59:37.920           1706     3541
1511    1125
 
>>>> 4219823           297574 2010-07-03 01:51:01.057           2010-07-03 02:59:45.433           1706     3541
1511    1629
 
>>>> 4219820           297571 2010-07-03 02:59:29.393           2010-07-03 03:59:54.920           1706     3541
1511    2462
 
>>>> 4219822           297573 2010-07-03 03:59:18.663           2010-07-03 04:01:48.810           1706     3541
1511    70 
>>>> 4225738           297656 2010-07-03 06:13:34.980           2010-07-03 06:28:09.697           1726     3212
1428    46
 
>>>> 4228694           297695 2010-07-03 06:59:15.560           2010-07-03 07:32:45.653           1726     3212
1428    251
 
>>>> 4231649           297733 2010-07-03 07:59:11.937           2010-07-03 08:32:57.217           1726     3212
1428    284
 
>>>> 4234604           297771 2010-07-03 08:57:00.357           2010-07-03 09:32:47.903           1726     3212
1428    227
 
>>>> 4237559           297809 2010-07-03 09:59:19.813           2010-07-03 10:33:02.063           1726     3212
1428    285
 
>>>> 4261156           298596 2010-07-04 22:59:09.863           2010-07-04 23:33:45.530           1726     3212
1428    1286
 
>>>> 4264114           298646 2010-07-04 23:59:16.967           2010-07-05 00:33:08.107           1726     3212
1428    297
 
>>>> 4267067           298690 2010-07-05 00:59:15.187           2010-07-05 01:32:48.300           1726     3212
1428    333
 
>>>> 4270023           298734 2010-07-05 01:59:02.497           2010-07-05 02:32:48.780           1726     3212
1428    270
 
>>>> 4272977           298778 2010-07-05 02:41:43.737           2010-07-05 03:32:56.043           1726     3212
1428    317
 
>>>> 4275927           298822 2010-07-05 03:59:17.027           2010-07-05 04:33:14.947           1726     3212
1428    1623
 
>>>> ================================================================================
>>>>
>>>> Description:
>>>> 1. Column: some ID
>>>> 2. Column: reference to another table
>>>> 3. and 4. column: timestamp from/to of the item
>>>> 5. Column: ID R
>>>> 6. Column: ID L
>>>> 7. Column: ID B
>>>> 8. Column: Sum of components
>>>>
>>>> Requirement:
>>>> Sum over all components (from column 8) for each combination of ID R, ID L, ID B, but (!)
>>>> rows with same keys (R,L,B) should be summed up only until the keys change.
>>>> Do not sum up the components for identical keys, if there are other keys between them.
>>>>
>>>> Example result:
>>>>
>>>> idR       idL      idB                 SUM
>>>> 1726     3212     1428                 462
>>>> 318     1846     1012                 ...
>>>> 1706     3541     1511                 ...
>>>> 1726     3212     1428                 ...
>>>>
>>>>
>>>> Note that the first and last entry here has the same keys
>>>>
>>>> Maybe you find a similar monster SQL solving such a requirement.
>>>> Thanks for considering!
>>>>
>>>> Rainer
>>>>
>>>>
>>>>
>>>> Am 27.07.2010 12:37, schrieb Oliveiros d'Azevedo Cristina:
>>>>> Howdy, Rainer.
>>>>>
>>>>> It's been a while, so I don't know if you are still interested in this problem or if you, in the meantime, found
yourselfa solution,
 
>>>>> but I've tried this on a local copy of the example you provided   and it seems to work.
>>>>>
>>>>> The problem is that I suspect that if you have several thousands of records on your table it will become slow...
>>>>>
>>>>> Best,
>>>>> Oliveiros
>>>>>
>>>>> SELECT SUM(tudo.parcela),tudo.a
>>>>> FROM
>>>>> (
>>>>> SELECT fo.parcela,fo.a,fo.b,fo.c,MIN(th.c) as d
>>>>> FROM
>>>>> (
>>>>> SELECT se.a as parcela,se.b as a,se.c as b, MAX(pr.c) as c
>>>>> FROM
>>>>> yourTable se
>>>>> LEFT JOIN
>>>>> (
>>>>> SELECT a.*
>>>>> FROM yourTable a
>>>>> JOIN yourTable b
>>>>> ON (b.b <> a.b)
>>>>> AND ((age(a.c,b.c) = '1 day'::interval)
>>>>>
>>>>> )
>>>>> ) pr
>>>>> ON pr.b = se.b
>>>>> AND    se.c >= pr.c
>>>>> GROUP BY se.a,se.b,se.c
>>>>> ) fo
>>>>> LEFT JOIN
>>>>> (
>>>>> SELECT a.*
>>>>> FROM yourTable a
>>>>> JOIN yourTable b
>>>>> ON (b.b <> a.b)
>>>>> AND ((age(a.c,b.c) = '-1 day'::interval)
>>>>> )
>>>>> ) th
>>>>> ON fo.a = th.b
>>>>> AND fo.b <= th.c
>>>>> GROUP BY fo.parcela,fo.a,fo.b,fo.c
>>>>> ) tudo
>>>>> GROUP BY tudo.a,tudo.c,tudo.d
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> To: <pgsql-sql@postgresql.org>
>>>>> Sent: Thursday, July 22, 2010 9:09 AM
>>>>> Subject: [SQL] grouping subsets
>>>>>
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> having a table similar to
>>>>>>
>>>>>> | 1 | B | [2010-07-15 Do] |
>>>>>> | 1 | B | [2010-07-16 Fr] |
>>>>>> |---+---+-----------------|
>>>>>> | 2 | C | [2010-07-17 Sa] |
>>>>>> | 2 | C | [2010-07-18 So] |
>>>>>> |---+---+-----------------|
>>>>>> | 1 | B | [2010-07-19 Mo] |
>>>>>> | 1 | B | [2010-07-20 Di] |
>>>>>> | 1 | B | [2010-07-21 Mi] |
>>>>>> | 1 | B | [2010-07-22 Do] |
>>>>>> |---+---+-----------------|
>>>>>> | 3 | D | [2010-07-23 Fr] |
>>>>>>
>>>>>> a simple group by gives me:
>>>>>>
>>>>>> | 6 | B |
>>>>>> | 4 | C |
>>>>>> | 3 | D |
>>>>>>
>>>>>>
>>>>>> What I want to get is the values grouped by "subset", where a subset is a set of rows with identical column
untilthe colum changes.
 
>>>>>> Is there a way to get
>>>>>>
>>>>>> | 2 | B |
>>>>>> | 4 | C |
>>>>>> | 4 | B |
>>>>>> | 3 | D |
>>>>>>
>>>>>> by SQL only?
>>>>>>
>>>>>> - Rainer
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
>>>>>> To make changes to your subscription:
>>>>>> http://www.postgresql.org/mailpref/pgsql-sql
>>>>>
>>>>>
>>>
> 
> 



В списке pgsql-sql по дате отправления:

Предыдущее
От: venkat
Дата:
Сообщение: Re: How to get geometry enabled Tables form Postgresql/postgis
Следующее
От: "Oliveiros d'Azevedo Cristina"
Дата:
Сообщение: Re: grouping subsets