Re: partitioning a dataset + employing hysteresis condition

Поиск
Список
Период
Сортировка
От Gavin Flower
Тема Re: partitioning a dataset + employing hysteresis condition
Дата
Msg-id 4EC2CB7E.10009@archidevsys.co.nz
обсуждение исходный текст
Ответ на Re: partitioning a dataset + employing hysteresis condition  (Amit Dor-Shifer <amit.dor.shifer@gmail.com>)
Ответы Re: partitioning a dataset + employing hysteresis condition  (David Johnston <polobo@yahoo.com>)
Список pgsql-general
On 14/11/11 18:35, Amit Dor-Shifer wrote:

On Mon, Nov 14, 2011 at 4:29 PM, Amit Dor-Shifer <amit.dor.shifer@gmail.com> wrote:
Hi,
I've got this table:
create table phone_calls
(
    start_time timestamp,
    device_id integer,
    term_status integer
);

It describes phone call events. A 'term_status' is a sort-of an exit status for the call, whereby a value != 0 indicates some sort of error.
Given that, I wish to retrieve data on devices with a persisting error on them, of a specific type. I.E. that their last term_status was, say 2. I'd like to employ some hysteresis on the query: only consider a device as errorring if:
1. the last "good" (0) term_status pre-dates a "bad" (2) term_status.
2. it has at least N "bad" term_status events following the last "good" one.
3. The time span between the first "bad" term_status event and the last one is >= T minutes

For instance, w/the following data set:

INSERT INTO phone_calls(start_time, device_id, term_status) VALUES (now() - interval '10 minutes', 1, 2, 0);
INSERT INTO phone_calls(start_time, device_id, term_status) VALUES (now() - interval '9 minutes', 1, 2, 1);
INSERT INTO phone_calls(start_time, device_id, term_status) VALUES (now() - interval '7 minutes', 1, 2, 1);
INSERT INTO phone_calls(start_time, device_id, term_status) VALUES (now() - interval '6 minutes', 1, 2, 1);
INSERT INTO phone_calls(start_time, device_id, term_status) VALUES (now() - interval '5 minutes', 1, 2, 0);
INSERT INTO phone_calls(start_time, device_id, term_status) VALUES (now() - interval '4 minutes', 1, 2, 2);
INSERT INTO phone_calls(start_time, device_id, term_status)VALUES (now() - interval '3 minutes', 1, 2, 2);
INSERT INTO phone_calls(start_time, device_id, term_status)VALUES (now() - interval '2 minutes', 1, 2, 2);

with N=3, T=3
The query should return device_id 2 as errorring, as it registered 3 "bad" events for at least 3 minutes.

I assume some partitioning needs to be employed here, but am not very sure-footed on the subject.

Would appreciate some guidance.
10x,

... fixed data set:

INSERT INTO phone_calls(start_time, device_id, term_status) VALUES (now() - interval '10 minutes', 2, 0);
INSERT INTO phone_calls(start_time, device_id, term_status) VALUES (now() - interval '9 minutes', 2, 1);
INSERT INTO phone_calls(start_time, device_id, term_status) VALUES (now() - interval '7 minutes', 2, 1);
INSERT INTO phone_calls(start_time, device_id, term_status) VALUES (now() - interval '6 minutes', 2, 1);
INSERT INTO phone_calls(start_time, device_id, term_status) VALUES (now() - interval '5 minutes', 2, 0);
INSERT INTO phone_calls(start_time, device_id, term_status) VALUES (now() - interval '4 minutes', 2, 2);
INSERT INTO phone_calls(start_time, device_id, term_status)VALUES (now() - interval '3 minutes', 2, 2);
INSERT INTO phone_calls(start_time, device_id, term_status)VALUES (now() - interval '2 minutes', 2, 2);
Hi ,

This is my attempt...

DROP TABLE IF EXISTS phone_call;


CREATE TABLE phone_call
(
    device_id   int         NOT NULL,
    start_time  timestamptz NOT NULL,
    term_status int         NOT NULL,
    PRIMARY KEY (device_id, start_time, term_status)
);


INSERT INTO
    phone_call
    (
        device_id,
        start_time,
        term_status
    )
VALUES
    (10, '20100701T151433', 0),
    (20, '20100701T151533', 0),
    (20, '20100701T151633', 2),
    (30, '20100701T151433', 0),
    (30, '20100701T151533', 2),
    (30, '20100701T151633', 2),
    (40, '20100701T004022', 0),
    (40, '20100701T004122', 2),
    (40, '20100701T004622', 2),
    (40, '20100701T010022', 2),
    (40, '20100701T012122', 2),
    (50, '20100701T120000', 0),
    (50, '20100701T120100', 2),
    (50, '20100701T120200', 2),
    (50, '20100701T120300', 2),
    (60, '20100701T090000', 0),
    (60, '20100701T090200', 2),
    (60, '20100701T100000', 0),
    (60, '20100701T100100', 2),
    (60, '20100701T100200', 2),
    (60, '20100701T100300', 2),
    (60, '20100701T101000', 2),
    (60, '20100701T102000', 2),
    (60, '20100701T104000', 2),
    (60, '20100701T105000', 2),
    (60, '20100701T105200', 2),
    (60, '20100701T105600', 2),
    (60, '20100701T500300', 0),
    (60, '20100701T501400', 2);
INSERT INTO phone_call(start_time, device_id, term_status) VALUES (now() - interval '10 minutes', 2, 0);
INSERT INTO phone_call(start_time, device_id, term_status) VALUES (now() - interval '9 minutes', 2, 1);
INSERT INTO phone_call(start_time, device_id, term_status) VALUES (now() - interval '7 minutes', 2, 1);
INSERT INTO phone_call(start_time, device_id, term_status) VALUES (now() - interval '6 minutes', 2, 1);
INSERT INTO phone_call(start_time, device_id, term_status) VALUES (now() - interval '5 minutes', 2, 0);
INSERT INTO phone_call(start_time, device_id, term_status) VALUES (now() - interval '4 minutes', 2, 2);
INSERT INTO phone_call(start_time, device_id, term_status)VALUES (now() - interval '3 minutes', 2, 2);
INSERT INTO phone_call(start_time, device_id, term_status)VALUES (now() - interval '2 minutes', 2, 2);   

TABLE phone_call;

SELECT DISTINCT
    pc.device_id
FROM
    phone_call  pc
WHERE
    pc.term_status = 0
    AND 3 <=
    (
        SELECT count(*)
        FROM phone_call  pc1
        WHERE
                pc1.device_id = pc.device_id
            AND pc1.term_status = 2
            AND pc1.start_time > pc.start_time
            AND pc1.start_time <
            COALESCE
            (
                (
                    SELECT min(pc2.start_time)
                    FROM phone_call  pc2
                    WHERE
                            pc2.term_status = 0
                        AND pc2.start_time > pc.start_time
                )
                , '9999-12-31'
            )
        GROUP BY
            pc1.device_id
        HAVING
            max(pc1.start_time) >= min(pc1.start_time) + interval '2 minute'
    )
ORDER BY
    pc.device_id
/**/;/**/






Regards,
Gavin

В списке pgsql-general по дате отправления:

Предыдущее
От: Scott Marlowe
Дата:
Сообщение: Re: how to adjust auto increment id offset?
Следующее
От: Gavin Flower
Дата:
Сообщение: Re: partitioning a dataset + employing hysteresis condition