Re: Statistics query

Поиск

Список

Период

Сортировка

От	Steve Crawford
Тема	Re: Statistics query
Дата	10 апреля 2013 г. 20:51:58
Msg-id	5165A6B8.1000500@pinpointresearch.com обсуждение исходный текст
Ответ на	Re: Statistics query (Chris Curvey <chris@chriscurvey.com>)
Ответы	Re: SOLVED Statistics query
Список	pgsql-general

Дерево обсуждения

On 04/10/2013 10:31 AM, Chris Curvey wrote:

On Wed, Apr 10, 2013 at 12:30 PM, Steve Crawford <scrawford@pinpointresearch.com> wrote:
I'm seeking ideas on the best way to craft the following query. I've stripped everything down to the bare essentials and simplified it below.

Input data has a timestamp (actually an int received from the system in the form of a Unix epoch), a unit identifier and a status:

event_time | unit_id | status
------------+---------+--------
1357056011 | 60 | 1
1357056012 | 178 | 0
1357056019 | 168 | 0
1357056021 | 3 | 0
1357056021 | 4 | 1
1357056021 | 179 | 0
1357056022 | 0 | 1
1357056022 | 1 | 0
1357056023 | 2 | 0
1357056024 | 9 | 0
1357056025 | 5 | 0
1357056025 | 6 | 0
1357056026 | 7 | 1
...

A given unit_id cannot have two events at the same time (enforced by constraints).

Given a point in time I would like to:

1. Identify all distinct unit_ids with an entry that exists in the preceding hour then

2. Count both the total events and sum the status=1 events for the most recent 50 events for each unit_id that fall within a limited period (e.g. don't look at data earlier than midnight). So unit_id 60 might have 50 events in the last 15 minutes while unit_id 4 might have only 12 events after midnight.

The output would look something like:

unit_id | events | status_1_count
---------+--------+----------------
1 | 50 | 34
2 | 27 | 18
1 | 50 | 34
1 | 2 | 0
...

Each sub-portion is easy and while I could use external processing or set-returning functions I was hoping first to find the secret-sauce to glue everything together into a single query.

Cheers,
Steve

something like

select unit_id, count(*), sum(status)
from mytable a
where event_time >= [whatever unix epoch translates to "last midnight"]
and exists
( select *
from mytable b
where b.unit_id = a.unit_id
and b.epoch >= [unix epoch that translates to "one hour ago"])
group by unit _id;

1) I think figuring out the unix epoch should be reasonable...but I don't know how to do it off the top of my head.
2) I could completely be misunderstanding this. I'm not sure why the example results would have unit id 1 repeated. (which my suggestion WON'T do)

Because I screwed up cutting and pasting to make an example. The unit_id in the output should, in fact, be distinct:

unit_id | events | status_1_count ---------+--------+---------------- 1 | 50 | 34 2 | 27 | 18 3 | 50 | 34 4 | 2 | 0

You are correct, epoch is easy:
abstime(epoch)
or
extract(epoch from timestamptz)
depending on which direction you are going or for an hour difference just subtract 3600.

The solution, however, misses the important complicating gotcha. The units I want listed are only those that have had at least one event in the last hour. But for each such unit, I only want the statistics to reflect the most-recent 50 events (even if those events occurred earlier than the current hour) provided the event occurred on the current date. So the events column can never be less than 1 nor more than 50.

For example...

One unit might have a single event at the start of the last hour but 49 more in the preceding 10 minutes. I want to see that unit and the stats for those 50 events.

Same thing if a unit has 50 events clustered at the end of an hour - I don't want the earlier ones.

Another might have 50 events early in the day but none this hour. I don't want to see that one.

But I do want to see the one that had an event in the last hour late in the day along with the 48 other events that have accumulated since midnight.

Cheers,
Steve

В списке pgsql-general по дате отправления:

Предыдущее

От: Chris Curvey
Дата: 10 апреля 2013 г., 20:31:55
Сообщение: Re: Statistics query

Следующее

От: John R Pierce
Дата: 10 апреля 2013 г., 22:28:34
Сообщение: Re: How to convert US date format to European date format ?

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Statistics query

Предыдущее

Следующее