Обсуждение: Histogram generator

Поиск
Список
Период
Сортировка

Histogram generator

От
Patrick May
Дата:
Hi,

    I have a table containing events with timestamps.  I would like to generate a histogram of the number of each type
ofevent for each half-hour period from 8:00 am to 6:00 pm.  Are there any tools that will help me do this? 

Thanks,

Patrick


Вложения

Re: Histogram generator

От
Dann Corbit
Дата:
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
> owner@postgresql.org] On Behalf Of Patrick May
> Sent: Tuesday, July 27, 2010 3:59 PM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] Histogram generator
>
> Hi,
>
>     I have a table containing events with timestamps.  I would like
> to generate a histogram of the number of each type of event for each
> half-hour period from 8:00 am to 6:00 pm.  Are there any tools that
> will help me do this?

GROUP BY is essentially a histogram generator.
The age() function will give you an interval.

I am not sure if you want to combine half hour periods from different days with the same time or not (IOW are 8AM
Tuesdayand 8AM Wednesday supposed to be grouped together or not?) 

I guess that if you are more specific about exactly what you want you can get a better answer.  Your requirement is a
littlebit vague or ambiguous. 

Re: Histogram generator

От
Patrick May
Дата:
On Jul 27, 2010, at 7:12 PM, Dann Corbit wrote:
I have a table containing events with timestamps.  I would like
to generate a histogram of the number of each type of event for each
half-hour period from 8:00 am to 6:00 pm.  Are there any tools that
will help me do this?

GROUP BY is essentially a histogram generator.
The age() function will give you an interval.

I am not sure if you want to combine half hour periods from different days with the same time or not (IOW are 8AM Tuesday and 8AM Wednesday supposed to be grouped together or not?)

I guess that if you are more specific about exactly what you want you can get a better answer.  Your requirement is a little bit vague or ambiguous.

I don't mind using Excel to generate the actual graphic.  Ideally I'd get output something like this:

      date    start      end    event count
---------- -------- -------- -------- -----
2010-07-27 08:00:00 08:29:59  EVENT_1    20
2010-07-27 08:00:00 08:29:59  EVENT_2    15
2010-07-27 08:30:00 08:59:59  EVENT_1    10
2010-07-27 08:30:00 08:59:59  EVENT_3     5

I'm quite familiar with SQL, but I'm not sure how to generate the half hour intervals without hard coding them.

Thanks,

Patrick

Вложения

Re: Histogram generator

От
Steve Atkins
Дата:
On Jul 27, 2010, at 6:07 PM, Patrick May wrote:

> On Jul 27, 2010, at 7:12 PM, Dann Corbit wrote:
>>>     I have a table containing events with timestamps.  I would like
>>> to generate a histogram of the number of each type of event for each
>>> half-hour period from 8:00 am to 6:00 pm.  Are there any tools that
>>> will help me do this?
>>
>> GROUP BY is essentially a histogram generator.
>> The age() function will give you an interval.
>>
>> I am not sure if you want to combine half hour periods from different days with the same time or not (IOW are 8AM
Tuesdayand 8AM Wednesday supposed to be grouped together or not?) 
>>
>> I guess that if you are more specific about exactly what you want you can get a better answer.  Your requirement is
alittle bit vague or ambiguous. 
>
>     I don't mind using Excel to generate the actual graphic.  Ideally I'd get output something like this:
>
>       date    start      end    event count
> ---------- -------- -------- -------- -----
> 2010-07-27 08:00:00 08:29:59  EVENT_1    20
> 2010-07-27 08:00:00 08:29:59  EVENT_2    15
> 2010-07-27 08:30:00 08:59:59  EVENT_1    10
> 2010-07-27 08:30:00 08:59:59  EVENT_3     5
>
> I'm quite familiar with SQL, but I'm not sure how to generate the half hour intervals without hard coding them.

There's probably a better way, but something like this might work:

select date_trunc('hour', foo) + interval '30 minutes' * floor(extract(minute from foo) / 30) as start, event, count(*)
frombar group by 1, 2 order by 1 asc; 

Cheers,
  Steve


Re: Histogram generator

От
Patrick May
Дата:
On Jul 27, 2010, at 9:21 PM, Steve Atkins wrote:
> On Jul 27, 2010, at 6:07 PM, Patrick May wrote:
>> On Jul 27, 2010, at 7:12 PM, Dann Corbit wrote:
>>>>     I have a table containing events with timestamps.  I would like
>>>> to generate a histogram of the number of each type of event for each
>>>> half-hour period from 8:00 am to 6:00 pm.  Are there any tools that
>>>> will help me do this?
>>>
>>> GROUP BY is essentially a histogram generator.
>>> The age() function will give you an interval.
>>>
>>> I am not sure if you want to combine half hour periods from different days with the same time or not (IOW are 8AM
Tuesdayand 8AM Wednesday supposed to be grouped together or not?) 
>>>
>>> I guess that if you are more specific about exactly what you want you can get a better answer.  Your requirement is
alittle bit vague or ambiguous. 
>>
>>     I don't mind using Excel to generate the actual graphic.  Ideally I'd get output something like this:
>>
>>      date    start      end    event count
>> ---------- -------- -------- -------- -----
>> 2010-07-27 08:00:00 08:29:59  EVENT_1    20
>> 2010-07-27 08:00:00 08:29:59  EVENT_2    15
>> 2010-07-27 08:30:00 08:59:59  EVENT_1    10
>> 2010-07-27 08:30:00 08:59:59  EVENT_3     5
>>
>> I'm quite familiar with SQL, but I'm not sure how to generate the half hour intervals without hard coding them.
>
> There's probably a better way, but something like this might work:
>
> select date_trunc('hour', foo) + interval '30 minutes' * floor(extract(minute from foo) / 30) as start, event,
count(*)from bar group by 1, 2 order by 1 asc; 

    Thanks!  It looks like interval is what I need to play with.

Regards,

Patrick


Вложения

Re: Histogram generator

От
Sam Mason
Дата:
On Tue, Jul 27, 2010 at 09:25:05PM -0400, Patrick May wrote:
> On Jul 27, 2010, at 9:21 PM, Steve Atkins wrote:
> > select date_trunc('hour', foo) + interval '30 minutes' * floor(extract(minute from foo) / 30) as start, event,
count(*)from bar group by 1, 2 order by 1 asc; 
>
>     Thanks!  It looks like interval is what I need to play with.

Another useful tool to use is the classic unix "seconds since epoch".
You could turn the key expression from above into:

  timestamp 'epoch' + interval '1 second' * floor(date_part('epoch',foo) / (30*60)) * (30*60)

I'd probably go with Steve's version here, it's a bit more obvious
what's going on.  Also note, that if you don't really care about what
the specific groups are, just that you have a set of evenly divided
30minute periods you don't need to convert back to a date, so could just
use:

  floor(date_part('epoch',foo) / (30*60))

One final note, if you're dealing with lots of data and the above
expression is slow, you could delay converting back to a date until
"after" the grouping, i.e:

  SELECT timestamp 'epoch' + interval '1 second' * floor(date_part('epoch',foo) / (30*60)) * (30*60) AS t, COUNT(*)
  FROM data
  GROUP BY floor(date_part('epoch',foo) / (30*60));

This will save PG from converting back to a date for every row when it's
going to chuck most of them away anyway.

Hope that gives you some more ideas!

--
  Sam  http://samason.me.uk/