Обсуждение: Histogram generator
Hi, I have a table containing events with timestamps. I would like to generate a histogram of the number of each type ofevent for each half-hour period from 8:00 am to 6:00 pm. Are there any tools that will help me do this? Thanks, Patrick
Вложения
> -----Original Message----- > From: pgsql-general-owner@postgresql.org [mailto:pgsql-general- > owner@postgresql.org] On Behalf Of Patrick May > Sent: Tuesday, July 27, 2010 3:59 PM > To: pgsql-general@postgresql.org > Subject: [GENERAL] Histogram generator > > Hi, > > I have a table containing events with timestamps. I would like > to generate a histogram of the number of each type of event for each > half-hour period from 8:00 am to 6:00 pm. Are there any tools that > will help me do this? GROUP BY is essentially a histogram generator. The age() function will give you an interval. I am not sure if you want to combine half hour periods from different days with the same time or not (IOW are 8AM Tuesdayand 8AM Wednesday supposed to be grouped together or not?) I guess that if you are more specific about exactly what you want you can get a better answer. Your requirement is a littlebit vague or ambiguous.
On Jul 27, 2010, at 7:12 PM, Dann Corbit wrote:
I have a table containing events with timestamps. I would liketo generate a histogram of the number of each type of event for eachhalf-hour period from 8:00 am to 6:00 pm. Are there any tools thatwill help me do this?
GROUP BY is essentially a histogram generator.
The age() function will give you an interval.
I am not sure if you want to combine half hour periods from different days with the same time or not (IOW are 8AM Tuesday and 8AM Wednesday supposed to be grouped together or not?)
I guess that if you are more specific about exactly what you want you can get a better answer. Your requirement is a little bit vague or ambiguous.
I don't mind using Excel to generate the actual graphic. Ideally I'd get output something like this:
date start end event count
---------- -------- -------- -------- -----
2010-07-27 08:00:00 08:29:59 EVENT_1 20
2010-07-27 08:00:00 08:29:59 EVENT_2 15
2010-07-27 08:30:00 08:59:59 EVENT_1 10
2010-07-27 08:30:00 08:59:59 EVENT_3 5
I'm quite familiar with SQL, but I'm not sure how to generate the half hour intervals without hard coding them.
Thanks,
Patrick
Вложения
On Jul 27, 2010, at 6:07 PM, Patrick May wrote: > On Jul 27, 2010, at 7:12 PM, Dann Corbit wrote: >>> I have a table containing events with timestamps. I would like >>> to generate a histogram of the number of each type of event for each >>> half-hour period from 8:00 am to 6:00 pm. Are there any tools that >>> will help me do this? >> >> GROUP BY is essentially a histogram generator. >> The age() function will give you an interval. >> >> I am not sure if you want to combine half hour periods from different days with the same time or not (IOW are 8AM Tuesdayand 8AM Wednesday supposed to be grouped together or not?) >> >> I guess that if you are more specific about exactly what you want you can get a better answer. Your requirement is alittle bit vague or ambiguous. > > I don't mind using Excel to generate the actual graphic. Ideally I'd get output something like this: > > date start end event count > ---------- -------- -------- -------- ----- > 2010-07-27 08:00:00 08:29:59 EVENT_1 20 > 2010-07-27 08:00:00 08:29:59 EVENT_2 15 > 2010-07-27 08:30:00 08:59:59 EVENT_1 10 > 2010-07-27 08:30:00 08:59:59 EVENT_3 5 > > I'm quite familiar with SQL, but I'm not sure how to generate the half hour intervals without hard coding them. There's probably a better way, but something like this might work: select date_trunc('hour', foo) + interval '30 minutes' * floor(extract(minute from foo) / 30) as start, event, count(*) frombar group by 1, 2 order by 1 asc; Cheers, Steve
On Jul 27, 2010, at 9:21 PM, Steve Atkins wrote: > On Jul 27, 2010, at 6:07 PM, Patrick May wrote: >> On Jul 27, 2010, at 7:12 PM, Dann Corbit wrote: >>>> I have a table containing events with timestamps. I would like >>>> to generate a histogram of the number of each type of event for each >>>> half-hour period from 8:00 am to 6:00 pm. Are there any tools that >>>> will help me do this? >>> >>> GROUP BY is essentially a histogram generator. >>> The age() function will give you an interval. >>> >>> I am not sure if you want to combine half hour periods from different days with the same time or not (IOW are 8AM Tuesdayand 8AM Wednesday supposed to be grouped together or not?) >>> >>> I guess that if you are more specific about exactly what you want you can get a better answer. Your requirement is alittle bit vague or ambiguous. >> >> I don't mind using Excel to generate the actual graphic. Ideally I'd get output something like this: >> >> date start end event count >> ---------- -------- -------- -------- ----- >> 2010-07-27 08:00:00 08:29:59 EVENT_1 20 >> 2010-07-27 08:00:00 08:29:59 EVENT_2 15 >> 2010-07-27 08:30:00 08:59:59 EVENT_1 10 >> 2010-07-27 08:30:00 08:59:59 EVENT_3 5 >> >> I'm quite familiar with SQL, but I'm not sure how to generate the half hour intervals without hard coding them. > > There's probably a better way, but something like this might work: > > select date_trunc('hour', foo) + interval '30 minutes' * floor(extract(minute from foo) / 30) as start, event, count(*)from bar group by 1, 2 order by 1 asc; Thanks! It looks like interval is what I need to play with. Regards, Patrick
Вложения
On Tue, Jul 27, 2010 at 09:25:05PM -0400, Patrick May wrote: > On Jul 27, 2010, at 9:21 PM, Steve Atkins wrote: > > select date_trunc('hour', foo) + interval '30 minutes' * floor(extract(minute from foo) / 30) as start, event, count(*)from bar group by 1, 2 order by 1 asc; > > Thanks! It looks like interval is what I need to play with. Another useful tool to use is the classic unix "seconds since epoch". You could turn the key expression from above into: timestamp 'epoch' + interval '1 second' * floor(date_part('epoch',foo) / (30*60)) * (30*60) I'd probably go with Steve's version here, it's a bit more obvious what's going on. Also note, that if you don't really care about what the specific groups are, just that you have a set of evenly divided 30minute periods you don't need to convert back to a date, so could just use: floor(date_part('epoch',foo) / (30*60)) One final note, if you're dealing with lots of data and the above expression is slow, you could delay converting back to a date until "after" the grouping, i.e: SELECT timestamp 'epoch' + interval '1 second' * floor(date_part('epoch',foo) / (30*60)) * (30*60) AS t, COUNT(*) FROM data GROUP BY floor(date_part('epoch',foo) / (30*60)); This will save PG from converting back to a date for every row when it's going to chuck most of them away anyway. Hope that gives you some more ideas! -- Sam http://samason.me.uk/