Re: partial time stamp query
От | Brent Wood |
---|---|
Тема | Re: partial time stamp query |
Дата | |
Msg-id | B30242D206AB9543A3406649674DB4198EFB1006@hamwexdr01.niwa.local обсуждение исходный текст |
Ответ на | Fwd: partial time stamp query (Kirk Wythers <wythe001@umn.edu>) |
Список | pgsql-general |
Hi Kirk, We have a (near) real time data database for instrument observations from o= ur research vessels. All observations (summarised to one minute intervals -= the actual raw data is in netCDF, this database makes for easier access & = meets most users needs) go into a single table, with other tables for metad= ata about platforms, instruments, etc. Now approaching 350m records, so rea= sonably substantial. Underlying optimisations include partitioned readings table, with a separate partition for each year (now 23= years) clustered index on timestamp for the previous years partitions. largeish filesystem block size - tested to work well with the clustered ind= ex & small size records) These generally make a big difference to performance. To address one issue,= much like yours, where some users want hourly data for a year, some want d= aily data for 10 years & some want 1 minute data for the last month (& some= , no doubt, want one minute data for 20+ years!) I introduced an integer co= lumn called timer. This value is set according to the time (not date) of ea= ch record. Along the lines of (from memory) :an even no of minutes after the hour is 2= , 5 minutes is 4, 10 minutes is 8, 15 minute is 16, 30 minutes is 32, 60 mi= nutes is 64, 6 hourly is 128, 12:00 AM is 256 & 12:00PM is 512. When any = timestamp is in more than one category (eg: 12:00 is all of even, 5, 15m 30= m 60 minutes), the timer value is set to the largest appropriate one. So a request for: 1 minute data is select from table; 2 minute data is select from table where timer >=3D2 and timer !=3D15 and = timer !=3D4; hourly data is select from table where timer >=3D64 and timer !=3D 15 and = timer !=3D 4; etc 5 & 15 minute add a bit of complexity, but we gave the users what they want= ed. This has worked well for years now, & we have an internal web (mapserve= r/openlayers based) application allowing users to visualise & download thei= r selected data - they choose from an interval pick list & the SQL is hidde= n. Some extra enhancements are the automatic collation of lat & lon gps rea= dings into a Postgis point for each reading record, & the automatic aggrega= tion of daily points into daily track lines, so the track for any selected = set of dates can easily be displayed on a map (the platforms are mobile ves= sels - not fixed sites) You might adapt some of these ideas for your use case? Cheers Brent Wood Programme leader: Environmental Information Delivery NIWA DDI: +64 (4) 3860529 ________________________________________ From: pgsql-general-owner@postgresql.org [pgsql-general-owner@postgresql.or= g] on behalf of Kirk Wythers [wythe001@umn.edu] Sent: Tuesday, February 05, 2013 5:58 AM To: pgsql-general@postgresql.org Subject: Fwd: [GENERAL] partial time stamp query Thanks. That worked great! Now I am trying to aggregate these same fifteen = minute to hourly. I have tried using date_trunk: date_trunc('hour', derived_tsoil_fifteen_min_stacked.time2), but date_truck only seems to aggriage the timestamp. I thought I could use AVG(derived_tsoil_fifteen_min_stacked.value) in combination with date_trunk, but I still get 15 minute values, not the h= ourly average from the four 15 minute records. rowid date_truck time2 site canopy plot variable name value avg 2010-07-07_00:00:00_b4warm_a 2010-07-07 00:00:00 1 2010-07-07 00:00:00 cfc = closed a2 tsoil_sc tsoil_avg1_sc 21.06 21.0599994659424 2010-07-07_00:15:00_b4warm_a 2010-07-07 00:00:00 1 2010-07-07 00:15:00 cfc = closed a2 tsoil_sc tsoil_avg1_sc 20.96 20.9599990844727 2010-07-07_00:30:00_b4warm_a 2010-07-07 00:00:00 1 2010-07-07 00:30:00 cfc = closed a2 tsoil_sc tsoil_avg1_sc 20.88 20.8799991607666 2010-07-07_00:45:00_b4warm_a 2010-07-07 00:00:00 1 2010-07-07 00:45:00 cfc = closed a2 tsoil_sc tsoil_avg1_sc 20.8 20.7999992370605 2010-07-07_01:00:00_b4warm_a 2010-07-07 01:00:00 1 2010-07-07 01:00:00 cfc = closed a2 tsoil_sc tsoil_avg1_sc 20.72 20.7199993133545 2010-07-07_01:15:00_b4warm_a 2010-07-07 01:00:00 1 2010-07-07 01:15:00 cfc = closed a2 tsoil_sc tsoil_avg1_sc 20.64 20.6399993896484 2010-07-07_01:30:00_b4warm_a 2010-07-07 01:00:00 1 2010-07-07 01:30:00 cfc = closed a2 tsoil_sc tsoil_avg1_sc 20.55 20.5499992370605 2010-07-07_01:45:00_b4warm_a 2010-07-07 01:00:00 1 2010-07-07 01:45:00 cfc = closed a2 tsoil_sc tsoil_avg1_sc 20.47 20.4699993133545 I was tying to get two records out of this set, with the 'avg" column repre= senting the mean of the first and last four of each 15 minute records. Perhaps date_trunk only works for the timestamp? On Feb 4, 2013, at 8:50 AM, Misa Simic <misa.simic@gmail.com<mailto:misa.si= mic@gmail.com>> wrote: WHERE derived_tsoil_fifteen_min_stacked.time2::date =3D '2010-07-07'::date On Monday, February 4, 2013, Kirk Wythers wrote: I am trying to write a query that grabs one particular day from a timestamp= column. The data are ordered in 15 minute chunks like this: 2010-07-07 12:45:00 2010-07-07 13:00:00 2010-07-07 13:15:00 2010-07-07 13:30:00 etc=85 If I wanted all records from july 7th 2010, I would expect 4 x 24 =3D 96 re= cords per day. I have tried the '=3D' operator, like this WHERE derived_tsoil_fifteen_min_stacked.time2 =3D '2010-07-07*' but that grabs nothing, and using the '~' operator grabs everything with a = 2010 or 07 in it=85 in other words all days from July of 2010. Any suggestions would be much appreciated. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general -- Please consider the environment before printing this email. NIWA is the trading name of the National Institute of Water & Atmospheric R= esearch Ltd.
В списке pgsql-general по дате отправления:
Следующее
От: Michael HarrisДата:
Сообщение: Hot Standby has PANIC: WAL contains references to invalid pages