Обсуждение: filtering based on table of start/end times
Hi,
At first glance, this seemed simple to implement, but this is giving me
a bit of a headache.
Say we have a table as follows:
CREATE TABLE voltage_series
( voltage_record_id integer NOT NULL DEFAULT nextval('voltage_series_logger_record_id_seq'::regclass), "time" timestamp
withouttime zone NOT NULL, voltage numeric, CONSTRAINT voltage_series_pkey PRIMARY KEY (voltage_record_id));
So it contains a time series of voltage measurements. Now suppose we
have another table of start/end times that we'd like to use to filter
out (or keep) records in voltage_series:
CREATE TABLE voltage_log
( record_id integer NOT NULL DEFAULT nextval('voltage_log_record_id_seq'::regclass), time_beg timestamp without time
zoneNOT NULL, time_end timestamp without time zone NOT NULL, CONSTRAINT voltage_log_pkey PRIMARY KEY (record_id));
where each record represents start/end times where the voltage
measurement should be removed/kept. The goal is to retrieve the records
in voltage_series that are not included in any of the periods defined by
the start/end times in voltage_log.
I've looked at the OVERLAPS operator, but it's not evident to me whether
that is the best approach. Any tips would be appreciated.
Cheers,
--
Seb
On 11/7/2014 3:12 PM, Seb wrote:
Hi,
At first glance, this seemed simple to implement, but this is giving me
a bit of a headache.
Say we have a table as follows:
CREATE TABLE voltage_series
( voltage_record_id integer NOT NULL DEFAULT nextval('voltage_series_logger_record_id_seq'::regclass), "time" timestamp without time zone NOT NULL, voltage numeric, CONSTRAINT voltage_series_pkey PRIMARY KEY (voltage_record_id));
So it contains a time series of voltage measurements. Now suppose we
have another table of start/end times that we'd like to use to filter
out (or keep) records in voltage_series:
CREATE TABLE voltage_log
( record_id integer NOT NULL DEFAULT nextval('voltage_log_record_id_seq'::regclass), time_beg timestamp without time zone NOT NULL, time_end timestamp without time zone NOT NULL, CONSTRAINT voltage_log_pkey PRIMARY KEY (record_id));
where each record represents start/end times where the voltage
measurement should be removed/kept. The goal is to retrieve the records
in voltage_series that are not included in any of the periods defined by
the start/end times in voltage_log.
I've looked at the OVERLAPS operator, but it's not evident to me whether
that is the best approach. Any tips would be appreciated.
Cheers,
Hello Seb,
Any reason this won't work for you?
SELECT *
FROM voltage_series
WHERE time NOT IN (SELECT DISTINCT time FROM voltage_series, voltage_log WHERE time BETWEEN time_beg AND time_end);
Might not be the fastest way to do it, if the tables are large. Apologies if I've not understood your question properly.
Regards,
Bryan
-- ______________ Postdoctoral Researcher GA Cooperative Fish & Wildlife Research Unit Warnell School of Forestry & Natural Resources University of Georgia Athens 30602-2152
I already sent this but used a wrong address. Sorry for the mess.
On 07.11.2014 21:12, Seb wrote:
> Hi,
>
> At first glance, this seemed simple to implement, but this is giving me
> a bit of a headache.
>
> Say we have a table as follows:
>
> CREATE TABLE voltage_series
> (
> voltage_record_id integer NOT NULL DEFAULT nextval('voltage_series_logger_record_id_seq'::regclass),
> "time" timestamp without time zone NOT NULL,
> voltage numeric,
> CONSTRAINT voltage_series_pkey PRIMARY KEY (voltage_record_id));
>
> So it contains a time series of voltage measurements. Now suppose we
> have another table of start/end times that we'd like to use to filter
> out (or keep) records in voltage_series:
>
> CREATE TABLE voltage_log
> (
> record_id integer NOT NULL DEFAULT nextval('voltage_log_record_id_seq'::regclass),
> time_beg timestamp without time zone NOT NULL,
> time_end timestamp without time zone NOT NULL,
> CONSTRAINT voltage_log_pkey PRIMARY KEY (record_id));
>
> where each record represents start/end times where the voltage
> measurement should be removed/kept. The goal is to retrieve the records
> in voltage_series that are not included in any of the periods defined by
> the start/end times in voltage_log.
>
> I've looked at the OVERLAPS operator, but it's not evident to me whether
> that is the best approach. Any tips would be appreciated.
>
> Cheers,
>
Something like this should work:
SELECT *
FROM voltage_series AS vs
LEFT JOIN voltage_log vl ON vs.time BETWEEN vl.time_beg AND vl.time_end
WHERE vl.id IS NULL
This is untested, but I think it should work.
greetings
Tim
On Sat, 08 Nov 2014 11:26:53 +0100,
Tim Schumacher <tim@bandenkrieg.hacked.jp> wrote:
> I already sent this but used a wrong address. Sorry for the mess.
> On 07.11.2014 21:12, Seb wrote:
>> Hi,
>> At first glance, this seemed simple to implement, but this is giving
>> me a bit of a headache.
>> Say we have a table as follows:
>> CREATE TABLE voltage_series ( voltage_record_id integer NOT NULL
>> DEFAULT nextval('voltage_series_logger_record_id_seq'::regclass),
>> "time" timestamp without time zone NOT NULL, voltage numeric,
>> CONSTRAINT voltage_series_pkey PRIMARY KEY (voltage_record_id));
>> So it contains a time series of voltage measurements. Now suppose we
>> have another table of start/end times that we'd like to use to filter
>> out (or keep) records in voltage_series:
>> CREATE TABLE voltage_log ( record_id integer NOT NULL DEFAULT
>> nextval('voltage_log_record_id_seq'::regclass), time_beg timestamp
>> without time zone NOT NULL, time_end timestamp without time zone NOT
>> NULL, CONSTRAINT voltage_log_pkey PRIMARY KEY (record_id));
>> where each record represents start/end times where the voltage
>> measurement should be removed/kept. The goal is to retrieve the
>> records in voltage_series that are not included in any of the periods
>> defined by the start/end times in voltage_log.
>> I've looked at the OVERLAPS operator, but it's not evident to me
>> whether that is the best approach. Any tips would be appreciated.
>> Cheers,
> Something like this should work:
> SELECT * FROM voltage_series AS vs LEFT JOIN voltage_log vl ON vs.time
> BETWEEN vl.time_beg AND vl.time_end WHERE vl.id IS NULL
> This is untested, but I think it should work.
Thank you all for your suggestions. The above proved very fast with
the millions of records and several other joins involved.
--
Seb
On 07.11.2014 21:12, Seb wrote:
> Hi,
>
> At first glance, this seemed simple to implement, but this is giving me
> a bit of a headache.
>
> Say we have a table as follows:
>
> CREATE TABLE voltage_series
> (
> voltage_record_id integer NOT NULL DEFAULT nextval('voltage_series_logger_record_id_seq'::regclass),
> "time" timestamp without time zone NOT NULL,
> voltage numeric,
> CONSTRAINT voltage_series_pkey PRIMARY KEY (voltage_record_id));
>
> So it contains a time series of voltage measurements. Now suppose we
> have another table of start/end times that we'd like to use to filter
> out (or keep) records in voltage_series:
>
> CREATE TABLE voltage_log
> (
> record_id integer NOT NULL DEFAULT nextval('voltage_log_record_id_seq'::regclass),
> time_beg timestamp without time zone NOT NULL,
> time_end timestamp without time zone NOT NULL,
> CONSTRAINT voltage_log_pkey PRIMARY KEY (record_id));
>
> where each record represents start/end times where the voltage
> measurement should be removed/kept. The goal is to retrieve the records
> in voltage_series that are not included in any of the periods defined by
> the start/end times in voltage_log.
>
> I've looked at the OVERLAPS operator, but it's not evident to me whether
> that is the best approach. Any tips would be appreciated.
>
> Cheers,
>
Something like this should work:
SELECT *
FROM voltage_series AS vs
LEFT JOIN voltage_log vl ON vs.time BETWEEN vl.time_beg AND vl.time_end
WHERE vl.id IS NULL
This is untested, but I think it should work.
greetings
Tim
On Sun, 09 Nov 2014 12:43:22 -0600, Seb <spluque@gmail.com> wrote: > On Sat, 08 Nov 2014 11:26:53 +0100, > Tim Schumacher <tim@bandenkrieg.hacked.jp> wrote: [...] >> Something like this should work: >> SELECT * FROM voltage_series AS vs LEFT JOIN voltage_log vl ON >> vs.time BETWEEN vl.time_beg AND vl.time_end WHERE vl.id IS NULL >> This is untested, but I think it should work. > Thank you all for your suggestions. The above proved very fast with > the millions of records and several other joins involved. Sorry to come back with a related issue, which is proving troublesome. There's another log table, that looks just like voltage_log, but has an additional column with an integer indicating what problem occurred during the period: CREATE TABLE voltage_diagnostic_log ( record_id serial, time_beg timestamp without time zone NOT NULL, time_end timestamp without time zone NOT NULL, diagnosticinteger, CONSTRAINT voltage_log_pkey PRIMARY KEY (record_id)); So that a view can be built for the voltage_series table where its columns can be adjusted based on the diagnostic integer (there are other columns besides voltage in the actual table), if the time stamp falls within a period of the voltage_diagnostic_log table. To illustrate, the view needs to be able to have field definitions such as (pseudo-code): CASE WHEN diagnostic=1 THEN voltage * 0.88 ELSE voltage END AS voltage_corrected, CASE WHEN diagnostic=2 THEN pressure - 2.5 ELSE pressure END AS pressure_corrected, The problem is that each record in voltage_series can have several matching records in voltage_diagnostic_log. Any insights welcome! -- Seb
On Wed, 12 Nov 2014 15:49:57 -0600, Seb <spluque@gmail.com> wrote: > Sorry to come back with a related issue, which is proving troublesome. > There's another log table, that looks just like voltage_log, but has > an additional column with an integer indicating what problem occurred > during the period: > CREATE TABLE voltage_diagnostic_log ( record_id serial, time_beg > timestamp without time zone NOT NULL, time_end timestamp without time > zone NOT NULL, diagnostic integer, CONSTRAINT voltage_log_pkey PRIMARY > KEY (record_id)); [...] For posterity's sake, the only solution I was able to find was to first create a view with a separate boolean column for each diagnostic value via crosstab(). From there it was possible to use a WITH subquery to remove rows with a particular diagnostic value (as suggested previously), and then have the main SELECT statement do a second left join to the crosstab view so that it could make use of the rest of the boolean columns as sources for CASE statements for each field. Cheers, -- Seb