At work I've found the following custom aggregate function very
helpful. But I would like to extend it and I'm not sure how to do
this.
The function is last_value_when. When you call last_value_when(value,
cond) over w it gives you the last value in the window where cond was
true.
Here is my implementation.
-----
CREATE OR REPLACE FUNCTION last_value_when_sfunc(state anyelement,
value anyelement, cond BOOL) RETURNS anyelement AS
$$
SELECT CASE WHEN cond THEN value ELSE state END;
$$
LANGUAGE SQL;
COMMENT ON FUNCTION .last_value_when_sfunc (anyelement, anyelement,
bool) IS 'Helper function for tracking last matching in window';
CREATE AGGREGATE last_value_when(anyelement, bool) (
SFUNC = expression.last_value_when_sfunc,
STYPE = anyelement);
COMMENT ON AGGREGATE last_value_when (anyelement, bool) IS 'Aggregate
function for tracking the last value when a condition was true';
-----
Here is an example of its use.
SELECT timestamp,
, measurement_type
, value
, timestamp
, last_value_when(timestamp, measurement_type = 'foo') over w as
last_foo_timestamp
, last_value_when(value, measurement_type = 'foo') over w as last_foo_value
FROM some_table
WINDOW w AS (
ORDER BY timestamp,
CASE WHEN measurement_type = 'foo' THEN 0 ELSE 1 END
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
)
This gives me for every row the last time/value for foo, no matter how
many rows back it might have been.
For working with time series I've found this incredibly powerful. In
fact I think that it is worth including in Postgres!
But here is my problem. I would like to build equivalents of lead/lag
with the same idea. That is where I currently do lag(some_column, 2)
I'd like to instead be able to say lag_when(some_column, 2,
some_condition). So I could pick out not just the last measurement of
foo, but the last 3 measurements of foo.
How would I do that?