Possible. Slightly different object identifiers used:
CREATE TABLE t(
id int,
mon int,
val int
);
INSERT INTO t VALUES
(17,201301,123),
(42,201301,456),
(42,201303,789);
WITH dr AS (
SELECT to_char(generate_series(to_date(min(mon)::text, 'YYYYMM'),
greatest(to_date(max(mon)::text,'YYYYMM'),
date(date_trunc('mon', now()))), '1 mon'::interval),
'YYYYMM')::numeric mon
FROM t
)
, x AS (
FROM dr
CROSS JOIN (SELECT DISTINCT id FROM t) s
)
, g AS (
SELECT
x.id, x.mon, t.val, CASE WHEN t.val IS NOT NULL THEN 1 ELSE NULL END grp
FROM x
LEFT JOIN t USING (id, mon)
)
, nr AS (
SELECT
g.id, g.mon, g.val, g.grp, sum(g.grp) OVER (ORDER BY id,mon) gnr
FROM g
)
SELECT *,
coalesce(val, min(val) OVER (PARTITION BY gnr)) the_one
FROM nr
ORDER BY 1,2;
1) “dr” is used to generate a range of months from the minimal found in the “t” table up to either current or the max one found in the “t”, whichever is bigger. A bit tricky query, if you can get the series of month some other way — feel free;
2) “x” will create a CROSS join of all the “id” with all the months;
3) “g” will create almost ready result with all the gaps in place, new service column is introduced to create groups;
4) within “nr” group identifiers are being summed, thus forming a unique group number for each entry and gap rows that follows it;
5) finally, NULL entries are replaced with the correct ones.
To obtain the desired output, you should “SELECT id, mon, the_one” in the last query. Feel free to query each of the intermediate steps to see how data transforms.
You might want to get rid of the CTEs and write a bunch of subqueries to avoid optimization fences of CTEs, as for bigger tables this construct will be performing badly.
--
Victor Y. Yegorov