Обсуждение: I need to fill up a sparse table in an view

Поиск
Список
Период
Сортировка

I need to fill up a sparse table in an view

От
Andreas
Дата:
Hi,

I need to fill up a sparse table in an view.
The table holds some numbers relating months and objects.
The month is an integer in the format YYYYMM.

To make it more convenient to manage this table I decidet to let a value 
be good till the next entry.
E.g. if there is an entry in january and march, the january entry is 
good in february, too.


So the table looks like.
my_numbers ( object_id int, month int, some_nr int )

( 17, 201301, 123 ),
( 42, 201301, 456 ),
( 42, 201303, 789 ),

Now I need a view that fills the gaps up till the current month.


( 17, 201301, 123 ),
( 17, 201302, 123 ),      <-- filled gap
( 17, 201303, 123 ),      <-- filled gap
( 42, 201301, 456 ),
( 42, 201302, 456 ),      <-- filled gap
( 42, 201303, 789 ),


Is this possible?



Re: I need to fill up a sparse table in an view

От
Victor Yegorov
Дата:
2013/3/2 Andreas <maps.on@gmx.net>
So the table looks like.
my_numbers ( object_id int, month int, some_nr int )

( 17, 201301, 123 ),
( 42, 201301, 456 ),
( 42, 201303, 789 ),

Now I need a view that fills the gaps up till the current month.


( 17, 201301, 123 ),
( 17, 201302, 123 ),      <-- filled gap
( 17, 201303, 123 ),      <-- filled gap
( 42, 201301, 456 ),
( 42, 201302, 456 ),      <-- filled gap
( 42, 201303, 789 ),


Is this possible?

Possible. Slightly different object identifiers used:

CREATE TABLE t(
    id  int,
    mon int,
    val int
);
INSERT INTO t VALUES
(17,201301,123),
(42,201301,456),
(42,201303,789);


Then the query (check results here http://sqlfiddle.com/#!12/ce8fa/1 ):

WITH dr AS (
    SELECT to_char(generate_series(to_date(min(mon)::text, 'YYYYMM'),
                                   greatest(to_date(max(mon)::text,'YYYYMM'),
                                            date(date_trunc('mon', now()))), '1 mon'::interval),
                     'YYYYMM')::numeric mon
      FROM t
    )
, x AS (
    SELECT s.id, dr.mon
      FROM dr
      CROSS JOIN (SELECT DISTINCT id FROM t) s
    )
, g AS (
    SELECT x.id, x.mon, t.val, CASE WHEN t.val IS NOT NULL THEN 1 ELSE NULL END grp
      FROM x
      LEFT JOIN t USING (id, mon)
    )
, nr AS (
    SELECT g.id, g.mon, g.val, g.grp, sum(g.grp) OVER (ORDER BY id,mon) gnr
      FROM g
    )
SELECT *,
       coalesce(val, min(val) OVER (PARTITION BY gnr)) the_one
  FROM nr
 ORDER BY 1,2;

1) “dr” is used to generate a range of months from the minimal found in the “t” table up to either current or the max one found in the “t”, whichever is bigger. A bit tricky query, if you can get the series of month some other way — feel free;
2) “x” will create a CROSS join of all the “id” with all the months;
3) “g” will create almost ready result with all the gaps in place, new service column is introduced to create groups;
4) within “nr” group identifiers are being summed, thus forming a unique group number for each entry and gap rows that follows it;
5) finally, NULL entries are replaced with the correct ones.

To obtain the desired output, you should “SELECT id, mon, the_one” in the last query. Feel free to query each of the intermediate steps to see how data transforms.

You might want to get rid of the CTEs and write a bunch of subqueries to avoid optimization fences of CTEs, as for bigger tables this construct will be performing badly.


--
Victor Y. Yegorov