Обсуждение: Re: Crossing/Rotating table rows to rows and columns

Поиск
Список
Период
Сортировка

Re: Crossing/Rotating table rows to rows and columns

От
Christophe Pettus
Дата:

> On Sep 2, 2020, at 19:58, Iuri Sampaio <iuri.sampaio@gmail.com> wrote:
> I've tried to use crosstabN(text sql), to solve the problem directly in the datasource layer, but apparently
tablefuncis not supported in the datamodel Squema  

"tablefunc" is an extension, so you will need to create it in your database before using it:

    CREATE EXTENSION tablefunc;

--
-- Christophe Pettus
   xof@thebuild.com




Crossing/Rotating table rows to rows and columns

От
Iuri Sampaio
Дата:
Hi there,
How would I convert/rotate (i.e. cross table) the following datasource, which is originally returned by rows (i.e. datetime and total), to rows as hours and columns as dates, where the columns (dates) will be assigned with "total" as their value.

Here it is the chunk of code to convert from base64url to binary PNG

select date_trunc('hour', o.creation_date) AS datetime,
COUNT(1) AS total
FROM cr_items ci, acs_objects o, cr_revisions cr
WHERE ci.item_id = o.object_id
AND ci.item_id = cr.item_id
AND ci.latest_revision = cr.revision_id
AND ci.content_type = :content_type
AND o.creation_date BETWEEN :creation_date::date - INTERVAL '6 day' AND :creation_date::date + INTERVAL '1 day'
GROUP BY 1 ORDER BY datetime ASC''

From the table structure, such as:

hour | total
------------------------+-------
2020-07-26 02:00:00+00 | 1
2020-07-26 04:00:00+00 | 7
2020-07-26 05:00:00+00 | 6
2020-07-26 06:00:00+00 | 6
2020-07-26 07:00:00+00 | 17
2020-07-26 08:00:00+00 | 17
2020-07-26 09:00:00+00 | 6
2020-07-26 10:00:00+00 | 8
2020-07-26 11:00:00+00 | 14
2020-07-26 12:00:00+00 | 16
2020-07-26 13:00:00+00 | 10
2020-07-26 14:00:00+00 | 17
2020-07-26 15:00:00+00 | 15
2020-07-26 16:00:00+00 | 2
2020-07-27 00:00:00+00 | 1
2020-07-27 06:00:00+00 | 1
..
2020-08-01 07:00:00+00 | 7
2020-08-01 08:00:00+00 | 4
2020-08-01 09:00:00+00 | 7
2020-08-01 10:00:00+00 | 10
2020-08-01 11:00:00+00 | 20
2020-08-01 12:00:00+00 | 25
2020-08-01 13:00:00+00 | 18
2020-08-01 14:00:00+00 | 14
2020-08-01 15:00:00+00 | 12
2020-08-01 16:00:00+00 | 4
(91 rows)

It would result in the table, as in:

hour 2020-7-26 2020-7-27 ... 2020-7-31 2020-8-01
0:00:00
1:00:00
2:00:00
3:00:00
4:00:00
5:00:00 1
6:00:00 2 2 4 22 7 4
7:00:00 8 2 3 8 1
8:00:00 3 8 4 1 9 4
9:00:00 4 6 2 35 8
10:00:00 9 19 14 2 10 2
11:00:00 11 8 7 13 10 13 10
12:00:00 12 7 18 12 8 12 5
13:00:00 6 14 8 24 10 6 6
14:00:00 8 10 9 7 14 11 4
15:00:00 21 10 4 2 13 15
16:00:00 12 15 11 10 22 22
17:00:00 30 14 11 28 10 29
18:00:00 1
19:00:00
20:00:00
21:00:00
22:00:00
23:00:00

I've tried to use crosstabN(text sql), to solve the problem directly in the datasource layer, but apparently tablefunc is not supported in the datamodel Squema  https://www.postgresql.org/docs/9.2/tablefunc.html

ERROR: function crosstab(unknown, unknown) does not exist
LINE 1: select * from crosstab('select o.creation_date::date AS day ...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.



Thus, I was wondering if there is a better approach to write a beautiful code out of it.

Does anyone have an idea on how to write this crosstable display?

Best wishes,
I



Re: Crossing/Rotating table rows to rows and columns

От
Iuri Sampaio
Дата:
Indeed, tablefunc -> crosstab would be a solution to it. However, the number and label of columns dynamically change depending on the records returned in the creation_date range, declared in the WHERE clasue.

select date_trunc('hour', o.creation_date) AS datetime,
COUNT(1) AS total
FROM cr_items ci, acs_objects o, cr_revisions cr
WHERE ci.item_id = o.object_id
AND ci.item_id = cr.item_id
AND ci.latest_revision = cr.revision_id
AND ci.content_type = :content_type
AND o.creation_date BETWEEN :creation_date::date - INTERVAL '6 day' AND :creation_date::date + INTERVAL '1 day'
GROUP BY 1 ORDER BY datetime ASC'



Furthermore, how would datetime column (o.creation_date) be split into rows and columns as in: 

From the table structure, such as:

hour | total
------------------------+-------
2020-07-26 02:00:00+00 | 1
2020-07-26 04:00:00+00 | 7
2020-07-26 05:00:00+00 | 6
2020-07-26 06:00:00+00 | 6
2020-07-26 07:00:00+00 | 17
2020-07-26 08:00:00+00 | 17
2020-07-26 09:00:00+00 | 6
2020-07-26 10:00:00+00 | 8
2020-07-26 11:00:00+00 | 14
2020-07-26 12:00:00+00 | 16
2020-07-26 13:00:00+00 | 10
2020-07-26 14:00:00+00 | 17
2020-07-26 15:00:00+00 | 15
2020-07-26 16:00:00+00 | 2
2020-07-27 00:00:00+00 | 1
2020-07-27 06:00:00+00 | 1
..
2020-08-01 07:00:00+00 | 7
2020-08-01 08:00:00+00 | 4
2020-08-01 09:00:00+00 | 7
2020-08-01 10:00:00+00 | 10
2020-08-01 11:00:00+00 | 20
2020-08-01 12:00:00+00 | 25
2020-08-01 13:00:00+00 | 18
2020-08-01 14:00:00+00 | 14
2020-08-01 15:00:00+00 | 12
2020-08-01 16:00:00+00 | 4
(91 rows)



to the target pivot table:

hour 2020-7-26 2020-7-27 ... 2020-7-31 2020-8-01
0:00:00
1:00:00
2:00:00
3:00:00
4:00:00
5:00:00 1
6:00:00 2 2 4 22 7 4
7:00:00 8 2 3 8 1
8:00:00 3 8 4 1 9 4
9:00:00 4 6 2 35 8
10:00:00 9 19 14 2 10 2
11:00:00 11 8 7 13 10 13 10
12:00:00 12 7 18 12 8 12 5
13:00:00 6 14 8 24 10 6 6
14:00:00 8 10 9 7 14 11 4
15:00:00 21 10 4 2 13 15
16:00:00 12 15 11 10 22 22
17:00:00 30 14 11 28 10 29
18:00:00 1
19:00:00
20:00:00
21:00:00
22:00:00
23:00:00


So far, I tried to simplify the query to actually get an idea of the target pivot table, removing datetime interval conditionals. Unfortunately it returns an error.
ERROR:  return and sql tuple descriptions are incompatible 


SELECT * FROM CROSSTAB('select EXTRACT(hour FROM o.creation_date)::text AS hour, o.creation_date::date::text AS day, COUNT(1) FROM cr_items ci, acs_objects o, cr_revisions cr WHERE ci.item_id = o.object_id AND ci.item_id = cr.item_id AND ci.latest_revision = cr.revision_id AND ci.content_type = ''qt_face'' GROUP BY o.creation_date, 1 ORDER BY hour ASC') AS t ("hour" TEXT, "day" NUMERIC);
ERROR:  return and sql tuple descriptions are incompatible




On Muh. 14, 1442 AH, at 23:58, Christophe Pettus <xof@thebuild.com> wrote:



On Sep 2, 2020, at 19:58, Iuri Sampaio <iuri.sampaio@gmail.com> wrote:
I've tried to use crosstabN(text sql), to solve the problem directly in the datasource layer, but apparently tablefunc is not supported in the datamodel Squema

"tablefunc" is an extension, so you will need to create it in your database before using it:

CREATE EXTENSION tablefunc;

--
-- Christophe Pettus
  xof@thebuild.com


Re: Crossing/Rotating table rows to rows and columns

От
Christophe Pettus
Дата:

> On Sep 3, 2020, at 17:20, Iuri Sampaio <iuri.sampaio@gmail.com> wrote:
>
> From the table structure, such as:
>
> hour | total
> ------------------------+-------
> 2020-07-26 02:00:00+00 | 1
> 2020-07-26 04:00:00+00 | 7
> 2020-07-26 05:00:00+00 | 6
> 2020-07-26 06:00:00+00 | 6
> 2020-07-26 07:00:00+00 | 17
> 2020-07-26 08:00:00+00 | 17
> 2020-07-26 09:00:00+00 | 6
> 2020-07-26 10:00:00+00 | 8
> 2020-07-26 11:00:00+00 | 14
> 2020-07-26 12:00:00+00 | 16
> 2020-07-26 13:00:00+00 | 10
> 2020-07-26 14:00:00+00 | 17
> 2020-07-26 15:00:00+00 | 15
> 2020-07-26 16:00:00+00 | 2
> 2020-07-27 00:00:00+00 | 1
> 2020-07-27 06:00:00+00 | 1
> ..
> 2020-08-01 07:00:00+00 | 7
> 2020-08-01 08:00:00+00 | 4
> 2020-08-01 09:00:00+00 | 7
> 2020-08-01 10:00:00+00 | 10
> 2020-08-01 11:00:00+00 | 20
> 2020-08-01 12:00:00+00 | 25
> 2020-08-01 13:00:00+00 | 18
> 2020-08-01 14:00:00+00 | 14
> 2020-08-01 15:00:00+00 | 12
> 2020-08-01 16:00:00+00 | 4
> (91 rows)
>
> to the target pivot table:
> hour 2020-7-26 2020-7-27 ... 2020-7-31 2020-8-01
> 0:00:00
> 1:00:00
> 2:00:00
> 3:00:00
> 4:00:00
> 5:00:00 1
> 6:00:00 2 2 4 22 7 4
> 7:00:00 8 2 3 8 1
> 8:00:00 3 8 4 1 9 4
> 9:00:00 4 6 2 35 8
> 10:00:00 9 19 14 2 10 2
> 11:00:00 11 8 7 13 10 13 10
> 12:00:00 12 7 18 12 8 12 5
> 13:00:00 6 14 8 24 10 6 6
> 14:00:00 8 10 9 7 14 11 4
> 15:00:00 21 10 4 2 13 15
> 16:00:00 12 15 11 10 22 22
> 17:00:00 30 14 11 28 10 29
> 18:00:00 1
> 19:00:00
> 20:00:00
> 21:00:00
> 22:00:00
> 23:00:00

You can do that particular type of operation with aggregates:

    SELECT hour, array_agg(total) as totals FROM (select date_trunc('hour', o.creation_date) AS datetime,
            COUNT(1) AS total
            FROM cr_items ci, acs_objects o, cr_revisions cr
            WHERE ci.item_id = o.object_id
            AND ci.item_id = cr.item_id
            AND ci.latest_revision = cr.revision_id
            AND ci.content_type = :content_type
            AND o.creation_date BETWEEN :creation_date::date - INTERVAL '6 day' AND :creation_date::date + INTERVAL '1
day'
            GROUP BY 1) t;

This will not include any of the "hour"s that do not have a "total", but you can achieve that with a left outer join
againsta subquery that uses generate_series to create a single-row table with all of the desired hours in it. 

--
-- Christophe Pettus
   xof@thebuild.com




Re: Crossing/Rotating table rows to rows and columns

От
Iuri Sampaio
Дата:
You can do that particular type of operation with aggregates:

not quite.

Your query misses GROUP BY in the end.

ERROR:  column "t.datetime" must appear in the GROUP BY clause or be used in an aggregate function


Furthermore, aggregate function misses concatenation of datetime and total. So, I fixed it and its result is not what I need. Please, see bellow.

SELECT datetime, array_agg(datetime || ', ' || total) as totals FROM (select date_trunc('hour', o.creation_date) AS datetime, COUNT(1) AS total FROM cr_items ci, acs_objects o, cr_revisions cr WHERE ci.item_id = o.object_id AND ci.item_id = cr.item_id AND ci.latest_revision = cr.revision_id AND ci.content_type = 'qt_face' AND o.creation_date BETWEEN '2020-09-02'::date - INTERVAL '6 day' AND '2020-09-02'::date + INTERVAL '1 day' GROUP BY datetime, 1) t GROUP BY datetime;
        datetime        |             totals             
------------------------+--------------------------------
 2020-08-27 06:00:00+00 | {"2020-08-27 06:00:00+00, 4"}
 2020-08-27 07:00:00+00 | {"2020-08-27 07:00:00+00, 1"}
 2020-08-27 08:00:00+00 | {"2020-08-27 08:00:00+00, 4"}
 2020-08-27 09:00:00+00 | {"2020-08-27 09:00:00+00, 8"}
 2020-08-27 10:00:00+00 | {"2020-08-27 10:00:00+00, 2"}
 2020-08-27 11:00:00+00 | {"2020-08-27 11:00:00+00, 10"}
 2020-08-27 12:00:00+00 | {"2020-08-27 12:00:00+00, 5"}
 2020-08-27 13:00:00+00 | {"2020-08-27 13:00:00+00, 6"}
 2020-08-27 14:00:00+00 | {"2020-08-27 14:00:00+00, 4"}
 2020-08-27 15:00:00+00 | {"2020-08-27 15:00:00+00, 15"}
 2020-08-27 16:00:00+00 | {"2020-08-27 16:00:00+00, 22"}
 2020-08-27 17:00:00+00 | {"2020-08-27 17:00:00+00, 37"}
 2020-08-27 18:00:00+00 | {"2020-08-27 18:00:00+00, 14"}
 2020-08-28 06:00:00+00 | {"2020-08-28 06:00:00+00, 3"}
 2020-08-28 07:00:00+00 | {"2020-08-28 07:00:00+00, 1"}
 2020-08-28 08:00:00+00 | {"2020-08-28 08:00:00+00, 7"}
 2020-08-28 09:00:00+00 | {"2020-08-28 09:00:00+00, 5"}




The target structure must have dates as columns, and that turns out to be complex because the number of columns is not static, plus the labels of the columns change based in the day the query is executed, thus CROSSTAB doesn’t seem to be applicable.  




hour 2020-7-26 2020-7-27 ... 2020-7-31 2020-8-01
6:00:00 2 2 4 22 7 4
7:00:00 8 2 3 8 1
8:00:00 3 8 4 1 9 4
9:00:00 4 6 2 35 8
10:00:00 9 19 14 2 10 2
11:00:00 11 8 7 13 10 13 10
12:00:00 12 7 18 12 8 12 5
13:00:00 6 14 8 24 10 6 6


hour 2020-7-30 2020-7-31 ... 2020-8-01 2020-8-02
6:00:00 2 2 4 22 7 4
7:00:00 8 2 3 8 1
8:00:00 3 8 4 1 9 4
9:00:00 4 6 2 35 8
10:00:00 9 19 14 2 10 2
11:00:00 11 8 7 13 10 13 10
12:00:00 12 7 18 12 8 12 5
13:00:00 6 14 8 24 10 6 6


hour 2020-8-10 2020-8-11 ... 2020-8-12 2020-8-16
6:00:00 2 2 4 22 7 4
7:00:00 8 2 3 8 1
8:00:00 3 8 4 1 9 4
9:00:00 4 6 2 35 8
10:00:00 9 19 14 2 10 2
11:00:00 11 8 7 13 10 13 10
12:00:00 12 7 18 12 8 12 5
13:00:00 6 14 8 24 10 6 6




Is there anyway to rotate lines to columns, without loosing the lines per hour, as in the target pivot table bellow:


hour 2020-7-26 2020-7-27 ... 2020-7-31 2020-8-01
0:00:00
1:00:00
2:00:00
3:00:00
4:00:00
5:00:00 1
6:00:00 2 2 4 22 7 4
7:00:00 8 2 3 8 1
8:00:00 3 8 4 1 9 4
9:00:00 4 6 2 35 8
10:00:00 9 19 14 2 10 2
11:00:00 11 8 7 13 10 13 10
12:00:00 12 7 18 12 8 12 5
13:00:00 6 14 8 24 10 6 6
14:00:00 8 10 9 7 14 11 4
15:00:00 21 10 4 2 13 15
16:00:00 12 15 11 10 22 22
17:00:00 30 14 11 28 10 29
18:00:00 1
19:00:00
20:00:00
21:00:00
22:00:00
23:00:00



On Muh. 15, 1442 AH, at 21:22, Christophe Pettus <xof@thebuild.com> wrote:



On Sep 3, 2020, at 17:20, Iuri Sampaio <iuri.sampaio@gmail.com> wrote:

From the table structure, such as:

hour | total
------------------------+-------
2020-07-26 02:00:00+00 | 1
2020-07-26 04:00:00+00 | 7
2020-07-26 05:00:00+00 | 6
2020-07-26 06:00:00+00 | 6
2020-07-26 07:00:00+00 | 17
2020-07-26 08:00:00+00 | 17
2020-07-26 09:00:00+00 | 6
2020-07-26 10:00:00+00 | 8
2020-07-26 11:00:00+00 | 14
2020-07-26 12:00:00+00 | 16
2020-07-26 13:00:00+00 | 10
2020-07-26 14:00:00+00 | 17
2020-07-26 15:00:00+00 | 15
2020-07-26 16:00:00+00 | 2
2020-07-27 00:00:00+00 | 1
2020-07-27 06:00:00+00 | 1
..
2020-08-01 07:00:00+00 | 7
2020-08-01 08:00:00+00 | 4
2020-08-01 09:00:00+00 | 7
2020-08-01 10:00:00+00 | 10
2020-08-01 11:00:00+00 | 20
2020-08-01 12:00:00+00 | 25
2020-08-01 13:00:00+00 | 18
2020-08-01 14:00:00+00 | 14
2020-08-01 15:00:00+00 | 12
2020-08-01 16:00:00+00 | 4
(91 rows)

to the target pivot table:
hour 2020-7-26 2020-7-27 ... 2020-7-31 2020-8-01
0:00:00
1:00:00
2:00:00
3:00:00
4:00:00
5:00:00 1
6:00:00 2 2 4 22 7 4
7:00:00 8 2 3 8 1
8:00:00 3 8 4 1 9 4
9:00:00 4 6 2 35 8
10:00:00 9 19 14 2 10 2
11:00:00 11 8 7 13 10 13 10
12:00:00 12 7 18 12 8 12 5
13:00:00 6 14 8 24 10 6 6
14:00:00 8 10 9 7 14 11 4
15:00:00 21 10 4 2 13 15
16:00:00 12 15 11 10 22 22
17:00:00 30 14 11 28 10 29
18:00:00 1
19:00:00
20:00:00
21:00:00
22:00:00
23:00:00

You can do that particular type of operation with aggregates:

SELECT hour, array_agg(total) as totals FROM (select date_trunc('hour', o.creation_date) AS datetime,
           COUNT(1) AS total
           FROM cr_items ci, acs_objects o, cr_revisions cr
           WHERE ci.item_id = o.object_id
           AND ci.item_id = cr.item_id
           AND ci.latest_revision = cr.revision_id
           AND ci.content_type = :content_type
           AND o.creation_date BETWEEN :creation_date::date - INTERVAL '6 day' AND :creation_date::date + INTERVAL '1 day'
           GROUP BY 1) t;

This will not include any of the "hour"s that do not have a "total", but you can achieve that with a left outer join against a subquery that uses generate_series to create a single-row table with all of the desired hours in it.

--
-- Christophe Pettus
  xof@thebuild.com