Обсуждение: Subquery for column names of tablefunc crosstab queries

Поиск
Список
Период
Сортировка

Subquery for column names of tablefunc crosstab queries

От
Julia Jacobson
Дата:
Hello everybody out there familiar with the tablefunc module,

Having read the documentation of the tablefunc module,
I'm wondering whether it is possible to get the values for the names of
the columns for the crosstab query from a subquery.
A minimal example would look like this:

CREATE TABLE mytable(rowid text, rowdt timestamp, temperature int);
INSERT INTO cth VALUES('test1','01 March 2003','42');
INSERT INTO cth VALUES('test2','02 March 2003','53');
INSERT INTO cth VALUES('test3','03 March 2003','49');

SELECT * FROM crosstab
(
   'SELECT rowid, rowdt, temperature FROM mytable ORDER BY 1',
   'SELECT DISTINCT rowdt FROM mytable ORDER BY 1'
)
AS
(
        rowid text  # This works fine
        # Here should be a subquery to get column names
        # automatically,
        # like 'SELECT rowdt FROM mytable'
);

Thanks in advance,
Julia

Re: Subquery for column names of tablefunc crosstab queries

От
Joe Conway
Дата:
On 01/21/2011 05:48 AM, Julia Jacobson wrote:
> Having read the documentation of the tablefunc module,
> I'm wondering whether it is possible to get the values for the names of
> the columns for the crosstab query from a subquery.
> A minimal example would look like this:
>
> CREATE TABLE mytable(rowid text, rowdt timestamp, temperature int);
> INSERT INTO cth VALUES('test1','01 March 2003','42');
> INSERT INTO cth VALUES('test2','02 March 2003','53');
> INSERT INTO cth VALUES('test3','03 March 2003','49');
>
> SELECT * FROM crosstab
> (
>   'SELECT rowid, rowdt, temperature FROM mytable ORDER BY 1',
>   'SELECT DISTINCT rowdt FROM mytable ORDER BY 1'
> )
> AS
> (
>        rowid text  # This works fine
>        # Here should be a subquery to get column names
>        # automatically,
>        # like 'SELECT rowdt FROM mytable'
> );

No, it is not possible. The column definition list needs to be available
to the parser. I usually recommend running this as two queries from your
application. The first does:

SELECT DISTINCT rowdt::date FROM mytable ORDER BY 1;
   rowdt
------------
 2003-03-01
 2003-03-02
 2003-03-03
(3 rows)


Then the application dynamically builds the second query and executes
it. Following your example, something like:

SELECT * FROM crosstab
(
  $$SELECT rowid, rowdt::date, temperature FROM mytable ORDER BY 1$$,
  $$VALUES('2003-03-01'),('2003-03-02'),('2003-03-03')$$
)
AS
(
   rowid text,
   "2003-03-01" int,
   "2003-03-02" int,
   "2003-03-03" int
);
 rowid | 2003-03-01 | 2003-03-02 | 2003-03-03
-------+------------+------------+------------
 test1 |         42 |            |
 test2 |            |         53 |
 test3 |            |            |         49
(3 rows)

HTH,

Joe

--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support


Вложения

Re: Subquery for column names of tablefunc crosstab queries

От
Julia Jacobson
Дата:
Thanks for your answer. The fact that the column definition list must be
available to the parser was not obvious to me.
Instead of building the second query dynamically in an application, I
would prefer a user-defined function, because there will probably be
several applications in need of it and (please correct me if I'm wrong)
I hope for a better performance by creating a permanent view of my
crosstab query.
So I have tried something like:

CREATE FUNCTION denorm_table(mytable_name) RETURNS text AS '
DECLARE
     tab_def_list TEXT;
BEGIN
     # First query to create table definition list
     SELECT DISTINCT rowdt::date INTO tab_def_list FROM mytable_name;
     # Crosstab Query
     SELECT * FROM crosstab
     (
       'SELECT rowid, rowdt, temperature FROM mytable ORDER BY 1',
       'SELECT DISTINCT rowdt FROM mytable ORDER BY 1'
     )
     AS
     (
             rowid text,
             tab_def_list
     );
END;
' LANGUAGE plpgsql;

Being rather unexperienced in PL/pgSQL, I have problems to get the code
working.
Two further problems are the fact that the names of columns are not
allowed to begin with a number and every entry in the table definition
list must not only contain the name of the column but of course also a
data type (always the same - int).
Is it possible for a newbie to solve my problem by a user-defined
function in PL/pgSQL or is it rather complicated?


On 01/21/2011 16:08, Joe Conway wrote:
> On 01/21/2011 05:48 AM, Julia Jacobson wrote:
>> Having read the documentation of the tablefunc module,
>> I'm wondering whether it is possible to get the values for the names of
>> the columns for the crosstab query from a subquery.
>> A minimal example would look like this:
>>
>> CREATE TABLE mytable(rowid text, rowdt timestamp, temperature int);
>> INSERT INTO cth VALUES('test1','01 March 2003','42');
>> INSERT INTO cth VALUES('test2','02 March 2003','53');
>> INSERT INTO cth VALUES('test3','03 March 2003','49');
>>
>> SELECT * FROM crosstab
>> (
>>    'SELECT rowid, rowdt, temperature FROM mytable ORDER BY 1',
>>    'SELECT DISTINCT rowdt FROM mytable ORDER BY 1'
>> )
>> AS
>> (
>>         rowid text  # This works fine
>>         # Here should be a subquery to get column names
>>         # automatically,
>>         # like 'SELECT rowdt FROM mytable'
>> );
>
> No, it is not possible. The column definition list needs to be available
> to the parser. I usually recommend running this as two queries from your
> application. The first does:
>
> SELECT DISTINCT rowdt::date FROM mytable ORDER BY 1;
>     rowdt
> ------------
>   2003-03-01
>   2003-03-02
>   2003-03-03
> (3 rows)
>
>
> Then the application dynamically builds the second query and executes
> it. Following your example, something like:
>
> SELECT * FROM crosstab
> (
>    $$SELECT rowid, rowdt::date, temperature FROM mytable ORDER BY 1$$,
>    $$VALUES('2003-03-01'),('2003-03-02'),('2003-03-03')$$
> )
> AS
> (
>     rowid text,
>     "2003-03-01" int,
>     "2003-03-02" int,
>     "2003-03-03" int
> );
>   rowid | 2003-03-01 | 2003-03-02 | 2003-03-03
> -------+------------+------------+------------
>   test1 |         42 |            |
>   test2 |            |         53 |
>   test3 |            |            |         49
> (3 rows)
>
> HTH,
>
> Joe


Re: Subquery for column names of tablefunc crosstab queries

От
Joe Conway
Дата:
On 01/21/2011 09:34 AM, Julia Jacobson wrote:
> Two further problems are the fact that the names of columns are not
> allowed to begin with a number and every entry in the table definition
> list must not only contain the name of the column but of course also a
> data type (always the same - int).
> Is it possible for a newbie to solve my problem by a user-defined
> function in PL/pgSQL or is it rather complicated?

It's a bit tricky to get correct. The following is a bit ugly, and just
barely tested, but works at least for your presented case:

-------------------------------------------
CREATE OR REPLACE FUNCTION generate_crosstab_sql(relname text,
                                                 grpattr text,
                                                 grpattrtyp text,
                                                 catattr text,
                                                 valattr text,
                                                 valattrtyp text,
                                                 whereclause text)
RETURNS text AS $$
  DECLARE
    crosstabsql  text;
    coldef       text;
    catdef       text;
    rec      record;
  BEGIN
    coldef := '(' || grpattr || ' ' || grpattrtyp;
    catdef := 'VALUES';
    FOR rec IN EXECUTE
    'SELECT DISTINCT ' || catattr ||
    ' AS c FROM ' || relname || ' WHERE ' ||
    whereclause || ' ORDER BY 1'
    LOOP
      coldef := coldef || ',"' || rec.c || '" ' || valattrtyp;
      IF catdef = 'VALUES' THEN
        catdef := catdef || '($v$' || rec.c || '$v$)';
      ELSE
        catdef := catdef || ',($v$' || rec.c || '$v$)';
      END IF;
    END LOOP;
    coldef := coldef || ')';

    IF catdef != 'VALUES' THEN
      crosstabsql :=
      $ct$SELECT * FROM crosstab ('SELECT $ct$ ||
      grpattr || $ct$,$ct$ ||
      catattr || $ct$,$ct$ || valattr ||
      $ct$ FROM $ct$ || relname ||
      $ct$ WHERE $ct$ || whereclause ||
      $ct$ ORDER BY 1,2','$ct$ ||
      catdef || $ct$') AS $ct$ || coldef;
    END IF;
    RETURN crosstabsql;
  END;
$$ LANGUAGE plpgsql;
-------------------------------------------

Then this call:
-------------------------------------------
SELECT generate_crosstab_sql('mytable',
                             'rowid',
                             'text',
                             'rowdt::date',
                             'temperature',
                             'int',
                             '1 = 1');
-------------------------------------------

Produces this SQL:
-------------------------------------------
SELECT * FROM crosstab ('SELECT rowid,rowdt::date,temperature FROM
mytable WHERE 1 = 1 ORDER BY
1,2','VALUES($v$2003-03-01$v$),($v$2003-03-02$v$),($v$2003-03-03$v$)')
AS (rowid
text,"2003-03-01" int,"2003-03-02" int,"2003-03-03" int)
-------------------------------------------

Which produces this result:
-------------------------------------------
SELECT * FROM crosstab ('SELECT rowid,rowdt::date,temperature FROM
mytable WHERE 1 = 1 ORDER BY
1,2','VALUES($v$2003-03-01$v$),($v$2003-03-02$v$),($v$2003-03-03$v$)')
AS (rowid
contrib_regression(# text,"2003-03-01" int,"2003-03-02" int,"2003-03-03"
int);
 rowid | 2003-03-01 | 2003-03-02 | 2003-03-03
-------+------------+------------+------------
 test1 |         42 |            |
 test2 |            |         53 |
 test3 |            |            |         49
(3 rows)
-------------------------------------------

You might need some adjustments to get this to do exactly what you want.
Also please test it for correctness ;-)

HTH,

Joe

--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support


Вложения

Re: Subquery for column names of tablefunc crosstab queries

От
Sim Zacks
Дата:
What we have done in the past is dynamically build the crosstab query
while getting the field names through another query. Then we select the
query into a table and then select * from the table to get the result.


Sim



On 01/21/2011 07:34 PM, Julia Jacobson wrote:

> Thanks for your answer. The fact that the column definition list must
> be available to the parser was not obvious to me.
> Instead of building the second query dynamically in an application, I
> would prefer a user-defined function, because there will probably be
> several applications in need of it and (please correct me if I'm
> wrong) I hope for a better performance by creating a permanent view of
> my crosstab query.
> So I have tried something like:
>
> CREATE FUNCTION denorm_table(mytable_name) RETURNS text AS '
> DECLARE
>     tab_def_list TEXT;
> BEGIN
>     # First query to create table definition list
>     SELECT DISTINCT rowdt::date INTO tab_def_list FROM mytable_name;
>     # Crosstab Query
>     SELECT * FROM crosstab
>     (
>       'SELECT rowid, rowdt, temperature FROM mytable ORDER BY 1',
>       'SELECT DISTINCT rowdt FROM mytable ORDER BY 1'
>     )
>     AS
>     (
>             rowid text,
>             tab_def_list
>     );
> END;
> ' LANGUAGE plpgsql;
>
> Being rather unexperienced in PL/pgSQL, I have problems to get the
> code working.
> Two further problems are the fact that the names of columns are not
> allowed to begin with a number and every entry in the table definition
> list must not only contain the name of the column but of course also a
> data type (always the same - int).
> Is it possible for a newbie to solve my problem by a user-defined
> function in PL/pgSQL or is it rather complicated?
>
>
> On 01/21/2011 16:08, Joe Conway wrote:
>> On 01/21/2011 05:48 AM, Julia Jacobson wrote:
>>> Having read the documentation of the tablefunc module,
>>> I'm wondering whether it is possible to get the values for the names of
>>> the columns for the crosstab query from a subquery.
>>> A minimal example would look like this:
>>>
>>> CREATE TABLE mytable(rowid text, rowdt timestamp, temperature int);
>>> INSERT INTO cth VALUES('test1','01 March 2003','42');
>>> INSERT INTO cth VALUES('test2','02 March 2003','53');
>>> INSERT INTO cth VALUES('test3','03 March 2003','49');
>>>
>>> SELECT * FROM crosstab
>>> (
>>>    'SELECT rowid, rowdt, temperature FROM mytable ORDER BY 1',
>>>    'SELECT DISTINCT rowdt FROM mytable ORDER BY 1'
>>> )
>>> AS
>>> (
>>>         rowid text  # This works fine
>>>         # Here should be a subquery to get column names
>>>         # automatically,
>>>         # like 'SELECT rowdt FROM mytable'
>>> );
>>
>> No, it is not possible. The column definition list needs to be available
>> to the parser. I usually recommend running this as two queries from your
>> application. The first does:
>>
>> SELECT DISTINCT rowdt::date FROM mytable ORDER BY 1;
>>     rowdt
>> ------------
>>   2003-03-01
>>   2003-03-02
>>   2003-03-03
>> (3 rows)
>>
>>
>> Then the application dynamically builds the second query and executes
>> it. Following your example, something like:
>>
>> SELECT * FROM crosstab
>> (
>>    $$SELECT rowid, rowdt::date, temperature FROM mytable ORDER BY 1$$,
>>    $$VALUES('2003-03-01'),('2003-03-02'),('2003-03-03')$$
>> )
>> AS
>> (
>>     rowid text,
>>     "2003-03-01" int,
>>     "2003-03-02" int,
>>     "2003-03-03" int
>> );
>>   rowid | 2003-03-01 | 2003-03-02 | 2003-03-03
>> -------+------------+------------+------------
>>   test1 |         42 |            |
>>   test2 |            |         53 |
>>   test3 |            |            |         49
>> (3 rows)
>>
>> HTH,
>>
>> Joe
>
>


Re: Subquery for column names of tablefunc crosstab queries

От
markw
Дата:
Thanks Joe

I'm very much learning as I go.

I've followed your example from top to bottom - your sample code works -
however its not clear to me how to execute the generate_crosstab_sql
function to produce the results in one single step.

I've tried this:

CREATE OR REPLACE VIEW mycrosstabresults AS select * FROM (SELECT
generate_crosstab_sql('mytable',
                             'rowid',
                             'text',
                             'rowdt::date',
                             'temperature',
                             'int',
                             '1 = 1')) as crosstabresults;

but this simply populates the view with the string below, not the actual
crosstab results.

SELECT * FROM crosstab ('SELECT rowid,rowdt::date,temperature FROM
mytable WHERE 1 = 1 ORDER BY
1,2','VALUES($v$2003-03-01$v$),($v$2003-03-02$v$),($v$2003-03-03$v$)')
AS (rowid
text,"2003-03-01" int,"2003-03-02" int,"2003-03-03" int)

Is it possible to incorporate and execute this string, all within my CREATE
OR REPLACE VIEW statement?

I would appreciate any help you can give.

Many thanks



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Subquery-for-column-names-of-tablefunc-crosstab-queries-tp3351437p3374199.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.