Обсуждение: using explain output within pgsql
Hi,
I'm starting up a datawarehouse with patitioning.
my etl processes write directly into the corresponding partitions instead of using triggers.
The reports I run in the datawarehouse are stored in a cache within the same database.
Now I'd like to store besides the results the dependencies to the tables which were used to generate the report. with this information i could invalidate cache results for the tables I'm going to import with my etl processes.
explain analyze gives me the information which table or patition is read from for each report. e.g
explain analyze (FORMAT YAML) create table cache.report234 as select col1,col2,sum(clicks) from dwh.event_log_weekly where week >= '2011-06-27' and week <= '2011-07-11' group by col1,col2;
now I'd like to store the output of explain analyze in a pgsql variable for further processing. that looks something like this.
DO $$declare l_explain text;
begin
l_explain := explain analyze (FORMAT YAML) create table cache.report234 as select col1,col2,sum(clicks) from dwh.event_log_weekly where week >= '2011-06-27' and week <= '2011-07-11' group by col1,col2;
select l_explain;
end$$;
But that doesn't work. I get a syntax error.
Does anybody has an idea how to retrieve the output of explain within pgsql and store this in a variable?
An alternative would be any other way to extract the information about tables used by arbitrary sql statements.
best regards,
Uwe
I'm starting up a datawarehouse with patitioning.
my etl processes write directly into the corresponding partitions instead of using triggers.
The reports I run in the datawarehouse are stored in a cache within the same database.
Now I'd like to store besides the results the dependencies to the tables which were used to generate the report. with this information i could invalidate cache results for the tables I'm going to import with my etl processes.
explain analyze gives me the information which table or patition is read from for each report. e.g
explain analyze (FORMAT YAML) create table cache.report234 as select col1,col2,sum(clicks) from dwh.event_log_weekly where week >= '2011-06-27' and week <= '2011-07-11' group by col1,col2;
now I'd like to store the output of explain analyze in a pgsql variable for further processing. that looks something like this.
DO $$declare l_explain text;
begin
l_explain := explain analyze (FORMAT YAML) create table cache.report234 as select col1,col2,sum(clicks) from dwh.event_log_weekly where week >= '2011-06-27' and week <= '2011-07-11' group by col1,col2;
select l_explain;
end$$;
But that doesn't work. I get a syntax error.
Does anybody has an idea how to retrieve the output of explain within pgsql and store this in a variable?
An alternative would be any other way to extract the information about tables used by arbitrary sql statements.
best regards,
Uwe
On Sunday, July 10, 2011 11:54:10 am Uwe Bartels wrote: > Hi, > > I'm starting up a datawarehouse with patitioning. > my etl processes write directly into the corresponding partitions instead > of using triggers. > > The reports I run in the datawarehouse are stored in a cache within the > same database. > Now I'd like to store besides the results the dependencies to the tables > which were used to generate the report. with this information i could > invalidate cache results for the tables I'm going to import with my etl > processes. > > explain analyze gives me the information which table or patition is read > from for each report. e.g > explain analyze (FORMAT YAML) create table cache.report234 as select > col1,col2,sum(clicks) from dwh.event_log_weekly where week >= '2011-06-27' > and week <= '2011-07-11' group by col1,col2; > > now I'd like to store the output of explain analyze in a pgsql variable for > further processing. that looks something like this. > > DO $$declare l_explain text; > begin > l_explain := explain analyze (FORMAT YAML) create table cache.report234 as > select col1,col2,sum(clicks) from dwh.event_log_weekly where week >= > '2011-06-27' and week <= '2011-07-11' group by col1,col2; > select l_explain; > end$$; > > But that doesn't work. I get a syntax error. From here: http://www.postgresql.org/docs/9.0/interactive/sql-explain.html I believe you are looking for: explain (ANALYZE, FORMAT YAML) create table... > > Does anybody has an idea how to retrieve the output of explain within pgsql > and store this in a variable? > An alternative would be any other way to extract the information about > tables used by arbitrary sql statements. > > best regards, > Uwe -- Adrian Klaver adrian.klaver@gmail.com
Hello 2011/7/10 Uwe Bartels <uwe.bartels@gmail.com>: > Hi, > > I'm starting up a datawarehouse with patitioning. > my etl processes write directly into the corresponding partitions instead of > using triggers. > > The reports I run in the datawarehouse are stored in a cache within the same > database. > Now I'd like to store besides the results the dependencies to the tables > which were used to generate the report. with this information i could > invalidate cache results for the tables I'm going to import with my etl Hello try FOR l_explain IN EXPLAIN ANALYZE ... LOOP ... Regards Pavel Stehule > processes. > > explain analyze gives me the information which table or patition is read > from for each report. e.g > explain analyze (FORMAT YAML) create table cache.report234 as select > col1,col2,sum(clicks) from dwh.event_log_weekly where week >= '2011-06-27' > and week <= '2011-07-11' group by col1,col2; > > now I'd like to store the output of explain analyze in a pgsql variable for > further processing. that looks something like this. > > DO $$declare l_explain text; > begin > l_explain := explain analyze (FORMAT YAML) create table cache.report234 as > select col1,col2,sum(clicks) from dwh.event_log_weekly where week >= > '2011-06-27' and week <= '2011-07-11' group by col1,col2; > select l_explain; > end$$; > > But that doesn't work. I get a syntax error. > > Does anybody has an idea how to retrieve the output of explain within pgsql > and store this in a variable? > An alternative would be any other way to extract the information about > tables used by arbitrary sql statements. > > best regards, > Uwe >
Hi Adrian,
yes. but this is not the cause for the syntax error.
thanks,
Uwe
yes. but this is not the cause for the syntax error.
thanks,
Uwe
On 10 July 2011 21:20, Adrian Klaver <adrian.klaver@gmail.com> wrote:
From here:On Sunday, July 10, 2011 11:54:10 am Uwe Bartels wrote:
> Hi,
>
> I'm starting up a datawarehouse with patitioning.
> my etl processes write directly into the corresponding partitions instead
> of using triggers.
>
> The reports I run in the datawarehouse are stored in a cache within the
> same database.
> Now I'd like to store besides the results the dependencies to the tables
> which were used to generate the report. with this information i could
> invalidate cache results for the tables I'm going to import with my etl
> processes.
>
> explain analyze gives me the information which table or patition is read
> from for each report. e.g
> explain analyze (FORMAT YAML) create table cache.report234 as select
> col1,col2,sum(clicks) from dwh.event_log_weekly where week >= '2011-06-27'
> and week <= '2011-07-11' group by col1,col2;
>
> now I'd like to store the output of explain analyze in a pgsql variable for
> further processing. that looks something like this.
>
> DO $$declare l_explain text;
> begin
> l_explain := explain analyze (FORMAT YAML) create table cache.report234 as
> select col1,col2,sum(clicks) from dwh.event_log_weekly where week >=
> '2011-06-27' and week <= '2011-07-11' group by col1,col2;
> select l_explain;
> end$$;
>
> But that doesn't work. I get a syntax error.
http://www.postgresql.org/docs/9.0/interactive/sql-explain.html
I believe you are looking for:
explain (ANALYZE, FORMAT YAML) create table...--
>
> Does anybody has an idea how to retrieve the output of explain within pgsql
> and store this in a variable?
> An alternative would be any other way to extract the information about
> tables used by arbitrary sql statements.
>
> best regards,
> Uwe
Adrian Klaver
adrian.klaver@gmail.com
Hi Pavel,
I don't believe it. the second answer on sunday evening within half an hour.
and it works. excellent!!!
best regards,
Uwe
I don't believe it. the second answer on sunday evening within half an hour.
and it works. excellent!!!
best regards,
Uwe
On 10 July 2011 21:20, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hello
2011/7/10 Uwe Bartels <uwe.bartels@gmail.com>:> Hi,Hello
>
> I'm starting up a datawarehouse with patitioning.
> my etl processes write directly into the corresponding partitions instead of
> using triggers.
>
> The reports I run in the datawarehouse are stored in a cache within the same
> database.
> Now I'd like to store besides the results the dependencies to the tables
> which were used to generate the report. with this information i could
> invalidate cache results for the tables I'm going to import with my etl
try
FOR l_explain IN EXPLAIN ANALYZE ...
LOOP
...
Regards
Pavel Stehule
> processes.
>
> explain analyze gives me the information which table or patition is read
> from for each report. e.g
> explain analyze (FORMAT YAML) create table cache.report234 as select
> col1,col2,sum(clicks) from dwh.event_log_weekly where week >= '2011-06-27'
> and week <= '2011-07-11' group by col1,col2;
>
> now I'd like to store the output of explain analyze in a pgsql variable for
> further processing. that looks something like this.
>
> DO $$declare l_explain text;
> begin
> l_explain := explain analyze (FORMAT YAML) create table cache.report234 as
> select col1,col2,sum(clicks) from dwh.event_log_weekly where week >=
> '2011-06-27' and week <= '2011-07-11' group by col1,col2;
> select l_explain;
> end$$;
>
> But that doesn't work. I get a syntax error.
>
> Does anybody has an idea how to retrieve the output of explain within pgsql
> and store this in a variable?
> An alternative would be any other way to extract the information about
> tables used by arbitrary sql statements.
>
> best regards,
> Uwe
>
Hi Pavel,
is it posible to get this running even with dynamic sql?
I didn't write that. I'm using execute to run this create table ....
best regards,
Uwe
is it posible to get this running even with dynamic sql?
I didn't write that. I'm using execute to run this create table ....
best regards,
Uwe
On 10 July 2011 21:20, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hello
2011/7/10 Uwe Bartels <uwe.bartels@gmail.com>:> Hi,Hello
>
> I'm starting up a datawarehouse with patitioning.
> my etl processes write directly into the corresponding partitions instead of
> using triggers.
>
> The reports I run in the datawarehouse are stored in a cache within the same
> database.
> Now I'd like to store besides the results the dependencies to the tables
> which were used to generate the report. with this information i could
> invalidate cache results for the tables I'm going to import with my etl
try
FOR l_explain IN EXPLAIN ANALYZE ...
LOOP
...
Regards
Pavel Stehule
> processes.
>
> explain analyze gives me the information which table or patition is read
> from for each report. e.g
> explain analyze (FORMAT YAML) create table cache.report234 as select
> col1,col2,sum(clicks) from dwh.event_log_weekly where week >= '2011-06-27'
> and week <= '2011-07-11' group by col1,col2;
>
> now I'd like to store the output of explain analyze in a pgsql variable for
> further processing. that looks something like this.
>
> DO $$declare l_explain text;
> begin
> l_explain := explain analyze (FORMAT YAML) create table cache.report234 as
> select col1,col2,sum(clicks) from dwh.event_log_weekly where week >=
> '2011-06-27' and week <= '2011-07-11' group by col1,col2;
> select l_explain;
> end$$;
>
> But that doesn't work. I get a syntax error.
>
> Does anybody has an idea how to retrieve the output of explain within pgsql
> and store this in a variable?
> An alternative would be any other way to extract the information about
> tables used by arbitrary sql statements.
>
> best regards,
> Uwe
>
2011/7/10 Uwe Bartels <uwe.bartels@gmail.com>: > Hi Pavel, > > is it posible to get this running even with dynamic sql? > I didn't write that. I'm using execute to run this create table .... > probably yes postgres=# do $$ declare x text; begin execute e'explain(format yaml) select * from data where value = \'a\'' into x; raise notice '%', x; end; $$ language plpgsql; NOTICE: - Plan: Node Type: "Seq Scan" Relation Name: "data" Alias: "data" Startup Cost: 0.00 Total Cost: 23.38 Plan Rows: 5 Plan Width: 46 Filter: "((value)::text = 'a'::text)" DO > best regards, > Uwe > > On 10 July 2011 21:20, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> >> Hello >> >> >> >> 2011/7/10 Uwe Bartels <uwe.bartels@gmail.com>: >> > Hi, >> > >> > I'm starting up a datawarehouse with patitioning. >> > my etl processes write directly into the corresponding partitions >> > instead of >> > using triggers. >> > >> > The reports I run in the datawarehouse are stored in a cache within the >> > same >> > database. >> > Now I'd like to store besides the results the dependencies to the tables >> > which were used to generate the report. with this information i could >> > invalidate cache results for the tables I'm going to import with my etl >> Hello >> >> try >> >> FOR l_explain IN EXPLAIN ANALYZE ... >> LOOP >> ... >> >> Regards >> >> Pavel Stehule >> >> > processes. >> > >> > explain analyze gives me the information which table or patition is read >> > from for each report. e.g >> > explain analyze (FORMAT YAML) create table cache.report234 as select >> > col1,col2,sum(clicks) from dwh.event_log_weekly where week >= >> > '2011-06-27' >> > and week <= '2011-07-11' group by col1,col2; >> > >> > now I'd like to store the output of explain analyze in a pgsql variable >> > for >> > further processing. that looks something like this. >> > >> > DO $$declare l_explain text; >> > begin >> > l_explain := explain analyze (FORMAT YAML) create table cache.report234 >> > as >> > select col1,col2,sum(clicks) from dwh.event_log_weekly where week >= >> > '2011-06-27' and week <= '2011-07-11' group by col1,col2; >> > select l_explain; >> > end$$; >> > >> > But that doesn't work. I get a syntax error. >> > >> > Does anybody has an idea how to retrieve the output of explain within >> > pgsql >> > and store this in a variable? >> > An alternative would be any other way to extract the information about >> > tables used by arbitrary sql statements. >> > >> > best regards, >> > Uwe >> > > >
On 11/07/11 08:18, Pavel Stehule wrote:
I find that I understand things better if I rephrase things, so I took Pavel's code and converted it to use variables so I could see more clearly what is happening.
I think using variables makes the use of 'execute' more understandable.
I hope this version is of value to to others, I have included all the code required to run it as a working example.
CREATE TABLE data
(
id int,
value text
);
INSERT INTO
data (id, value)
VALUES
(1, 'a'),
(2, 'b');
do $$
declare
v_sql_query text;
v_sql_explain text;
v_result text;
begin
v_sql_query :=e'SELECT * FROM data d WHERE value = \'a\'';
v_sql_explain :=e'EXPLAIN(FORMAT YAML) ' || v_sql_query;
execute v_sql_explain into v_result;
raise notice 'v_result: %', v_result;
end;
$$ language plpgsql;
Cheers,
Gavin
[...]2011/7/10 Uwe Bartels <uwe.bartels@gmail.com>:Hi Pavel, is it posible to get this running even with dynamic sql? I didn't write that. I'm using execute to run this create table ....probably yes postgres=# do $$ declare x text; begin execute e'explain(format yaml) select * from data where value = \'a\'' into x; raise notice '%', x; end; $$ language plpgsql; NOTICE: - Plan: Node Type: "Seq Scan" Relation Name: "data" Alias: "data" Startup Cost: 0.00 Total Cost: 23.38 Plan Rows: 5 Plan Width: 46 Filter: "((value)::text = 'a'::text)" DO
I find that I understand things better if I rephrase things, so I took Pavel's code and converted it to use variables so I could see more clearly what is happening.
I think using variables makes the use of 'execute' more understandable.
I hope this version is of value to to others, I have included all the code required to run it as a working example.
CREATE TABLE data
(
id int,
value text
);
INSERT INTO
data (id, value)
VALUES
(1, 'a'),
(2, 'b');
do $$
declare
v_sql_query text;
v_sql_explain text;
v_result text;
begin
v_sql_query :=e'SELECT * FROM data d WHERE value = \'a\'';
v_sql_explain :=e'EXPLAIN(FORMAT YAML) ' || v_sql_query;
execute v_sql_explain into v_result;
raise notice 'v_result: %', v_result;
end;
$$ language plpgsql;
Cheers,
Gavin