Обсуждение: Query Plan Columns

Поиск
Список
Период
Сортировка

Query Plan Columns

От
"David E. Wheeler"
Дата:
Fellow Hackers,

I'm writing a function to turn an EXPLAIN plan into a table with columns. As such, I need to have a complete list of
thevarious bits of each plan node and their types for the table. Here's what I've got so far: 
   "Node Type"           TEXT,   "Strategy"            TEXT,   "Startup Cost"        NUMERIC,   "Total Cost"
NUMERIC,  "Plan Rows"           INTEGER,   "Plan Width"          INTEGER,   "Actual Startup Time" NUMERIC,   "Actual
TotalTime"   NUMERIC,   "Actual Rows"         INTEGER,   "Actual Width"        INTEGER,   "Actual Loops"
INTEGER,  "Parent Relationship" TEXT   "Sort Key"            TEXT[],     "Sort Method"         TEXT,
"Sort-Space-Used"    TEXT,   "Sort-Space-Type"     TEXT,   "Join Type"           TEXT,   "Join Filter"         TEXT,
"HashCond"           TEXT,   "Relation Name"       NAME,   "Alias"               NAME,   "Scan Direction"      TEXT,
"IndexName"          NAME,   "Index Cond"          TEXT,   "Subplan Name"        TEXT,   "Function Name"       TEXT,
"Filter"             TEXT 

Questions:

* Am I missing any?
* Are the data types correct?
* Is there some place in the source with a canonical list?

Thanks,

David



Re: Query Plan Columns

От
Andrew Dunstan
Дата:

On 11/05/2010 04:27 PM, David E. Wheeler wrote:
> Fellow Hackers,
>
> I'm writing a function to turn an EXPLAIN plan into a table with columns. As such, I need to have a complete list of
thevarious bits of each plan node and their types for the table. Here's what I've got so far:
 
>
>      "Node Type"           TEXT,
>      "Strategy"            TEXT,
>      "Startup Cost"        NUMERIC,
>      "Total Cost"          NUMERIC,
>      "Plan Rows"           INTEGER,
>      "Plan Width"          INTEGER,
>      "Actual Startup Time" NUMERIC,
>      "Actual Total Time"   NUMERIC,
>      "Actual Rows"         INTEGER,
>      "Actual Width"        INTEGER,
>      "Actual Loops"        INTEGER,
>      "Parent Relationship" TEXT
>      "Sort Key"            TEXT[],
>      "Sort Method"         TEXT,
>      "Sort-Space-Used"     TEXT,
>      "Sort-Space-Type"     TEXT,
>      "Join Type"           TEXT,
>      "Join Filter"         TEXT,
>      "Hash Cond"           TEXT,
>      "Relation Name"       NAME,
>      "Alias"               NAME,
>      "Scan Direction"      TEXT,
>      "Index Name"          NAME,
>      "Index Cond"          TEXT,
>      "Subplan Name"        TEXT,
>      "Function Name"       TEXT,
>      "Filter"              TEXT
>
> Questions:
>
> * Am I missing any?
> * Are the data types correct?
> * Is there some place in the source with a canonical list?


src/backend/commands/explain.c

Of course, there are containers too, which are not in your list at all. 
How do you intend to represent the tree-ish structure in a flat table?

cheers

andrew


Re: Query Plan Columns

От
Dimitri Fontaine
Дата:
"David E. Wheeler" <david@kineticode.com> writes:
> * Is there some place in the source with a canonical list?

It seems that you need to read through ExplainNode in
src/backend/commands/explain.c:

http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/commands/explain.c;h=f494ec98e510c23120e072bd5ee8821ea12738a4;hb=HEAD#l617

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Query Plan Columns

От
Josh Berkus
Дата:
> Of course, there are containers too, which are not in your list at all.
> How do you intend to represent the tree-ish structure in a flat table?

Andrew: we'll use a proximity tree.


--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: Query Plan Columns

От
"David E. Wheeler"
Дата:
On Nov 5, 2010, at 1:36 PM, Andrew Dunstan wrote:

> Of course, there are containers too, which are not in your list at all. How do you intend to represent the tree-ish
structurein a flat table? 

So far I see only two containers: Subplans and Sot Keys. The latter is represented as an array. The former I'm going to
representsimply by having a parent_id column to point to a parent from a child node. 

Are there others?

Thanks,

David



Re: Query Plan Columns

От
"David E. Wheeler"
Дата:
On Nov 5, 2010, at 1:38 PM, Dimitri Fontaine wrote:

> It seems that you need to read through ExplainNode in
> src/backend/commands/explain.c:
>
>
http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/commands/explain.c;h=f494ec98e510c23120e072bd5ee8821ea12738a4;hb=HEAD#l617

Ah, great, thanks.

David



Re: Query Plan Columns

От
"David E. Wheeler"
Дата:
On Nov 5, 2010, at 1:42 PM, David E. Wheeler wrote:

> On Nov 5, 2010, at 1:38 PM, Dimitri Fontaine wrote:
>
>> It seems that you need to read through ExplainNode in
>> src/backend/commands/explain.c:
>>
>>
http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/commands/explain.c;h=f494ec98e510c23120e072bd5ee8821ea12738a4;hb=HEAD#l617
>
> Ah, great, thanks.
>
> David

Hrm…
960             ExplainPropertyFloat("Plan Rows", plan->plan_rows, 0, es);

Should rows really be a float?

Best,

David



Re: Query Plan Columns

От
Josh Berkus
Дата:
>  960             ExplainPropertyFloat("Plan Rows", plan->plan_rows, 0, es);
> 
> Should rows really be a float?

Yeah, since PropertyInteger is 32-bit.


--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: Query Plan Columns

От
Tom Lane
Дата:
Josh Berkus <josh@agliodbs.com> writes:
>> 960             ExplainPropertyFloat("Plan Rows", plan->plan_rows, 0, es);
>> 
>> Should rows really be a float?

> Yeah, since PropertyInteger is 32-bit.

Yes, plan_rows is a float.  We might need to represent very large
values, and we do not need high accuracy, so it's really the right type.
        regards, tom lane


Re: Query Plan Columns

От
David Fetter
Дата:
On Fri, Nov 05, 2010 at 01:39:07PM -0700, Josh Berkus wrote:
> 
> > Of course, there are containers too, which are not in your list at all.
> > How do you intend to represent the tree-ish structure in a flat table?
> 
> Andrew: we'll use a proximity tree.

Adjacency list?

If so, in my experience, it's best to separate the node descriptions
from the adjacency list that links them together.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: Query Plan Columns

От
"David E. Wheeler"
Дата:
On Nov 5, 2010, at 1:42 PM, David E. Wheeler wrote:

>>
http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/commands/explain.c;h=f494ec98e510c23120e072bd5ee8821ea12738a4;hb=HEAD#l617
>
> Ah, great, thanks.

So based on this, I've come up with:
   "Node Type"             TEXT,   "Strategy"              TEXT,   "Operation"             TEXT,   "Startup Cost"
  FLOAT,   "Total Cost"            FLOAT,   "Plan Rows"             FLOAT,   "Plan Width"            INTEGER,   "Actual
StartupTime"   FLOAT,   "Actual Total Time"     FLOAT,   "Actual Rows"           FLOAT,   "Actual Loops"
FLOAT,  "Parent Relationship"   TEXT,   "Sort Key"              TEXT[],   "Sort Method"           TEXT[],   "Sort Space
Used"      BIGINT,   "Sort Space Type"       TEXT,   "Join Type"             TEXT,   "Join Filter"           TEXT,
"HashCond"             TEXT,   "Relation Name"         NAME,   "Alias"                 NAME,   "Scan Direction"
TEXT,  "Index Name"            TEXT,   "Index Cond"            TEXT,   "Recheck Cond"          TEXT,   "TID Cond"
      TEXT,   "Merge Cond"            TEXT,   "Subplan Name"          TEXT,   "Function Name"         TEXT,   "Function
Call"        TEXT,   "Filter"                TEXT,   "One-Time Filter"       TEXT,   "Command"               TEXT,
"SharedHit Blocks"     BIGINT,   "Shared Read Blocks"    BIGINT,   "Shared Written Blocks" BIGINT,   "Local Hit Blocks"
    BIGINT,   "Local Read Blocks"     BIGINT,   "Local Written Blocks"  BIGINT,   "Temp Read Blocks"      BIGINT,
"TempWritten Blocks"   BIGINT,   "Output"                TEXT[],   "Hash Buckets"          BIGINT,   "Hash Batches"
    BIGINT,   "Original Hash Batches" BIGINT,   "Peak Memory Usage"     BIGINT,   "Schema"                TEXT,   "CTE
Name"             TEXT 

Does that seem reasonable? Am I missing anything obvious?

Thanks,

David

Re: Query Plan Columns

От
"David E. Wheeler"
Дата:
On Nov 6, 2010, at 11:44 AM, David E. Wheeler wrote:

> On Nov 5, 2010, at 1:42 PM, David E. Wheeler wrote:
>
>>>
http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/commands/explain.c;h=f494ec98e510c23120e072bd5ee8821ea12738a4;hb=HEAD#l617
>>
>> Ah, great, thanks.
>
> So based on this, I've come up with:

Updated:

CREATE TEMPORARY TABLE plans (   "Node Type"             TEXT NOT NULL,   "Strategy"              TEXT,   "Operation"
         TEXT,   "Startup Cost"          FLOAT,   "Total Cost"            FLOAT,   "Plan Rows"             FLOAT,
"PlanWidth"            INTEGER,   "Actual Startup Time"   FLOAT,   "Actual Total Time"     FLOAT,   "Actual Rows"
   FLOAT,   "Actual Loops"          FLOAT,   "Parent Relationship"   TEXT,   "Sort Key"              TEXT[],   "Sort
Method"          TEXT[],   "Sort Space Used"       BIGINT,   "Sort Space Type"       TEXT,   "Join Type"
TEXT,  "Join Filter"           TEXT,   "Hash Cond"             TEXT,   "Relation Name"         TEXT,   "Alias"
      TEXT,   "Scan Direction"        TEXT,   "Index Name"            TEXT,   "Index Cond"            TEXT,   "Recheck
Cond"         TEXT,   "TID Cond"              TEXT,   "Merge Cond"            TEXT,   "Subplan Name"          TEXT,
"FunctionName"         TEXT,   "Function Call"         TEXT,   "Filter"                TEXT,   "One-Time Filter"
TEXT,  "Command"               TEXT,   "Shared Hit Blocks"     BIGINT,   "Shared Read Blocks"    BIGINT,   "Shared
WrittenBlocks" BIGINT,   "Local Hit Blocks"      BIGINT,   "Local Read Blocks"     BIGINT,   "Local Written Blocks"
BIGINT,  "Temp Read Blocks"      BIGINT,   "Temp Written Blocks"   BIGINT,   "Output"                TEXT[],   "Hash
Buckets"         BIGINT,   "Hash Batches"          BIGINT,   "Original Hash Batches" BIGINT,   "Peak Memory Usage"
BIGINT,  "Schema"                TEXT,   "CTE Name"              TEXT 
);

Would I be right that "Node Type" is the only column can be NOT NULL?

Also, I'm thinking of making the "Actual Startup Time" and ""Actual Total Time" columns into INTERVALs. The times are
expressedin milliseconds, yes? I'm wondering if INTERVAL would be more convenient for querying… 

Thanks,

David



Re: Query Plan Columns

От
Roberto Mello
Дата:
On Sat, Nov 6, 2010 at 6:24 PM, David E. Wheeler <david@kineticode.com> wrote:
>
> Would I be right that "Node Type" is the only column can be NOT NULL?
>
> Also, I'm thinking of making the "Actual Startup Time" and ""Actual Total Time" columns into INTERVALs. The times are
expressedin milliseconds, yes? I'm wondering if INTERVAL would be more convenient for querying… 

Yes, but I am wondering whether you should just stick to what would
come out of a normal explain, for consistency sake. Maybe provide
another function, or parameter that would cast the results to
intervals?

Not sure.

Roberto


Re: Query Plan Columns

От
"David E. Wheeler"
Дата:
On Nov 7, 2010, at 5:24 AM, Roberto Mello wrote:

> Yes, but I am wondering whether you should just stick to what would
> come out of a normal explain, for consistency sake. Maybe provide
> another function, or parameter that would cast the results to
> intervals?

I think it's more convenient to have intervals.

So my final list:
   CREATE TABLE plans (       planned_at              TIMESTAMPTZ,       node_id                 TEXT PRIMARY KEY,
parent_id               TEXT REFERENCES plans(node_id),       node_type               TEXT NOT NULL,
total_runtime          INTERVAL,       strategy                TEXT,       operation               TEXT,
startup_cost           FLOAT,       total_cost              FLOAT,       plan_rows               FLOAT,
plan_width             INTEGER,       actual_startup_time     INTERVAL,       actual_total_time       INTERVAL,
actual_rows            FLOAT,       actual_loops            FLOAT,       parent_relationship     TEXT,       sort_key
            TEXT[],       sort_method             TEXT[],       sort_space_used         BIGINT,       sort_space_type
     TEXT,       join_type               TEXT,       join_filter             TEXT,       hash_cond               TEXT,
    relation_name           TEXT,       alias                   TEXT,       scan_direction          TEXT,
index_name             TEXT,       index_cond              TEXT,       recheck_cond            TEXT,       tid_cond
          TEXT,       merge_cond              TEXT,       subplan_name            TEXT,       function_name
TEXT,      function_call           TEXT,       filter                  TEXT,       one_time_filter         TEXT,
command                TEXT,       shared_hit_blocks       BIGINT,       shared_read_blocks      BIGINT,
shared_written_blocks  BIGINT,       local_hit_blocks        BIGINT,       local_read_blocks       BIGINT,
local_written_blocks   BIGINT,       temp_read_blocks        BIGINT,       temp_written_blocks     BIGINT,       output
                TEXT[],       hash_buckets            BIGINT,       hash_batches            BIGINT,
original_hash_batches  BIGINT,       peak_memory_usage       BIGINT,       schema                  TEXT,       cte_name
              TEXT,              triggers                trigger_plan[]   ); 

planned_at is just the current time (from NOW()).

node_id is simply `md5( pg_backend_pid() || clock_timestamp() )`, which is run just before each node is parsed.

trigger_plan is a composite type:
   CREATE TYPE trigger_plan AS (       trigger_name    TEXT,       constraint_name TEXT,       relation        TEXT,
  time            INTERVAL,       calls           FLOAT   ); 

I'm wondering if there's any reason why we couldn't have EXPLAIN do something like this itself in core:
   EXPLAIN (format table) SELECT * FROM bar;

It could output a table like the above. FWIW, The function I've written works like this:
   SELECT plan('SELECT * FROM bar');

Which is an okay workaround. Anyone else think that this might be useful?

Best,

David



Re: Query Plan Columns

От
"David E. Wheeler"
Дата:
On Nov 8, 2010, at 3:12 PM, David E. Wheeler wrote:

> It could output a table like the above. FWIW, The function I've written works like this:
>
>    SELECT plan('SELECT * FROM bar');

Sorry, that's
   SELECT * FROM plan('SELECT * FROM bar');

Best,

David



Re: Query Plan Columns

От
Dimitri Fontaine
Дата:
"David E. Wheeler" <david@kineticode.com> writes:
> I'm wondering if there's any reason why we couldn't have EXPLAIN do something like this itself in core:
>
>     EXPLAIN (format table) SELECT * FROM bar;

+1 from me here, as it happens "parsing" a table-like resultset is
already implemented everywhere :)

I'd add another +1 if it were easily usable as a "normal" relation (or
SRF) from inside a query, e.g. in a WITH table expression.

WITH plan AS ( EXPLAIN (format table) SELECT * FROM bar
)
SELECT some_plan_analysis_query_here;

Or even:

WITH plan AS ( EXPLAIN (format table) SELECT * FROM bar
)
INSERT INTO plan_audit
SELECT * FROM planWHERE actual_total_time > 12 * interval '100 ms';

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Query Plan Columns

От
"David E. Wheeler"
Дата:
On Nov 9, 2010, at 12:12 AM, Dimitri Fontaine wrote:

> WITH plan AS (
>  EXPLAIN (format table) SELECT * FROM bar
> )
> INSERT INTO plan_audit
> SELECT * FROM plan
> WHERE actual_total_time > 12 * interval '100 ms';

Yeah, that would be nice, but my current implementation has a row for each node, and a single explain can have many
nodes.With this, you'd only get the top-level node (and not even that, as you didn't do EXPLAIN ANALYZE, so
actual_total_timewould be blank!). 

But I do like the idea…

Best,

David