Обсуждение: ANALYZE'ing table hierarchies

Поиск
Список
Период
Сортировка

ANALYZE'ing table hierarchies

От
nunks
Дата:

Hello,

 

I have a question about running ANALYZE on table hierarchies.

 

The documentation page for partitioning mentions the need to manually issue VACUUM and ANALYZEs for each partition (http://www.postgresql.org/docs/9.4/static/ddl-partitioning.html#DDL-PARTITIONING-CAVEATS), while the page for ANALYZE (http://www.postgresql.org/docs/9.4/static/sql-analyze.html) mentions it does gather statistics for the master table and its children.

 

When I ANALYZE a master table PostgreSQL says it is working on the entire hierarchy and automatically ANALYZEs the child tables. However, the last_analyze column stays empty for the child tables on pg_stat_user_tables.

 

I find this somewhat confusing, what is the expected behavior? I figure the child tables have been properly ANALYZEd, but I can't find it registered anywhere. It would be nice to be able to parse pg_stat_user_tables for analyze statistics without having to worry about hierarchies, since each partition seems to have its own statistics for everything else.

 

      nunks=# analyze verbose tb05;

      INFO:  analyzing "public.tb05"

      INFO:  "tb05": scanned 0 of 0 pages, containing 0 live rows and 0 dead rows; 0 rows in sample, 0 estimated total rows

      INFO:  analyzing "public.tb05" inheritance tree

      INFO:  "tb05_2016": scanned 165 of 165 pages, containing 7465 live rows and 455 dead rows; 7465 rows in sample, 7465 estimated total rows

      INFO:  "tb05_2015": scanned 381 of 381 pages, containing 16281 live rows and 1504 dead rows; 16281 rows in sample, 16281 estimated total rows

      ANALYZE

     

      nunks=# select relname, last_analyze from pg_stat_user_tables where relname like 'tb05%';

        relname  |         last_analyze

      -----------+-------------------------------

       tb05      | 2016-02-18 22:47:32.770076-02

       tb05_2016 |

       tb05_2015 |

      (3 rows)

 

      nunks=# \d+ tb05;

                                                           Table "public.tb05"

               Column        |            Type             | Modifiers | Storage  | Stats target | Description

      ---------------------+-----------------------------+-----------+----------+--------------+-------------

        a                  | character varying(8)        | not null  | extended |              |

        b                  | date                        | not null  | plain    |              |

        c                  | timestamp without time zone | not null  | plain    |              |

        d                  | timestamp without time zone | not null  | plain    |              |

        e                  | bigint                      | not null  | plain    |              |

        f                  | bigint                      | not null  | plain    |              |

        g                  | bigint                      | not null  | plain    |              |

        h                  | bigint                      | not null  | plain    |              |

        i                  | bigint                      | not null  | plain    |              |

        j                  | bigint                      | not null  | plain    |              |

        k                  | bigint                      | not null  | plain    |              |

        l                  | bigint                      | not null  | plain    |              |

        m                  | bigint                      | not null  | plain    |              |

        n                  | bigint                      | not null  | plain    |              |

        o                  | bigint                      | not null  | plain    |              |

        p                  | bigint                      | not null  | plain    |              |

        q                  | bigint                      | not null  | plain    |              |

        r                  | bigint                      | not null  | plain    |              |

      Indexes:

            "tb05_pkey" PRIMARY KEY, btree (a, b)

      Triggers:

            tr_partition_tb05 BEFORE INSERT ON tb05 FOR EACH ROW EXECUTE PROCEDURE fn_partition_tb05()

      Child tables: tb05_2015,

                          tb05_2016

 

Thanks!

 Nunks

Re: ANALYZE'ing table hierarchies

От
"gaoqiangdba@163.com"
Дата:
Hi,Nunks

I have a test just now,as follows:
Hope to help you!

music=> create table music(id int,name text,style text);
CREATE TABLE
music=> create table rock (check(style = 'rock')) inherits(music);
CREATE TABLE
music=> create table pop (check(style = 'pop')) inherits(music);
CREATE TABLE
music=> create table classic (check(style = 'classic')) inherits(music);
CREATE TABLE
music=>  create table jazz (check(style = 'jazz')) inherits(music);
CREATE TABLE


music=> \dS+ music
                          Table "eric.music"
 Column |  Type   | Modifiers | Storage  | Stats target | Description 
--------+---------+-----------+----------+--------------+-------------
 id     | integer |           | plain    |              | 
 name   | text    |           | extended |              | 
 style  | text    |           | extended |              | 
Child tables: classic,
              jazz,
              pop,
              rock

music=> create index music_pop_id on pop (id);
CREATE INDEX
music=>  create index music_rock_id on rock (id);
CREATE INDEX
music=> create index music_jazz_id on jazz (id);
CREATE INDEX
music=> create index music_classic_id on classic (id);
CREATE INDEX


music=>  CREATE OR REPLACE FUNCTION music_insert_trigger()
music->  RETURNS TRIGGER AS 
music->  $$
music$>  BEGIN
music$>       IF (NEW.style = 'rock') THEN
music$>           INSERT INTO rock VALUES (NEW.*);
music$>       ELSEIF (NEW.style = 'pop') THEN
music$>           INSERT INTO pop VALUES (NEW.*);
music$>       ELSEIF (NEW.style = 'pop') THEN
music$>           INSERT INTO pop VALUES (NEW.*);
music$>       ELSEIF (NEW.style = 'jazz') THEN
music$>           INSERT INTO jazz VALUES (NEW.*);
music$>       ELSEIF (NEW.style = 'classic') THEN
music$>           INSERT INTO classic VALUES (NEW.*);
music$>       END IF;
music$>       RETURN NULL;
music$>  END;
music$>  $$
music->  LANGUAGE plpgsql ;
CREATE FUNCTION


music=> CREATE TRIGGER insert_music_trigger 
music-> BEFORE INSERT ON  music
music-> FOR EACH row EXECUTE PROCEDURE music_insert_trigger() ;
CREATE TRIGGER


music=> insert into music values(2,'Have a Nice Day','pop');
INSERT 0 0
music=> insert into music values(1,'21 Gun','rock');
INSERT 0 0


music=> select * from music;
 id |      name       | style 
----+-----------------+-------
  1 | 21 Gun          | rock
  2 | Have a Nice Day | pop
(2 rows)

music=> select * from pop
music-> ;
 id |      name       | style 
----+-----------------+-------
  2 | Have a Nice Day | pop
(1 row)

music=> select * from rock;
 id |  name  | style 
----+--------+-------
  1 | 21 Gun | rock
(1 row)

music=> analyze verbose music
music-> ;
INFO:  analyzing "eric.music"
INFO:  "music": scanned 0 of 0 pages, containing 0 live rows and 0 dead rows; 0 rows in sample, 0 estimated total rows
INFO:  analyzing "eric.music" inheritance tree
INFO:  "rock": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
INFO:  "pop": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
ANALYZE

music=> select relname, last_analyze from pg_stat_user_tables where relname = 'music';
 relname |         last_analyze          
---------+-------------------------------
 music   | 2016-02-18 22:29:56.528758-08
(1 row)

music=> select relname, last_analyze from pg_stat_user_tables where relname = 'pop';
 relname | last_analyze 
---------+--------------
 pop     | 
(1 row)

music=> select relname, last_analyze from pg_stat_user_tables where relname = 'rock';
 relname | last_analyze 
---------+--------------
 rock    | 
(1 row)


music=> analyze verbose pop;
INFO:  analyzing "eric.pop"
INFO:  "pop": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
ANALYZE
music=> select relname, last_analyze from pg_stat_user_tables where relname = 'rock';
 relname | last_analyze 
---------+--------------
 rock    | 
(1 row)

music=> select relname, last_analyze from pg_stat_user_tables where relname = 'pop';
 relname |         last_analyze          
---------+-------------------------------
 pop     | 2016-02-18 22:31:55.666556-08
(1 row)

music=> analyze verbose rock;
INFO:  analyzing "eric.rock"
INFO:  "rock": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
ANALYZE
music=> select relname, last_analyze from pg_stat_user_tables where relname = 'rock';
 relname |         last_analyze          
---------+-------------------------------
 rock    | 2016-02-18 22:34:16.526558-08
(1 row)



gaoqiangdba@163.com
 
From: nunks
Date: 2016-02-19 10:02
Subject: [ADMIN] ANALYZE'ing table hierarchies

Hello,

 

I have a question about running ANALYZE on table hierarchies.

 

The documentation page for partitioning mentions the need to manually issue VACUUM and ANALYZEs for each partition (http://www.postgresql.org/docs/9.4/static/ddl-partitioning.html#DDL-PARTITIONING-CAVEATS), while the page for ANALYZE (http://www.postgresql.org/docs/9.4/static/sql-analyze.html) mentions it does gather statistics for the master table and its children.

 

When I ANALYZE a master table PostgreSQL says it is working on the entire hierarchy and automatically ANALYZEs the child tables. However, the last_analyze column stays empty for the child tables on pg_stat_user_tables.

 

I find this somewhat confusing, what is the expected behavior? I figure the child tables have been properly ANALYZEd, but I can't find it registered anywhere. It would be nice to be able to parse pg_stat_user_tables for analyze statistics without having to worry about hierarchies, since each partition seems to have its own statistics for everything else.

 

      nunks=# analyze verbose tb05;

      INFO:  analyzing "public.tb05"

      INFO:  "tb05": scanned 0 of 0 pages, containing 0 live rows and 0 dead rows; 0 rows in sample, 0 estimated total rows

      INFO:  analyzing "public.tb05" inheritance tree

      INFO:  "tb05_2016": scanned 165 of 165 pages, containing 7465 live rows and 455 dead rows; 7465 rows in sample, 7465 estimated total rows

      INFO:  "tb05_2015": scanned 381 of 381 pages, containing 16281 live rows and 1504 dead rows; 16281 rows in sample, 16281 estimated total rows

      ANALYZE

     

      nunks=# select relname, last_analyze from pg_stat_user_tables where relname like 'tb05%';

        relname  |         last_analyze

      -----------+-------------------------------

       tb05      | 2016-02-18 22:47:32.770076-02

       tb05_2016 |

       tb05_2015 |

      (3 rows)

 

      nunks=# \d+ tb05;

                                                           Table "public.tb05"

               Column        |            Type             | Modifiers | Storage  | Stats target | Description

      ---------------------+-----------------------------+-----------+----------+--------------+-------------

        a                  | character varying(8)        | not null  | extended |              |

        b                  | date                        | not null  | plain    |              |

        c                  | timestamp without time zone | not null  | plain    |              |

        d                  | timestamp without time zone | not null  | plain    |              |

        e                  | bigint                      | not null  | plain    |              |

        f                  | bigint                      | not null  | plain    |              |

        g                  | bigint                      | not null  | plain    |              |

        h                  | bigint                      | not null  | plain    |              |

        i                  | bigint                      | not null  | plain    |              |

        j                  | bigint                      | not null  | plain    |              |

        k                  | bigint                      | not null  | plain    |              |

        l                  | bigint                      | not null  | plain    |              |

        m                  | bigint                      | not null  | plain    |              |

        n                  | bigint                      | not null  | plain    |              |

        o                  | bigint                      | not null  | plain    |              |

        p                  | bigint                      | not null  | plain    |              |

        q                  | bigint                      | not null  | plain    |              |

        r                  | bigint                      | not null  | plain    |              |

      Indexes:

            "tb05_pkey" PRIMARY KEY, btree (a, b)

      Triggers:

            tr_partition_tb05 BEFORE INSERT ON tb05 FOR EACH ROW EXECUTE PROCEDURE fn_partition_tb05()

      Child tables: tb05_2015,

                          tb05_2016

 

Thanks!

 Nunks

Re: ANALYZE'ing table hierarchies

От
nunks
Дата:
Hi, thanks for the reply!

Yes, that's the same behavior I'm seeing in my database. Analyzing  the master table works on the inheritance tree but doesn't register to pg_stat_user_tables unless you explicitly analyze each child table on its own.

What eludes me is that part of the documentation says we should analyze each table individually, and another part says we can count on ANALYZE to work on the entire inheritance tree...

Could it be that when we analyze the master the scanning of child tables just means it gather statistics for the master alone, as if it were a single big table consisting of its contents plus the contents of its children? Then I'd expect those statistics to not be used for planning queries that are made directly to a child table, and the behavior we see on pg_stat_user_tables is correct in that the statistics gathered only work for queries made to the master table.

Thanks! =)

Nunks

----------
“Life beats down and crushes the soul and art reminds you that you have one.

- Stella Adler

On Fri, Feb 19, 2016 at 4:41 AM, gaoqiangdba@163.com <gaoqiangdba@163.com> wrote:
Hi,Nunks

I have a test just now,as follows:
Hope to help you!

music=> create table music(id int,name text,style text);
CREATE TABLE
music=> create table rock (check(style = 'rock')) inherits(music);
CREATE TABLE
music=> create table pop (check(style = 'pop')) inherits(music);
CREATE TABLE
music=> create table classic (check(style = 'classic')) inherits(music);
CREATE TABLE
music=>  create table jazz (check(style = 'jazz')) inherits(music);
CREATE TABLE


music=> \dS+ music
                          Table "eric.music"
 Column |  Type   | Modifiers | Storage  | Stats target | Description 
--------+---------+-----------+----------+--------------+-------------
 id     | integer |           | plain    |              | 
 name   | text    |           | extended |              | 
 style  | text    |           | extended |              | 
Child tables: classic,
              jazz,
              pop,
              rock

music=> create index music_pop_id on pop (id);
CREATE INDEX
music=>  create index music_rock_id on rock (id);
CREATE INDEX
music=> create index music_jazz_id on jazz (id);
CREATE INDEX
music=> create index music_classic_id on classic (id);
CREATE INDEX


music=>  CREATE OR REPLACE FUNCTION music_insert_trigger()
music->  RETURNS TRIGGER AS 
music->  $$
music$>  BEGIN
music$>       IF (NEW.style = 'rock') THEN
music$>           INSERT INTO rock VALUES (NEW.*);
music$>       ELSEIF (NEW.style = 'pop') THEN
music$>           INSERT INTO pop VALUES (NEW.*);
music$>       ELSEIF (NEW.style = 'pop') THEN
music$>           INSERT INTO pop VALUES (NEW.*);
music$>       ELSEIF (NEW.style = 'jazz') THEN
music$>           INSERT INTO jazz VALUES (NEW.*);
music$>       ELSEIF (NEW.style = 'classic') THEN
music$>           INSERT INTO classic VALUES (NEW.*);
music$>       END IF;
music$>       RETURN NULL;
music$>  END;
music$>  $$
music->  LANGUAGE plpgsql ;
CREATE FUNCTION


music=> CREATE TRIGGER insert_music_trigger 
music-> BEFORE INSERT ON  music
music-> FOR EACH row EXECUTE PROCEDURE music_insert_trigger() ;
CREATE TRIGGER


music=> insert into music values(2,'Have a Nice Day','pop');
INSERT 0 0
music=> insert into music values(1,'21 Gun','rock');
INSERT 0 0


music=> select * from music;
 id |      name       | style 
----+-----------------+-------
  1 | 21 Gun          | rock
  2 | Have a Nice Day | pop
(2 rows)

music=> select * from pop
music-> ;
 id |      name       | style 
----+-----------------+-------
  2 | Have a Nice Day | pop
(1 row)

music=> select * from rock;
 id |  name  | style 
----+--------+-------
  1 | 21 Gun | rock
(1 row)

music=> analyze verbose music
music-> ;
INFO:  analyzing "eric.music"
INFO:  "music": scanned 0 of 0 pages, containing 0 live rows and 0 dead rows; 0 rows in sample, 0 estimated total rows
INFO:  analyzing "eric.music" inheritance tree
INFO:  "rock": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
INFO:  "pop": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
ANALYZE

music=> select relname, last_analyze from pg_stat_user_tables where relname = 'music';
 relname |         last_analyze          
---------+-------------------------------
 music   | 2016-02-18 22:29:56.528758-08
(1 row)

music=> select relname, last_analyze from pg_stat_user_tables where relname = 'pop';
 relname | last_analyze 
---------+--------------
 pop     | 
(1 row)

music=> select relname, last_analyze from pg_stat_user_tables where relname = 'rock';
 relname | last_analyze 
---------+--------------
 rock    | 
(1 row)


music=> analyze verbose pop;
INFO:  analyzing "eric.pop"
INFO:  "pop": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
ANALYZE
music=> select relname, last_analyze from pg_stat_user_tables where relname = 'rock';
 relname | last_analyze 
---------+--------------
 rock    | 
(1 row)

music=> select relname, last_analyze from pg_stat_user_tables where relname = 'pop';
 relname |         last_analyze          
---------+-------------------------------
 pop     | 2016-02-18 22:31:55.666556-08
(1 row)

music=> analyze verbose rock;
INFO:  analyzing "eric.rock"
INFO:  "rock": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
ANALYZE
music=> select relname, last_analyze from pg_stat_user_tables where relname = 'rock';
 relname |         last_analyze          
---------+-------------------------------
 rock    | 2016-02-18 22:34:16.526558-08
(1 row)



 
From: nunks
Date: 2016-02-19 10:02
Subject: [ADMIN] ANALYZE'ing table hierarchies

Hello,

 

I have a question about running ANALYZE on table hierarchies.

 

The documentation page for partitioning mentions the need to manually issue VACUUM and ANALYZEs for each partition (http://www.postgresql.org/docs/9.4/static/ddl-partitioning.html#DDL-PARTITIONING-CAVEATS), while the page for ANALYZE (http://www.postgresql.org/docs/9.4/static/sql-analyze.html) mentions it does gather statistics for the master table and its children.

 

When I ANALYZE a master table PostgreSQL says it is working on the entire hierarchy and automatically ANALYZEs the child tables. However, the last_analyze column stays empty for the child tables on pg_stat_user_tables.

 

I find this somewhat confusing, what is the expected behavior? I figure the child tables have been properly ANALYZEd, but I can't find it registered anywhere. It would be nice to be able to parse pg_stat_user_tables for analyze statistics without having to worry about hierarchies, since each partition seems to have its own statistics for everything else.

 

      nunks=# analyze verbose tb05;

      INFO:  analyzing "public.tb05"

      INFO:  "tb05": scanned 0 of 0 pages, containing 0 live rows and 0 dead rows; 0 rows in sample, 0 estimated total rows

      INFO:  analyzing "public.tb05" inheritance tree

      INFO:  "tb05_2016": scanned 165 of 165 pages, containing 7465 live rows and 455 dead rows; 7465 rows in sample, 7465 estimated total rows

      INFO:  "tb05_2015": scanned 381 of 381 pages, containing 16281 live rows and 1504 dead rows; 16281 rows in sample, 16281 estimated total rows

      ANALYZE

     

      nunks=# select relname, last_analyze from pg_stat_user_tables where relname like 'tb05%';

        relname  |         last_analyze

      -----------+-------------------------------

       tb05      | 2016-02-18 22:47:32.770076-02

       tb05_2016 |

       tb05_2015 |

      (3 rows)

 

      nunks=# \d+ tb05;

                                                           Table "public.tb05"

               Column        |            Type             | Modifiers | Storage  | Stats target | Description

      ---------------------+-----------------------------+-----------+----------+--------------+-------------

        a                  | character varying(8)        | not null  | extended |              |

        b                  | date                        | not null  | plain    |              |

        c                  | timestamp without time zone | not null  | plain    |              |

        d                  | timestamp without time zone | not null  | plain    |              |

        e                  | bigint                      | not null  | plain    |              |

        f                  | bigint                      | not null  | plain    |              |

        g                  | bigint                      | not null  | plain    |              |

        h                  | bigint                      | not null  | plain    |              |

        i                  | bigint                      | not null  | plain    |              |

        j                  | bigint                      | not null  | plain    |              |

        k                  | bigint                      | not null  | plain    |              |

        l                  | bigint                      | not null  | plain    |              |

        m                  | bigint                      | not null  | plain    |              |

        n                  | bigint                      | not null  | plain    |              |

        o                  | bigint                      | not null  | plain    |              |

        p                  | bigint                      | not null  | plain    |              |

        q                  | bigint                      | not null  | plain    |              |

        r                  | bigint                      | not null  | plain    |              |

      Indexes:

            "tb05_pkey" PRIMARY KEY, btree (a, b)

      Triggers:

            tr_partition_tb05 BEFORE INSERT ON tb05 FOR EACH ROW EXECUTE PROCEDURE fn_partition_tb05()

      Child tables: tb05_2015,

                          tb05_2016

 

Thanks!

 Nunks


Re: ANALYZE'ing table hierarchies

От
Tom Lane
Дата:
nunks <nunks.lol@gmail.com> writes:
> What eludes me is that part of the documentation says we should analyze
> each table individually, and another part says we can count on ANALYZE to
> work on the entire inheritance tree...

Whole-tree stats are a different thing from per-table stats.  What
ANALYZE on a parent table produces is (1) stats for that table alone
and (2) summary stats for the whole inheritance tree rooted at that table.
If you want (3) stats for a child table alone then you need to ANALYZE
that child table by name.

            regards, tom lane


Re: ANALYZE'ing table hierarchies

От
nunks
Дата:
Thanks answering, Tom!

I deal with some big (TB-level) partitioned tables where the application does a hybrid of querying both the master and the child tables directly. This clarification will be very useful.

Thank you again!

Nunks

----------
“Life beats down and crushes the soul and art reminds you that you have one.

- Stella Adler

On Fri, Feb 19, 2016 at 11:32 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
nunks <nunks.lol@gmail.com> writes:
> What eludes me is that part of the documentation says we should analyze
> each table individually, and another part says we can count on ANALYZE to
> work on the entire inheritance tree...

Whole-tree stats are a different thing from per-table stats.  What
ANALYZE on a parent table produces is (1) stats for that table alone
and (2) summary stats for the whole inheritance tree rooted at that table.
If you want (3) stats for a child table alone then you need to ANALYZE
that child table by name.

                        regards, tom lane

Re: ANALYZE'ing table hierarchies

От
Tom Lane
Дата:
nunks <nunks.lol@gmail.com> writes:
> I deal with some big (TB-level) partitioned tables where the application
> does a hybrid of querying both the master and the child tables directly.
> This clarification will be very useful.

BTW, the autovacuum daemon is normally fairly good about analyzing
tables when the individual-table stats need updating (or if you think
it isn't, you can adjust its parameters to make it more or less
aggressive about that).  Where it falls down is actually in maintenance
of whole-tree stats: it does not realize that a bunch of updates on a
child table should prompt auto-analyze of the parent table as well as
the child.  So I'd actually not think that you need to worry too much
about manual ANALYZEs of the children.  You might indeed need to help
things out with periodic manual ANALYZEs of the parent table, though.

            regards, tom lane