Обсуждение: BUG #15245: pg_stat_all_tables does not include partition mastertables

Поиск
Список
Период
Сортировка

BUG #15245: pg_stat_all_tables does not include partition mastertables

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      15245
Logged by:          Mahadevan Ramachandran
Email address:      mahadevan@rapidloop.com
PostgreSQL version: 10.4
Operating system:   any
Description:

The system view pg_catalog.pg_stat_all_tables (and therefore also
pg_catalog.pg_stat_user_tables) do not list partition master tables (CREATE
TABLE ... PARTITION BY ..).

This is because it only includes rows from pg_class with relkind IN ('r',
't', 'm')  (as defined in src/backend/catalog/system_views.sql).

IMHO, it should also include rows with relkind = 'p'.


Re: BUG #15245: pg_stat_all_tables does not include partition mastertables

От
Michael Paquier
Дата:
On Sun, Jun 17, 2018 at 08:54:06AM +0000, PG Bug reporting form wrote:
> The system view pg_catalog.pg_stat_all_tables (and therefore also
> pg_catalog.pg_stat_user_tables) do not list partition master tables (CREATE
> TABLE ... PARTITION BY ..).
>
> This is because it only includes rows from pg_class with relkind IN ('r',
> 't', 'm')  (as defined in src/backend/catalog/system_views.sql).
>
> IMHO, it should also include rows with relkind = 'p'.

Why?  A parent partition has no physical presence and no data of its
own.
--
Michael

Вложения

Re: BUG #15245: pg_stat_all_tables does not include partition master tables

От
Mahadevan Ramachandran
Дата:
Well, to the user it looks like a table because he created it
using a CREATE TABLE command and he can use it like
a normal table.

The view is named "_all_tables" so he'd expect, well, all
the tables to be listed there.

Doesn't it seem logical?

-Mahadevan


On Sun, Jun 17, 2018 at 5:36 PM, Michael Paquier <michael@paquier.xyz> wrote:
> On Sun, Jun 17, 2018 at 08:54:06AM +0000, PG Bug reporting form wrote:
>> The system view pg_catalog.pg_stat_all_tables (and therefore also
>> pg_catalog.pg_stat_user_tables) do not list partition master tables (CREATE
>> TABLE ... PARTITION BY ..).
>>
>> This is because it only includes rows from pg_class with relkind IN ('r',
>> 't', 'm')  (as defined in src/backend/catalog/system_views.sql).
>>
>> IMHO, it should also include rows with relkind = 'p'.
>
> Why?  A parent partition has no physical presence and no data of its
> own.
> --
> Michael


Re: BUG #15245: pg_stat_all_tables does not include partition master tables

От
"David G. Johnston"
Дата:
On Sunday, June 17, 2018, Mahadevan Ramachandran <mahadevan@rapidloop.com> wrote:

The view is named "_all_tables" so he'd expect, well, all
the tables to be listed there.

My first reaction was to agree, which suggests a need for a note in the docs why this isn't the case.  The _all_ specifically means both user and system; the _stats_ in the name means that only tables that have statistics are included.  I feel this is a reasonable decision.

David J.

Re: BUG #15245: pg_stat_all_tables does not include partition mastertables

От
Michael Paquier
Дата:
On Sun, Jun 17, 2018 at 08:29:33AM -0700, David G. Johnston wrote:
> My first reaction was to agree, which suggests a need for a note in the
> docs why this isn't the case.  The _all_ specifically means both user and
> system; the _stats_ in the name means that only tables that have statistics
> are included.  I feel this is a reasonable decision.

If there were anything to happen here, then I think that it would be
hard to define what the statistics of the parent partition should
reflect.  For example, the autovacuum and autoanalyze run times are I
think tricky as you cannot really define the last time autovacuum has
been run on the parent as the last time it has been run only only one of
the partitions or a sub-set of them.  This gets also messier if you come
across multiple partition layers.
--
Michael

Вложения

Re: BUG #15245: pg_stat_all_tables does not include partition master tables

От
David Rowley
Дата:
On 17 June 2018 at 20:54, PG Bug reporting form <noreply@postgresql.org> wrote:
> The system view pg_catalog.pg_stat_all_tables (and therefore also
> pg_catalog.pg_stat_user_tables) do not list partition master tables (CREATE
> TABLE ... PARTITION BY ..).
>
> This is because it only includes rows from pg_class with relkind IN ('r',
> 't', 'm')  (as defined in src/backend/catalog/system_views.sql).
>
> IMHO, it should also include rows with relkind = 'p'.

FWIW there was a similar discussion of whether
pg_relation_size('<partitioned table>') should report the sizes of all
its partitions in [1].  Most people seemed to vote to leave it
returning 0.

Some of the same reasoning might apply to this case.

[1] https://www.postgresql.org/message-id/495cec7e-f8d9-7e13-4807-90dbf4eec4ea@lab.ntt.co.jp

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: BUG #15245: pg_stat_all_tables does not include partition mastertables

От
Michael Paquier
Дата:
On Mon, Jun 18, 2018 at 05:49:47PM +1200, David Rowley wrote:
> FWIW there was a similar discussion of whether
> pg_relation_size('<partitioned table>') should report the sizes of all
> its partitions in [1].  Most people seemed to vote to leave it
> returning 0.
>
> Some of the same reasoning might apply to this case.
>
> [1] https://www.postgresql.org/message-id/495cec7e-f8d9-7e13-4807-90dbf4eec4ea@lab.ntt.co.jp

Having a wrapper on top of find_all_inheritors() which grabs all the
relations in a partition tree would be more helpful for all those cases
in my opinion, and this could just aggregate with pg_relation_size.
That's also discussed at the bottom of the thread David is mentioning.
It would be nice to get that into v12.
--
Michael

Вложения

Re: BUG #15245: pg_stat_all_tables does not include partition mastertables

От
Amit Langote
Дата:
On 2018/06/18 15:05, Michael Paquier wrote:
> On Mon, Jun 18, 2018 at 05:49:47PM +1200, David Rowley wrote:
>> FWIW there was a similar discussion of whether
>> pg_relation_size('<partitioned table>') should report the sizes of all
>> its partitions in [1].  Most people seemed to vote to leave it
>> returning 0.
>>
>> Some of the same reasoning might apply to this case.
>>
>> [1] https://www.postgresql.org/message-id/495cec7e-f8d9-7e13-4807-90dbf4eec4ea@lab.ntt.co.jp
> 
> Having a wrapper on top of find_all_inheritors() which grabs all the
> relations in a partition tree would be more helpful for all those cases
> in my opinion, and this could just aggregate with pg_relation_size.
> That's also discussed at the bottom of the thread David is mentioning.
> It would be nice to get that into v12.

Yeah.  Looking at that patch and subsequent comments on it, it seems we'd
like to have a function, say, pg_get_inheritance_tables
(pg_partition_tree_tables was what was used in one of the patches that
were posted), which can be used as follows:

create table p (a int) partition by list (a);
create table p123 partition of p for values in (1, 2, 3) partition by list
(a);
create table p12 partition of p123 for values in (1, 2) partition by list (a);
create table p1 partition of p12 for values in (1);
create table p2 partition of p12 for values in (2);
create table p3 partition of p123 for values in (3);
create index on p (a);
insert into p select i % 3 + 1 from generate_series(1, 1000) i;

select  p as relname,
        pg_partition_parent(p) as parent,
        pg_partition_root(p) as root_parent,
        pg_total_relation_size(p) as size
from    pg_get_inheritance_tables('p') p
order by 4;
 relname | parent | root_parent | size
---------+--------+-------------+-------
 p       |        | p           |     0
 p123    | p      | p           |     0
 p12     | p123   | p           |     0
 p3      | p123   | p           | 57344
 p1      | p12    | p           | 57344
 p2      | p12    | p           | 57344
(6 rows)

select  pg_size_pretty(sum(pg_total_relation_size(p))) as size
from    pg_get_inheritance_tables('p') p
  size
--------
 168 kB
(1 row)

Thanks,
Amit



Re: BUG #15245: pg_stat_all_tables does not include partition mastertables

От
Amit Langote
Дата:
On 2018/06/18 11:06, Michael Paquier wrote:
> On Sun, Jun 17, 2018 at 08:29:33AM -0700, David G. Johnston wrote:
>> My first reaction was to agree, which suggests a need for a note in the
>> docs why this isn't the case.  The _all_ specifically means both user and
>> system; the _stats_ in the name means that only tables that have statistics
>> are included.  I feel this is a reasonable decision.
> 
> If there were anything to happen here, then I think that it would be
> hard to define what the statistics of the parent partition should
> reflect.  For example, the autovacuum and autoanalyze run times are I
> think tricky as you cannot really define the last time autovacuum has
> been run on the parent as the last time it has been run only only one of
> the partitions or a sub-set of them.  This gets also messier if you come
> across multiple partition layers.

I think autovacuum launcher ignores/skips relkind 'p' relations.  Also, we
don't emit any kind of statistics for relkind 'p' relations to the stats
collector, as individual inserts/deletes into such relations are rather
counted as insert/deletes on partitions into which they're routed.

Thanks,
Amit



Re: BUG #15245: pg_stat_all_tables does not include partition master tables

От
David Rowley
Дата:
On 18 June 2018 at 21:48, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2018/06/18 11:06, Michael Paquier wrote:
>> On Sun, Jun 17, 2018 at 08:29:33AM -0700, David G. Johnston wrote:
>>> My first reaction was to agree, which suggests a need for a note in the
>>> docs why this isn't the case.  The _all_ specifically means both user and
>>> system; the _stats_ in the name means that only tables that have statistics
>>> are included.  I feel this is a reasonable decision.
>>
>> If there were anything to happen here, then I think that it would be
>> hard to define what the statistics of the parent partition should
>> reflect.  For example, the autovacuum and autoanalyze run times are I
>> think tricky as you cannot really define the last time autovacuum has
>> been run on the parent as the last time it has been run only only one of
>> the partitions or a sub-set of them.  This gets also messier if you come
>> across multiple partition layers.
>
> I think autovacuum launcher ignores/skips relkind 'p' relations.  Also, we
> don't emit any kind of statistics for relkind 'p' relations to the stats
> collector, as individual inserts/deletes into such relations are rather
> counted as insert/deletes on partitions into which they're routed.

I think what Michael was meaning is that there is no clear way to
aggregate the last_* columns in that view. For the bigint columns,
it's a bit more clear, probably taking the sum() of all partitions
would be the way to go. However, what do you do with the timestamp
columns? Max()... ?  it was only really a "partial" vacuum of the
partitioned table, so maybe NULL?  I'm not sure.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: BUG #15245: pg_stat_all_tables does not include partition mastertables

От
Amit Langote
Дата:
On 2018/06/18 19:00, David Rowley wrote:
> On 18 June 2018 at 21:48, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> On 2018/06/18 11:06, Michael Paquier wrote:
>>> On Sun, Jun 17, 2018 at 08:29:33AM -0700, David G. Johnston wrote:
>>>> My first reaction was to agree, which suggests a need for a note in the
>>>> docs why this isn't the case.  The _all_ specifically means both user and
>>>> system; the _stats_ in the name means that only tables that have statistics
>>>> are included.  I feel this is a reasonable decision.
>>>
>>> If there were anything to happen here, then I think that it would be
>>> hard to define what the statistics of the parent partition should
>>> reflect.  For example, the autovacuum and autoanalyze run times are I
>>> think tricky as you cannot really define the last time autovacuum has
>>> been run on the parent as the last time it has been run only only one of
>>> the partitions or a sub-set of them.  This gets also messier if you come
>>> across multiple partition layers.
>>
>> I think autovacuum launcher ignores/skips relkind 'p' relations.  Also, we
>> don't emit any kind of statistics for relkind 'p' relations to the stats
>> collector, as individual inserts/deletes into such relations are rather
>> counted as insert/deletes on partitions into which they're routed.
> 
> I think what Michael was meaning is that there is no clear way to
> aggregate the last_* columns in that view. For the bigint columns,
> it's a bit more clear, probably taking the sum() of all partitions
> would be the way to go. However, what do you do with the timestamp
> columns? Max()... ?  it was only really a "partial" vacuum of the
> partitioned table, so maybe NULL?  I'm not sure.

I'm not sure of the vacuum (maybe NULL's the answer in that case), but if
we ever teach autovacuum the ability to *analyze* partitioned tables, then
we could simply record the time when that happens, instead of worrying
about how to aggregate across partitions.  We could leave that NULL for
now, that is, if we decide to list partitioned tables in the
pg_stat_all_tables output at all re OP's complaint.

Thanks,
Amit



Re: BUG #15245: pg_stat_all_tables does not include partition mastertables

От
Michael Paquier
Дата:
On Mon, Jun 18, 2018 at 07:21:17PM +0900, Amit Langote wrote:
> I'm not sure of the vacuum (maybe NULL's the answer in that case), but if
> we ever teach autovacuum the ability to *analyze* partitioned tables, then
> we could simply record the time when that happens, instead of worrying
> about how to aggregate across partitions.  We could leave that NULL for
> now, that is, if we decide to list partitioned tables in the
> pg_stat_all_tables output at all re OP's complaint.

Even for the last_* columns, it is not really possible to have one
single definition for the partition parents.  Here are a set of stats
which could be useful depending on the context:
- average of the last runs.
- deviation of the last runs.
- Latest one which ran.
- Oldest one which ran.

The same applies for the tuples inserted and/or deleted, scans,
etc. Users may want to know the deviation to check for the balance
across partitions, or the sum of them, or even the average.  For those
reasons having entries in pg_stat_all_tables or such for parent
partitions has little meaning, and having something which allows to grab
a complete partition tree has way more value because it becomes easier
to gather stats about the whole tree.
--
Michael

Вложения

Re: BUG #15245: pg_stat_all_tables does not include partition master tables

От
Mahadevan Ramachandran
Дата:
On Mon, Jun 18, 2018 at 3:51 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> about how to aggregate across partitions.  We could leave that NULL for
> now, that is, if we decide to list partitioned tables in the
> pg_stat_all_tables output at all re OP's complaint.

"Complaint" is too strong a word :-)

I guess the real issue is that both options are a bit inconsistent. Not listing
a partitoned table in an "all tables" view does not seem any worse or better
than having it there with all columns as NULL. Should've thought about it
before I raised the bug :-)

Might I suggest a new system view that lists all (parent, child) associations,
which can then be joined with pg_stat_all_tables so users can roll up
child table stats into each parent using whatever aggregation functions
they'd like?

Thanks for your time.

-Mahadevan


Re: BUG #15245: pg_stat_all_tables does not include partition mastertables

От
Amit Langote
Дата:
Hello.

On 2018/06/18 22:58, Mahadevan Ramachandran wrote:
> On Mon, Jun 18, 2018 at 3:51 PM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> about how to aggregate across partitions.  We could leave that NULL for
>> now, that is, if we decide to list partitioned tables in the
>> pg_stat_all_tables output at all re OP's complaint.
> 
> "Complaint" is too strong a word :-)

Ah, maybe "opinion" would've been more appropriate. ;-)

> I guess the real issue is that both options are a bit inconsistent. Not listing
> a partitoned table in an "all tables" view does not seem any worse or better
> than having it there with all columns as NULL. Should've thought about it
> before I raised the bug :-)

Yeah.

> Might I suggest a new system view that lists all (parent, child) associations,
> which can then be joined with pg_stat_all_tables so users can roll up
> child table stats into each parent using whatever aggregation functions
> they'd like?

I guess that's more or less what Michael mentioned upthread.  That we need
a user-level function(s) that exposes the partition tree structure and
perhaps an accompanying view that makes inspecting the stats, relation
sizes, etc. of the tables in a given partition tree easier.

Thanks,
Amit



Re: BUG #15245: pg_stat_all_tables does not include partition mastertables

От
Amit Langote
Дата:
On 2018/06/18 21:49, Michael Paquier wrote:
> On Mon, Jun 18, 2018 at 07:21:17PM +0900, Amit Langote wrote:
>> I'm not sure of the vacuum (maybe NULL's the answer in that case), but if
>> we ever teach autovacuum the ability to *analyze* partitioned tables, then
>> we could simply record the time when that happens, instead of worrying
>> about how to aggregate across partitions.  We could leave that NULL for
>> now, that is, if we decide to list partitioned tables in the
>> pg_stat_all_tables output at all re OP's complaint.
> 
> Even for the last_* columns, it is not really possible to have one
> single definition for the partition parents.  Here are a set of stats
> which could be useful depending on the context:
>
> - average of the last runs.
> - deviation of the last runs.
> - Latest one which ran.
> - Oldest one which ran.

That seems to assume that we'll never support certain operations whose
information appears in a row of a pg_stat_* view on partitioned table parents.

For example, suppose we add a feature where autovacuum forks a worker to
analyze partitioned table parent, wouldn't it be appropriate to have it
reported via pg_stat_* with that table having its own entry?

> The same applies for the tuples inserted and/or deleted, scans,
> etc. Users may want to know the deviation to check for the balance
> across partitions, or the sum of them, or even the average.  For those
> reasons having entries in pg_stat_all_tables or such for parent
> partitions has little meaning, and having something which allows to grab
> a complete partition tree has way more value because it becomes easier
> to gather stats about the whole tree.

Yes, if we have such partition tree structure exposing function, users can
aggregate stats for partitions of a parent in all sorts of ways.  But the
question is whether a row in pg_stat_* view for a parent is responsible
for showing it.  If we ever have such rows, then I suppose it would show
the same information for partitioned tables as they do for normal tables,
but perhaps the amount of useful information would be much less than for
normal tables.

Thanks,
Amit