Обсуждение: Add starelid, attnum to pg_stats and leverage this in pg_dump

Поиск
Список
Период
Сортировка

Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Corey Huinker
Дата:

Per side conversation in [1], this patch exposes pg_statistic.starelid in the pg_stats view (0001) and then changes pg_dump to use the starelid in the queries on pg_stats rather than the combination of schemaname+relname, which gets a bit clumsy in bulk array fetching, and also leads to bad query plans against pg_stats because the security barrier is also an optimization barrier, which means that queries often try to hash join, which causes the very large table pg_statistic to be sequentially scanned, and that's a bad time. Currently we get around this by adding a redundant qual to the query which gooses the plan towards an index, and that works fine for now, but there's no guarantee that it will continue to work in the future, so this change brings some plan safety as well.

0001 also exposes pg_statistic.attnum. This is no direct application of this in 0002, but people have often expressed surprise that pg_dump orders the pg_restore_attribute_stats() calls by attname rather than attnum, and if we wanted to change that now we could (albeit only on new versions).

Вложения

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Sami Imseih
Дата:
Hi,

> Per side conversation in [1], this patch exposes pg_statistic.starelid in the pg_stats view (0001)
> and then changes pg_dump to use the starelid in the queries on pg_stats rather than the
> combination of schemaname+relname, which gets a bit clumsy in bulk array fetching, and
> also leads to bad query plans against pg_stats because the security barrier is also an
> optimization barrier, which means that queries often try to hash join, which causes the
> very large table pg_statistic to be sequentially scanned, and that's a bad time. Currently
> we get around this by adding a redundant qual to the query which gooses the plan towards
> an index, and that works fine for now, but there's no guarantee that it will continue to work
> in the future, so this change brings some plan safety as well.

This alone seems like a good enough reason to include startled in
pg_stat, besides other
future use-cases this will simplify as discussed in [1]

> 0001 also exposes pg_statistic.attnum. This is no direct application of this in 0002,
> but people have often expressed surprise that pg_dump orders the
> pg_restore_attribute_stats() calls by attname rather than attnum,
> and if we wanted to change that now we could (albeit only on new versions).

This seems logical

v1-0001 comments:

1/

-        nspname AS schemaname,
-        relname AS tablename,
-        attname AS attname,
+        n.nspname AS schemaname,
+        c.relname AS tablename,
+        s.starelid AS starelid,
+        a.attnum AS attnum,
+        a.attname AS attname,

Ideally, we would want the OID and attnum to be first columns to match
the pg_stat_activity/pg_stat_progress style, but that will be too invasive.

The new attname column should be moved after tablename, however.

2/

+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>attnum</structfield> <type>name</type>
+       (references <link
linkend="catalog-pg-attribute"><structname>pg_attribute</structname></link>.<structfield>attnum</structfield>)
+      </para>
+      <para>
+       Position of column described by this row
+      </para></entry>
+     </row>

- This <type> should be int2
- Maybe for the describtion, it should be something like:

  "The number of the column, as in
<structname>pg_attribute</structname>.<structfield>attnum</structfield>"

  to be close to the description in pg_attribute

3/

+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>starelid</structfield> <type>oid</type>
+      </para>
+      <para>
+       ID of the relation
+      </para></entry>
+     </row>

This should indicate it is a reference to pg_class.oid, like so:

```
(references <link
linkend="catalog-pg-class"><structname>pg_class</structname></link>.<structfield>oid</structfield>)
```

Maybe "ID of the table or index" is better, since this can only be a
table or index
for pg_stats.

I dislike the existing "pg_stats.tablename", since this can also be an
expression index.
"pg_stats.relation" with a description of "Name of table or index" is
more appropriate.
It is a change that we can possibly make in a major version. Looked
through the archives,
and did not see this being reported/discussed.

v1-0002:

I examined the version variants of getAttributeStats and all looks good,
and also ran a test on a 18 and 19 server version with the patched pg_dump
client and all looks good.

One minor comment is:


+                    pg_fatal("statistics table oid information missing");


I noticed in other pg_fatal messages, we include OIDs

            pg_fatal("could not find function definition for function
with OID %u",
                     cast->castfunc);

Should we do the same here?



[1] https://www.postgresql.org/message-id/aZ3RbTE8Racscykc@nathan

--
Sami Imseih
Amazon Web Services (AWS)



Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Tom Lane
Дата:
Corey Huinker <corey.huinker@gmail.com> writes:
> Per side conversation in [1], this patch exposes pg_statistic.starelid in
> the pg_stats view (0001) and then changes pg_dump to use the starelid in
> the queries on pg_stats rather than the combination of schemaname+relname,

I don't object to the idea, but I don't like exposing the name
"starelid".  That just screams implementation artifact, and it
goes against the idea that pg_stats is trying to hide the physical
representation of pg_statistic.  I wish we could use "tableoid",
but that's taken already as a system column.  Maybe "tableid" or
"tablerelid"?

Also, the proposed column ordering seems excessively random:

        n.nspname AS schemaname,
        c.relname AS tablename,
        s.starelid AS starelid,
        a.attnum AS attnum,
        a.attname AS attname,
        stainherit AS inherited,

I could see either of these as plausible:

        n.nspname AS schemaname,
        c.relname AS tablename,
        s.starelid AS tableid,
        a.attname AS attname,
        a.attnum AS attnum,
        stainherit AS inherited,

        n.nspname AS schemaname,
        c.relname AS tablename,
        a.attname AS attname,
        s.starelid AS tableid,
        a.attnum AS attnum,
        stainherit AS inherited,

but I don't see the rationale behind your version.

            regards, tom lane



Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Corey Huinker
Дата:



Ideally, we would want the OID and attnum to be first columns to match
the pg_stat_activity/pg_stat_progress style, but that will be too invasive.

I don't think there is an order that will make anyone happy, let alone everyone.

The existing columns (schemaname, tablename, attname, inherited) constitute a grain, so its hard to separate them.

Once we add attnum, that can replace attname in the above grain. Once we add starelid/tableid, then that can be combined win inherited and either attnum/attname for a grain. The ordering of these columns will largely fall down to the perspective of the person seeking to use it.

 

- This <type> should be int2
- Maybe for the describtion, it should be something like:

+1
 

  "The number of the column, as in
<structname>pg_attribute</structname>.<structfield>attnum</structfield>"

We don't do that for attname, why would attnum be different?
 


+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>starelid</structfield> <type>oid</type>
+      </para>
+      <para>
+       ID of the relation
+      </para></entry>
+     </row>

This should indicate it is a reference to pg_class.oid, like so:

I went with pg_attribute, but the reference is there now.
 

Maybe "ID of the table or index" is better, since this can only be a
table or index
for pg_stats.

I went with table, because it matches the tablename definition (i.e. it is equally inaccurate).
 

I dislike the existing "pg_stats.tablename", since this can also be an
expression index.
"pg_stats.relation" with a description of "Name of table or index" is
more appropriate.
It is a change that we can possibly make in a major version. Looked
through the archives,
and did not see this being reported/discussed.

I don't see it changing in any version, minor or major.

 

v1-0002:

I examined the version variants of getAttributeStats and all looks good,
and also ran a test on a 18 and 19 server version with the patched pg_dump
client and all looks good.

One minor comment is:


+                    pg_fatal("statistics table oid information missing");


I noticed in other pg_fatal messages, we include OIDs

            pg_fatal("could not find function definition for function
with OID %u",
                     cast->castfunc);

Should we do the same here?

If I had the oid, I wouldn't have the error. :)

I've modified the error message to give the namespace/tag of the entry. Will post revised patch later.

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Corey Huinker
Дата:
I don't object to the idea, but I don't like exposing the name
"starelid".  That just screams implementation artifact, and it
goes against the idea that pg_stats is trying to hide the physical
representation of pg_statistic.  I wish we could use "tableoid",
but that's taken already as a system column.  Maybe "tableid" or
"tablerelid"?

I went with tableid, as it pairs with tablename, and at least they're inaccurate in the same way.
 

Also, the proposed column ordering seems excessively random:

        n.nspname AS schemaname,
        c.relname AS tablename,
        s.starelid AS starelid,
        a.attnum AS attnum,
        a.attname AS attname,
        stainherit AS inherited,

I could see either of these as plausible:

        n.nspname AS schemaname,
        c.relname AS tablename,
        s.starelid AS tableid,
        a.attname AS attname,
        a.attnum AS attnum,
        stainherit AS inherited,

        n.nspname AS schemaname,
        c.relname AS tablename,
        a.attname AS attname,
        s.starelid AS tableid,
        a.attnum AS attnum,
        stainherit AS inherited,

but I don't see the rationale behind your version.

The rationale went like this:

In order to get the grain of this table you need:
    (  (schemaname AND tablename) OR tableid )
    AND
    (attnum OR attname)
    AND
    inherited

As I stated earlier, I don't think there's a way to "win" with the column ordering here, any ordering we choose will look awkward to at least one use case. Your first suggestion seems fine to me, so I'm going with that one in this round.


Вложения

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Sami Imseih
Дата:
>> I dislike the existing "pg_stats.tablename", since this can also be an
>> expression index.
>> "pg_stats.relation" with a description of "Name of table or index" is
>> more appropriate.
>> It is a change that we can possibly make in a major version. Looked
>> through the archives,
>> and did not see this being reported/discussed.
>
>
> I don't see it changing in any version, minor or major.

This could be a separate discussion as it's not the fault of this patch,
but clearly "tablename" is not correct here.

>> I noticed in other pg_fatal messages, we include OIDs
>>
>>             pg_fatal("could not find function definition for function
>> with OID %u",
>>                      cast->castfunc);
>>
>> Should we do the same here?
>
>
> If I had the oid, I wouldn't have the error. :)

oops, you're right.


I noticed that you changed the tests to selecting individual columns. I am
not clear as to why this is better?

-SELECT *
+SELECT schemaname, tablename, attname, attnum, inherited, null_frac, avg_width,
+    n_distinct, most_common_vals, most_common_freqs, histogram_bounds,
+    correlation, most_common_elems, most_common_elem_freqs,
+    elem_count_histogram, range_length_histogram, range_empty_frac,
+    range_bounds_histogram

Otherwise v2 LGTM.

--
Sami Imseih
Amazon Web Services (AWS)



Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Corey Huinker
Дата:
>> "pg_stats.relation" with a description of "Name of table or index" is
>> more appropriate.
>> It is a change that we can possibly make in a major version. Looked
>> through the archives,
>> and did not see this being reported/discussed.
>
>
> I don't see it changing in any version, minor or major.

This could be a separate discussion as it's not the fault of this patch,
but clearly "tablename" is not correct here.

Certainly up for debate, but changing it would break existing scripts, and that's generally a non-starter around here.
 
I noticed that you changed the tests to selecting individual columns. I am
not clear as to why this is better?

-SELECT *
+SELECT schemaname, tablename, attname, attnum, inherited, null_frac, avg_width,
+    n_distinct, most_common_vals, most_common_freqs, histogram_bounds,
+    correlation, most_common_elems, most_common_elem_freqs,
+    elem_count_histogram, range_length_histogram, range_empty_frac,
+    range_bounds_histogram

Well, the oid is now a part of the view, and that's not stable from one regression run to the next, so we have to exclude it.

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Sami Imseih
Дата:
>> I noticed that you changed the tests to selecting individual columns. I am
>> not clear as to why this is better?
>>
>> -SELECT *
>> +SELECT schemaname, tablename, attname, attnum, inherited, null_frac, avg_width,
>> +    n_distinct, most_common_vals, most_common_freqs, histogram_bounds,
>> +    correlation, most_common_elems, most_common_elem_freqs,
>> +    elem_count_histogram, range_length_histogram, range_empty_frac,
>> +    range_bounds_histogram

Makes sense.

BTW, I do not see a CF enty for this.

Thanks!

--
Sami Imseih
Amazon Web Services (AWS)



Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Corey Huinker
Дата:

BTW, I do not see a CF enty for this.

 

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Nathan Bossart
Дата:
On Thu, Mar 05, 2026 at 04:40:56PM -0500, Corey Huinker wrote:
> I just created https://commitfest.postgresql.org/patch/6568/ - thanks!

I just skimmed through the latest patches, and they look generally
reasonable to me.  I'll take a closer look soon and hopefully get these
committed.

Note to self: 0001 requires a catversion bump.

-- 
nathan



Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Nathan Bossart
Дата:
On Fri, Mar 06, 2026 at 05:08:58PM -0600, Nathan Bossart wrote:
> I just skimmed through the latest patches, and they look generally
> reasonable to me.  I'll take a closer look soon and hopefully get these
> committed.

Oh, a question I forgot to ask: why wouldn't we do the same thing for
pg_stats_ext and pg_stats_ext_exprs?

-- 
nathan



Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Sami Imseih
Дата:
> Oh, a question I forgot to ask: why wouldn't we do the same thing for
> pg_stats_ext and pg_stats_ext_exprs?

The primary purpose of adding the relid is to optimize queries on
pg_statistics in pg_dump.

+ *
* The redundant filter clause on s.tablename = ANY(...) seems
* sufficient to convince the planner to use
* pg_class_relname_nsp_index, which avoids a full scan of pg_stats.
- * This may not work for all versions.
+ * This seems to work for all version prior to v19, after which we
+ * will use the starelid, which is simpler.

pg_stats_ext and pg_stats_ext_expr do not have the same concern.

If we do add relid there, it will be for consistency only.

--
Sami Imseih
Amazon Web Services (AWS)



Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Nathan Bossart
Дата:
On Mon, Mar 09, 2026 at 10:44:58AM -0500, Sami Imseih wrote:
>> Oh, a question I forgot to ask: why wouldn't we do the same thing for
>> pg_stats_ext and pg_stats_ext_exprs?
> 
> [...]
> 
> If we do add relid there, it will be for consistency only.

It might be only for consistency for now, but it strikes me as something
that could be handy down the road (in which case it'd be nice to minimize
the number of versions that need a fallback).

-- 
nathan



Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Corey Huinker
Дата:


On Mon, Mar 9, 2026 at 11:51 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Mar 09, 2026 at 10:44:58AM -0500, Sami Imseih wrote:
>> Oh, a question I forgot to ask: why wouldn't we do the same thing for
>> pg_stats_ext and pg_stats_ext_exprs?
>
> [...]
>
> If we do add relid there, it will be for consistency only.

It might be only for consistency for now, but it strikes me as something
that could be handy down the road (in which case it'd be nice to minimize
the number of versions that need a fallback).

--
nathan

You're both right. Currently, we fetch extended stats one at a time, thus there's no _immediate_ need to do so.

But "why wait until there is a crisis" is solid reasoning and for that I had already coded up the change. I did have one small problem in that Michael Paquier had hoped that we could get the expr index (-1, -2, etc) on the expressions as that was something that we at least briefly thought we'd need for importing expression statistics. However the existing query uses a SELECT unnest(a), unnest(b) pattern in it, and WITH ORDINALITY is not allowed, and the workarounds I found seemed a bit tortured. Hence, I decided to leave that out so as not to distract from the now accepted patch, and with that out of the way I'll happily inflict that tortured SQL on y'all. 

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Sami Imseih
Дата:
>> nathan
>
>
> You're both right. Currently, we fetch extended stats one at a time, thus there's no _immediate_ need to do so.
>
> But "why wait until there is a crisis" is solid reasoning and for that I had already coded up the change.
>I did have one small problem in that Michael Paquier had hoped that we could get the expr index (-1, -2, etc) on the
expressions
> as that was something that we at least briefly thought we'd need for importing expression statistics. However the
existingquery
 
> uses a SELECT unnest(a), unnest(b) pattern in it, and WITH ORDINALITY is not allowed, and the workarounds I found
seemeda
 
> bit tortured. Hence, I decided to leave that out so as not to distract from the now accepted patch, and with that out
ofthe way
 
> I'll happily inflict that tortured SQL on y'all.

Can we add the tableid  for now (see 0003) and follow-up with another
thread with the sql changes to pg_dump?

I also corrected v2-0001. The docs were still referencing "starlid"
instead of "tableid"

--
Sami Imseih
Amazon Web Services (AWS)

Вложения

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Corey Huinker
Дата:


On Mon, Mar 9, 2026 at 2:56 PM Sami Imseih <samimseih@gmail.com> wrote:
>> nathan
>
>
> You're both right. Currently, we fetch extended stats one at a time, thus there's no _immediate_ need to do so.
>
> But "why wait until there is a crisis" is solid reasoning and for that I had already coded up the change.
>I did have one small problem in that Michael Paquier had hoped that we could get the expr index (-1, -2, etc) on the expressions
> as that was something that we at least briefly thought we'd need for importing expression statistics. However the existing query
> uses a SELECT unnest(a), unnest(b) pattern in it, and WITH ORDINALITY is not allowed, and the workarounds I found seemed a
> bit tortured. Hence, I decided to leave that out so as not to distract from the now accepted patch, and with that out of the way
> I'll happily inflict that tortured SQL on y'all.

Can we add the tableid  for now (see 0003) and follow-up with another
thread with the sql changes to pg_dump?

Presently, I don't think we make any changes to pg_dump, unless Nathan feels strongly that we should. If and when the need for oid-based fetching of extended stats becomes necessary, we'll at least have a couple versions where the catalog already had the oids handy.
 

I also corrected v2-0001. The docs were still referencing "starlid"
instead of "tableid"

I'll make that change in my forthcoming patch. Just rebasing some things.


Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Nathan Bossart
Дата:
On Mon, Mar 09, 2026 at 03:28:40PM -0400, Corey Huinker wrote:
> Presently, I don't think we make any changes to pg_dump, unless Nathan
> feels strongly that we should. If and when the need for oid-based fetching
> of extended stats becomes necessary, we'll at least have a couple versions
> where the catalog already had the oids handy.

That's fine with me.

-- 
nathan



Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Corey Huinker
Дата:
On Mon, Mar 9, 2026 at 5:04 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Mar 09, 2026 at 03:28:40PM -0400, Corey Huinker wrote:
> Presently, I don't think we make any changes to pg_dump, unless Nathan
> feels strongly that we should. If and when the need for oid-based fetching
> of extended stats becomes necessary, we'll at least have a couple versions
> where the catalog already had the oids handy.

That's fine with me.

--
nathan

I was sidetracked for a bit because the tests (which are accurate) seemed strange in that pg_stats_ext_exprs would have 2 rows with NULL inherited...and it took me a bit to realize that that actually makes sense because the pg_stats_ext_exprs should have one row per expression whether or not there are stats to match it. If that sounds like it could mess up vacuumdb, it actually can't, because we base those tests based off pg_stats_ext not pg_stats_ext_exprs - which does vary according to whether stats exist for the object or not. I added a comment to the regression test to that effect.

The new column expr_attnum gives the negative number that a given expression has in pg_dependencites and pg_ndistinct - as you might imagine, the number is determined by its order within the list of expressions.

If you want 0003 split into two (one for pg_stats_ext and one for pg_stats_ext_exprs) I can do that, but they felt like a package deal to me.
Вложения

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Nathan Bossart
Дата:
On Mon, Mar 09, 2026 at 06:42:46PM -0400, Corey Huinker wrote:
> If you want 0003 split into two (one for pg_stats_ext and one for
> pg_stats_ext_exprs) I can do that, but they felt like a package deal to me.

Sorry to throw a curveball here, but I think the best way to structure this
patch set is as follows:

* 0001: Just the test changes (e.g., replacing * with column lists).  By
doing that first, the actual functionality changes in 0002 will be easier
to see.

* 0002: Adding columns to the all the views (this one will do a catalog
bump).  If we need to make a couple of test adjustments to verify the OIDs
or attribute IDs, that's fine, but those can be pretty minimal.

* 0003: Associated pg_dump changes.

-- 
nathan



Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Corey Huinker
Дата:


On Tue, Mar 10, 2026 at 12:37 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Mar 09, 2026 at 06:42:46PM -0400, Corey Huinker wrote:
> If you want 0003 split into two (one for pg_stats_ext and one for
> pg_stats_ext_exprs) I can do that, but they felt like a package deal to me.

Sorry to throw a curveball here, but I think the best way to structure this
patch set is as follows:

* 0001: Just the test changes (e.g., replacing * with column lists).  By
doing that first, the actual functionality changes in 0002 will be easier
to see.

Makes sense.
 

* 0002: Adding columns to the all the views (this one will do a catalog
bump).  If we need to make a couple of test adjustments to verify the OIDs
or attribute IDs, that's fine, but those can be pretty minimal.

So all three views in one go. +1
 

* 0003: Associated pg_dump changes.

On it.

 

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Corey Huinker
Дата:
On Tue, Mar 10, 2026 at 12:42 PM Corey Huinker <corey.huinker@gmail.com> wrote:


On Tue, Mar 10, 2026 at 12:37 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Mar 09, 2026 at 06:42:46PM -0400, Corey Huinker wrote:
> If you want 0003 split into two (one for pg_stats_ext and one for
> pg_stats_ext_exprs) I can do that, but they felt like a package deal to me.

Sorry to throw a curveball here, but I think the best way to structure this
patch set is as follows:

* 0001: Just the test changes (e.g., replacing * with column lists).  By
doing that first, the actual functionality changes in 0002 will be easier
to see.

Makes sense.
 

* 0002: Adding columns to the all the views (this one will do a catalog
bump).  If we need to make a couple of test adjustments to verify the OIDs
or attribute IDs, that's fine, but those can be pretty minimal.

So all three views in one go. +1
 

* 0003: Associated pg_dump changes.

On it.

Apologies for the delay. I thought I had posted this. I rebased it against a630ac5c2016e52 (current HEAD).
Вложения

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Sami Imseih
Дата:
> Apologies for the delay. I thought I had posted this. I rebased it against a630ac5c2016e52 (current HEAD).

Thanks!

just a few things.


v5-0001:

+-- Set up convenience views
+--
+CREATE VIEW stats_import.pg_stats_check AS

It will be good to add more commentary here to the next person adding
a test, and mention
that only stable values should be included in this view. what do you think?

v5-0002:

This is the wrong link. It should be catalog-pg-attribute and not
catalog-pg-class

+       (references <link
linkend="catalog-pg-class"><structname>pg_attribute</structname></link>.<structfield>attrelid</structfield>)

and a few occurrences where the link to pg_statistic_ext is pointing
to catalog-pg-class:

+       (references <link
linkend="catalog-pg-class"><structname>pg_statistic_ext</structname></link>.<structfield>stxrelid</structfield>)


v5-0003:

LGTM


--
Sami Imseih
Amazon Web Services (AWS)



Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Corey Huinker
Дата:



It will be good to add more commentary here to the next person adding
a test, and mention
that only stable values should be included in this view. what do you think?

I added a comment, but I don't think a test writer would be confused for long if they did break that rule.
 


+       (references <link
linkend="catalog-pg-class"><structname>pg_statistic_ext</structname></link>.<structfield>stxrelid</structfield>)

Fixed catalog-pg-attribute and catalog-pg-statistic-ext

 

Вложения

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Sami Imseih
Дата:
>> It will be good to add more commentary here to the next person adding
>> a test, and mention
>> that only stable values should be included in this view. what do you think?
>
>
> I added a comment, but I don't think a test writer would be confused for long if they did break that rule.

Thanks! it looks like you added the additional comment to 0002 rather than 0001
which introduced the view. I fixed that in the attached.

>> +       (references <link
>> linkend="catalog-pg-class"><structname>pg_statistic_ext</structname></link>.<structfield>stxrelid</structfield>)
>
>
> Fixed catalog-pg-attribute and catalog-pg-statistic-ext

A few references were missed in 0002.

+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>expr_attnum</structfield> <type>int2</type>
+      </para>
+      <para>
+       Synthetic attnum used to reference this expression in
+       <link
linkend="catalog-pg-class"><structname>pg_statistic_ext</structname></link>.<structfield>stxdndistinct</structfield>
+       and
+       <link
linkend="catalog-pg-class"><structname>pg_statistic_ext</structname></link>.<structfield>stxddependencies</structfield>
+      </para></entry>
+     </row>

Also fixed in v7


--
Sami Imseih
Amazon Web Services (AWS)

Вложения

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Nathan Bossart
Дата:
I committed 0001, albeit with some editorialization.  The others might need
some rebasing, which I can help with next week if someone else doesn't beat
me to it.

-- 
nathan



Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Corey Huinker
Дата:


On Fri, Mar 13, 2026 at 4:07 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
I committed 0001, albeit with some editorialization.  The others might need
some rebasing, which I can help with next week if someone else doesn't beat
me to it.


"_stable", nice name choice. Rebasing was nonzero but just barely.
Вложения

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Nathan Bossart
Дата:
On Sat, Mar 14, 2026 at 05:13:19PM -0400, Corey Huinker wrote:
> "_stable", nice name choice. Rebasing was nonzero but just barely.

Thanks.  Here is what I have staged for commit for the next patch in the
series.

>  CREATE VIEW pg_stats WITH (security_barrier) AS
>      SELECT
> -        nspname AS schemaname,
> -        relname AS tablename,
> -        attname AS attname,
> +        n.nspname AS schemaname,
> +        c.relname AS tablename,
> +        a.attrelid AS tableid,
> +        a.attname AS attname,
> +        a.attnum AS attnum,

I didn't see why we needed to change the lines for the existing columns, so
I left those parts out.

>  CREATE VIEW pg_stats_ext WITH (security_barrier) AS
>      SELECT cn.nspname AS schemaname,
>             c.relname AS tablename,
> +           s.stxrelid AS tableid,
>             sn.nspname AS statistics_schemaname,
>             s.stxname AS statistics_name,
> +           s.oid AS statid,

I went with "statistics_id" to match the naming scheme.

>  CREATE VIEW pg_stats_ext_exprs WITH (security_barrier) AS
>      SELECT cn.nspname AS schemaname,
>             c.relname AS tablename,
> +           s.stxrelid AS tableid,
>             sn.nspname AS statistics_schemaname,
>             s.stxname AS statistics_name,
> +           s.oid AS statid,
>             pg_get_userbyid(s.stxowner) AS statistics_owner,
> -           stat.expr,
> +           expr.expr,
> +           0 - expr.ordinality AS expr_attnum,

I left the expr_attnum stuff out.  It seems to make this patch quite large
and complicated, we don't plan to use it for the pg_dump patch, and I'm not
sure about showing users a "synthetic attnum" that seems to have no other
point of reference.  Would this information be useful in pg_dump somewhere?
I'm curious to hear more about the intent.

>  CREATE VIEW stats_import.pg_stats_stable AS
> -  SELECT schemaname, tablename, attname, inherited, null_frac, avg_width,
> +  SELECT schemaname, tablename, attname, attnum, inherited, null_frac, avg_width,

I didn't see much value in adding attnum here given the size of the changes
to the expected output it produces.

> +       <structfield>tableid</structfield> <type>oid</type>
> +       (references <link
linkend="catalog-pg-attribute"><structname>pg_attribute</structname></link>.<structfield>attrelid</structfield>)

While we might be pulling the OID from pg_attribute in the view, we seem to
point to the true origin for these reference notes elsewhere, so I changed
it to pg_class.oid here.

> +       <structfield>tableid</structfield> <type>oid</type>
> +       (references <link
linkend="catalog-pg-statistic-ext"><structname>pg_statistic_ext</structname></link>.<structfield>stxrelid</structfield>)

... and here.

> +       <structfield>tableid</structfield> <type>oid</type>
> +       (references <link
linkend="catalog-pg-statistic-ext"><structname>pg_statistic_ext</structname></link>.<structfield>stxrelid</structfield>)

... and here.

-- 
nathan

Вложения

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Corey Huinker
Дата:

>  CREATE VIEW pg_stats_ext_exprs WITH (security_barrier) AS
>      SELECT cn.nspname AS schemaname,
>             c.relname AS tablename,
> +           s.stxrelid AS tableid,
>             sn.nspname AS statistics_schemaname,
>             s.stxname AS statistics_name,
> +           s.oid AS statid,
>             pg_get_userbyid(s.stxowner) AS statistics_owner,
> -           stat.expr,
> +           expr.expr,
> +           0 - expr.ordinality AS expr_attnum,

I left the expr_attnum stuff out.  It seems to make this patch quite large
and complicated, we don't plan to use it for the pg_dump patch, and I'm not
sure about showing users a "synthetic attnum" that seems to have no other
point of reference.  Would this information be useful in pg_dump somewhere?
I'm curious to hear more about the intent.

expr_attnum was something that Michael Paquier had lamented that the view didn't have. There is obviously no present need for it, as pg_dump isn't being modified for extended stats at all.
 
I didn't see much value in adding attnum here given the size of the changes
to the expected output it produces.

Same reasons for putting that in - people had lamented that we couldn't order the dump by attnum, and ordering by attname feels weird somehow. Again, we don't presently need it.

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Nathan Bossart
Дата:
On Mon, Mar 16, 2026 at 04:04:23PM -0400, Corey Huinker wrote:
>> I left the expr_attnum stuff out.  It seems to make this patch quite large
>> and complicated, we don't plan to use it for the pg_dump patch, and I'm not
>> sure about showing users a "synthetic attnum" that seems to have no other
>> point of reference.  Would this information be useful in pg_dump somewhere?
>> I'm curious to hear more about the intent.
> 
> expr_attnum was something that Michael Paquier had lamented that the view
> didn't have. There is obviously no present need for it, as pg_dump isn't
> being modified for extended stats at all.

Okay.  I think I'll continue to leave this one out for now.

>> I didn't see much value in adding attnum here given the size of the changes
>> to the expected output it produces.
> 
> Same reasons for putting that in - people had lamented that we couldn't
> order the dump by attnum, and ordering by attname feels weird somehow.
> Again, we don't presently need it.

This note was about adding attnum to the pg_stats_stable view in the test.
I don't have any problem with adding it to pg_stats.

-- 
nathan



Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Michael Paquier
Дата:
On Mon, Mar 16, 2026 at 03:15:14PM -0500, Nathan Bossart wrote:
> On Mon, Mar 16, 2026 at 04:04:23PM -0400, Corey Huinker wrote:
>> expr_attnum was something that Michael Paquier had lamented that the view
>> didn't have. There is obviously no present need for it, as pg_dump isn't
>> being modified for extended stats at all.
>
> Okay.  I think I'll continue to leave this one out for now.

Lamenting is the right term.  The expressions stored in an extended
stats object have attnumbers computed by the backend, starting from -1
and decremented, and we don't expose this information at all in any
system view.  It could have helped in enforcing a stronger ordering of
the items dumps for the extstats restore functions.  I still think
that it could provide an extra layer of safety.  Now, we don't
critically require it either based on how we pass down the input
arrays, how we dump the data from the catalogs, and how we treat the
order of the items given as function args.
--
Michael

Вложения

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Nathan Bossart
Дата:
On Tue, Mar 17, 2026 at 06:49:17AM +0900, Michael Paquier wrote:
> On Mon, Mar 16, 2026 at 03:15:14PM -0500, Nathan Bossart wrote:
>> On Mon, Mar 16, 2026 at 04:04:23PM -0400, Corey Huinker wrote:
>>> expr_attnum was something that Michael Paquier had lamented that the view
>>> didn't have. There is obviously no present need for it, as pg_dump isn't
>>> being modified for extended stats at all.
>> 
>> Okay.  I think I'll continue to leave this one out for now.
> 
> Lamenting is the right term.  The expressions stored in an extended
> stats object have attnumbers computed by the backend, starting from -1
> and decremented, and we don't expose this information at all in any
> system view.  It could have helped in enforcing a stronger ordering of
> the items dumps for the extstats restore functions.  I still think
> that it could provide an extra layer of safety.  Now, we don't
> critically require it either based on how we pass down the input
> arrays, how we dump the data from the catalogs, and how we treat the
> order of the items given as function args.

FTR I'm not mortally opposed to the idea.  I just want to get the easier
stuff out of the way first so we can commit the pg_dump change.  Then we
can give expr_attnum our undivided attention.

-- 
nathan



Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Michael Paquier
Дата:
On Mon, Mar 16, 2026 at 05:52:38PM -0500, Nathan Bossart wrote:
> FTR I'm not mortally opposed to the idea.  I just want to get the easier
> stuff out of the way first so we can commit the pg_dump change.  Then we
> can give expr_attnum our undivided attention.

Makes sense here to keep it simple.  I don't want to change the
extstat restore code more than necessary at this stage of the CF
calendar cycle, except for bugs and issues discussed during the beta
cycle.  Perhaps we could do something in this area for v20.  Or
perhaps not.
--
Michael

Вложения

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Corey Huinker
Дата:
Makes sense here to keep it simple.  I don't want to change the
extstat restore code more than necessary at this stage of the CF
calendar cycle, except for bugs and issues discussed during the beta
cycle.

+1. I've also not advocated any changes to the extstat dump/restore code, and I doubt I would until we start needing to fetch them in batches.

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Nathan Bossart
Дата:
Committed the next patch in the series.  I'll have a rebased version of the
last one ready to share soon.

-- 
nathan



Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Nathan Bossart
Дата:
On Tue, Mar 17, 2026 at 09:26:57AM -0500, Nathan Bossart wrote:
> Committed the next patch in the series.  I'll have a rebased version of the
> last one ready to share soon.

As promised...

-- 
nathan

Вложения

Re: Add starelid, attnum to pg_stats and leverage this in pg_dump

От
Nathan Bossart
Дата:
On Tue, Mar 17, 2026 at 10:01:42AM -0500, Nathan Bossart wrote:
> On Tue, Mar 17, 2026 at 09:26:57AM -0500, Nathan Bossart wrote:
>> Committed the next patch in the series.  I'll have a rebased version of the
>> last one ready to share soon.
> 
> As promised...

Committed.

-- 
nathan