Re: Improve docs for n_distinct_inherited
From | David Rowley |
---|---|
Subject | Re: Improve docs for n_distinct_inherited |
Date | |
Msg-id | CAApHDvp2gSNOtzKQg8jH=j8A6jMFrE=xPr-+3z_yPd8ykL2rXQ@mail.gmail.com |
In response to | Re: Improve docs for n_distinct_inherited ("David G. Johnston" <david.g.johnston@gmail.com>) |
Responses |
Re: Improve docs for n_distinct_inherited
Re: Improve docs for n_distinct_inherited |
List | pgsql-hackers |
Just picking this one up again. I forgot to come back to this after PGConf.dev.

On Fri, 9 May 2025 at 02:50, David G. Johnston <david.g.johnston@gmail.com> wrote:
> I was missing this key piece of knowledge which invalidated my entire attempt.
>
> Here's an attempt at shortening this now that I understand the mechanics better.
>
> Separate options exist because an inheritance parent table has two
> different sets of statistics: one considering only itself and one which
> also includes its children (<literal>n_distinct_inherited</literal>).
> Partitioned tables, which only have rows in the children, likewise uses
> the inherited option while everyone else uses <literal>n_distinct</literal>.

I wasn't quite happy with that, as the text indicates that n_distinct_inherited is the statistics. But it's not; it's just the option that allows some modification of the gathered statistics.

I came up with:

Ordinarily <literal>n_distinct</literal> is used. <literal>n_distinct_inherited</literal> exists to allow the distinct estimate to be overwritten for the statistics gathered for inheritance parent tables and for partitioned tables.

I also fixed what I thought was some misleading text about ANALYZE using this value to calculate things. That's not true. It's the query planner that uses this value. ANALYZE just stores whatever this is set to into pg_statistic.

I also adjusted the text that was talking about "the size of the table", which, as I mentioned earlier, isn't correct. It's all related to the estimated number of rows in the table, per "ntuples = vardata->rel->tuples;" in get_variable_numdistinct().

Also fixed a typo; "twice on the average" shouldn't contain "the".

I wonder if ", since the multiplication by the number of rows in the table is not performed until query planning time" should be deleted, since I modified the text earlier to talk about "the query planner".

David
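
For readers following the point about the planner, not ANALYZE, doing the multiplication, here is a rough, simplified sketch of how a stored distinct value might be turned into an estimate at planning time. It is not the actual get_variable_numdistinct() code: the names sketch_numdistinct, sketch_clamp and SKETCH_DEFAULT_NUM_DISTINCT are made up for illustration, the clamp omits rounding, and the fallback of 200 is an assumption of this sketch.

```c
#include <assert.h>

/* Hypothetical fallback used when no usable estimate exists (assumed value). */
#define SKETCH_DEFAULT_NUM_DISTINCT 200.0

/* Simplified clamp: never report fewer than one distinct value. */
static double
sketch_clamp(double nrows)
{
    return (nrows < 1.0) ? 1.0 : nrows;
}

/*
 * stadistinct: the stored value (what ANALYZE wrote, or the per-column
 *              n_distinct / n_distinct_inherited override).
 * ntuples:     the planner's estimated number of rows in the relation,
 *              i.e. the role played by vardata->rel->tuples.
 */
static double
sketch_numdistinct(double stadistinct, double ntuples)
{
    /* Negative values are fractions of the row count, so -1 is the floor. */
    assert(stadistinct >= -1.0);

    /* Positive: an absolute number of distinct values, used as-is. */
    if (stadistinct > 0.0)
        return sketch_clamp(stadistinct);

    /* Without a row estimate there is nothing to multiply by. */
    if (ntuples <= 0.0)
        return SKETCH_DEFAULT_NUM_DISTINCT;

    /* Negative: multiplier applied to the estimated row count. */
    if (stadistinct < 0.0)
        return sketch_clamp(-stadistinct * ntuples);

    /* Zero: unknown, so fall back to a small default. */
    return (ntuples < SKETCH_DEFAULT_NUM_DISTINCT)
        ? ntuples
        : SKETCH_DEFAULT_NUM_DISTINCT;
}
```

Under these assumptions, a stored value of -0.5 with an estimated 1000 rows gives 500 distinct values, and the same stored value gives a different answer as soon as the row estimate changes, which is why the text should talk about the estimated number of rows and not the size of the table.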