Re: list of extended statistics on psql

Поиск
Список
Период
Сортировка
От Tatsuro Yamada
Тема Re: list of extended statistics on psql
Дата
Msg-id 2a58a2c3-3897-afe9-7c5e-3b2e91cdb021@nttcom.co.jp_1
обсуждение исходный текст
Ответ на Re: list of extended statistics on psql  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-hackers
Hi Alvaro,

>>> IMO the per-type columns should show both the type being enabled as
>>> well as it being built.
>>
>> Hmm. I'm not sure how to get the status (enabled or disabled) of
>> extended stats. :(
>> Could you explain it more?
>
> pg_statistic_ext_data.stxdndistinct is not null if the stats have been
> built. (I'm not sure whether there's an easier way to determine this.)


Ah.. I see! Thank you.


> I suggest to do this
>
>    Name    | Schema | Definition               | Ndistinct | Dependencies | MCV
> -----------+--------+--------------------------+-----------+--------------+-----
>  stts_1    | public | (a, b) FROM t1           | f         | t            | f
>
>> I suppose that the current column order is sufficient if there is
>> no improvement of extended stats on PG14. Do you know any plan to
>> improve extended stats such as to allow it to cross multiple tables on PG14?
>
> I suggest that changing it in the future is going to be an uphill
> battle, so better get it right from the get go, without requiring a
> future restructure.


I understand your suggestions. I'll replace "Columns" and "Table" columns with "Definition" column.


>> Currently, I use this query to get Extended stats info from pg_statistic_ext.
>
> Maybe something like this would do
>
> SELECT
> stxnamespace::pg_catalog.regnamespace AS "Schema",
> stxname AS "Name",
> format('%s FROM %s',
>  (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ')
>   FROM pg_catalog.unnest(stxkeys) s(attnum)
>   JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND
>   a.attnum = s.attnum AND NOT attisdropped)),
>   stxrelid::regclass) AS "Definition",
>   CASE WHEN stxdndistinct IS NOT NULL THEN 'built' WHEN 'd' = any(stxkind) THEN 'enabled, not built' END AS
"n-distinct",
>   CASE WHEN stxddependencies IS NOT NULL THEN 'built' WHEN 'f' = any(stxkind) THEN 'enabled, not built' END AS
"functionaldependencies",
 
>   CASE WHEN stxdmcv IS NOT NULL THEN 'built' WHEN 'm' = any(stxkind) THEN 'enabled, not built' END AS mcv
>  FROM pg_catalog.pg_statistic_ext es
>  INNER JOIN pg_catalog.pg_class c
>  ON stxrelid = c.oid
>  LEFT JOIN pg_catalog.pg_statistic_ext_data esd ON es.oid = esd.stxoid
>  ORDER BY 1, 2, 3;

Great! It helped me a lot to understand your suggestions correctly. Thanks. :-D
I got the below results by your query.

========
create table t1 (a int, b int);
create statistics stts_1 (dependencies) on a, b from t1;
create statistics stts_2 (dependencies, ndistinct) on a, b from t1;
create statistics stts_3 (dependencies, ndistinct, mcv) on a, b from t1;
create table t2 (a int, b int, c int);
create statistics stts_4 on b, c from t2;
create table hoge (col1 int, col2 int, col3 int);
create statistics stts_hoge on col1, col2, col3 from hoge;

insert into t1 select i,i from generate_series(1,100) i;
analyze t1;


Your query gave this result:

  Schema |   Name    |         Definition         |     n-distinct     | functional dependencies |        mcv
--------+-----------+----------------------------+--------------------+-------------------------+--------------------
  public | stts_1    | a, b FROM t1               |                    | built                   |
  public | stts_2    | a, b FROM t1               | built              | built                   |
  public | stts_3    | a, b FROM t1               | built              | built                   | built
  public | stts_4    | b, c FROM t2               | enabled, not built | enabled, not built      | enabled, not built
  public | stts_hoge | col1, col2, col3 FROM hoge | enabled, not built | enabled, not built      | enabled, not built
(5 rows)
========

I guess "enabled, not built" is a little redundant. The status would better to
have three patterns: "built", "not built" or nothing (NULL) like these:

   - "built":  extended stats is defined and built (collected by analyze cmd)
   - "not built": extended stats is defined but have not built yet
   - nothing (NULL): extended stats is not defined

What do you think about it?


I will send a new patch including :

   - Replace "Columns" and "Table" column with "Definition"
   - Show the status (built/not built/null) of extended stats by using
     pg_statistic_ext_data

Thanks,
Tatsuro Yamada






В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: New default role- 'pg_read_all_data'
Следующее
От: Andy Fan
Дата:
Сообщение: Re: Get rid of runtime handling of AlternativeSubPlan?