pgsql: Make pg_statistic and related code account more honestly forcol

From: Tom Lane
Subject: pgsql: Make pg_statistic and related code account more honestly forcol
Msg-id: E1gXre2-0008Mu-NO@gemulon.postgresql.org
List: pgsql-committers
Make pg_statistic and related code account more honestly for collations.

When we first put in collations support, we basically punted on teaching
pg_statistic, ANALYZE, and the planner selectivity functions about that.
They've just used DEFAULT_COLLATION_OID independently of the actual
collation of the data.  It's time to improve that, so:

* Add columns to pg_statistic that record the specific collation associated
with each statistics slot.

* Teach ANALYZE to use the column's actual collation when comparing values
for statistical purposes, and record this in the appropriate slot.  (Note
that type-specific typanalyze functions are now expected to fill
stats->stacoll with the appropriate collation, too.)

* Teach assorted selectivity functions to use the actual collation of
the stats they are looking at, instead of just assuming it's
DEFAULT_COLLATION_OID.

This should give noticeably better results in selectivity estimates for
columns with nondefault collations, at least for query clauses that use
that same collation (which would be the default behavior in most cases).
It's still true that comparisons with explicit COLLATE clauses different
from the stored data's collation won't be well-estimated, but that's no
worse than before.  Also, this patch does make the first step towards
doing better with that, which is that it's now theoretically possible to
collect stats for a collation other than the column's own collation.
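
To illustrate why the recorded collation matters, here is a toy model in
Python, not the PostgreSQL C code. Every name in it (StatsSlot,
COLLATIONS, estimate_lt, and so on) is hypothetical, case folding merely
stands in for a nondefault collation, and the estimator is much coarser
than the real one (it counts whole histogram buckets and does not
interpolate). The point is only that a histogram is meaningful solely
under the ordering it was built with, so the slot carries that ordering
along and the estimator consults it.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical "collations": each is just a sort-key function.
# "C" compares raw bytes; "ci" is a case-insensitive stand-in for a
# nondefault collation. Neither is a real PostgreSQL collation object.
COLLATIONS: Dict[str, Callable[[str], object]] = {
    "C": lambda s: s.encode("utf-8"),
    "ci": lambda s: s.casefold(),
}


@dataclass
class StatsSlot:
    """Toy analogue of a statistics slot that remembers its collation."""
    collation: str      # which ordering the bounds were computed under
    bounds: List[str]   # equi-depth histogram boundaries


def analyze(values: List[str], collation: str, buckets: int) -> StatsSlot:
    """Build histogram bounds using the given collation and record it."""
    ordered = sorted(values, key=COLLATIONS[collation])
    n = len(ordered)
    bounds = [ordered[(i * (n - 1)) // buckets] for i in range(buckets + 1)]
    return StatsSlot(collation=collation, bounds=bounds)


def estimate_lt(slot: StatsSlot, const: str) -> float:
    """Estimate the fraction of rows with value < const.

    Uses the collation stored in the slot, and counts only the buckets
    that lie entirely below const (no interpolation within a bucket).
    """
    key = COLLATIONS[slot.collation]
    keys = [key(b) for b in slot.bounds]
    below = bisect_left(keys, key(const))   # number of bounds < const
    buckets = len(slot.bounds) - 1
    return min(max(below - 1, 0), buckets) / buckets


def true_fraction(values: List[str], collation: str, const: str) -> float:
    """Exact fraction of values that sort before const under a collation."""
    key = COLLATIONS[collation]
    return sum(1 for v in values if key(v) < key(const)) / len(values)


VALUES = ["apple", "Banana", "cherry", "Date",
          "elder", "Fig", "grape", "Honey"]

# Stats gathered under the column's own (here: case-insensitive) ordering.
slot_ci = analyze(VALUES, "ci", buckets=4)
# Stats gathered under a bytewise ordering regardless of the column.
slot_c = analyze(VALUES, "C", buckets=4)
```

In this toy data set, a case-insensitive "col < 'c'" really matches the
fraction given by true_fraction(VALUES, "ci", "c"). Comparing that with
estimate_lt(slot_ci, "c") and estimate_lt(slot_c, "c") shows how stats
built under the wrong ordering can mislead the estimate, because the
bytewise ordering places all capitalized values ahead of 'c'.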

Patch by me; thanks to Peter Eisentraut for review.

Discussion: https://postgr.es/m/14706.1544630227@sss.pgh.pa.us

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/5e09280057a4c3f5db297348ea3e044c9c5f4ef8

Modified Files
--------------
doc/src/sgml/catalogs.sgml                       | 12 +++++
src/backend/commands/analyze.c                   | 26 ++++++++--
src/backend/statistics/dependencies.c            |  5 +-
src/backend/statistics/extended_stats.c          |  8 +--
src/backend/statistics/mvdistinct.c              |  5 +-
src/backend/tsearch/ts_typanalyze.c              |  2 +
src/backend/utils/adt/array_selfuncs.c           | 59 +++++++++++-----------
src/backend/utils/adt/array_typanalyze.c         | 17 +++++--
src/backend/utils/adt/rangetypes_typanalyze.c    |  1 +
src/backend/utils/adt/selfuncs.c                 | 64 ++++++++++++++----------
src/backend/utils/cache/lsyscache.c              | 19 +++++++
src/backend/utils/cache/typcache.c               |  1 +
src/include/catalog/catversion.h                 |  2 +-
src/include/catalog/pg_statistic.h               | 43 ++++++++++------
src/include/commands/vacuum.h                    | 11 ++--
src/include/statistics/extended_stats_internal.h |  2 +-
src/include/utils/lsyscache.h                    |  1 +
src/include/utils/typcache.h                     |  1 +
18 files changed, 189 insertions(+), 90 deletions(-)

