pgsql: Use abbreviated keys for faster sorting of text datums.

From: Robert Haas
Subject: pgsql: Use abbreviated keys for faster sorting of text datums.
Msg-id: E1YDIw9-0001Ol-3r@gemulon.postgresql.org
List: pgsql-committers
Use abbreviated keys for faster sorting of text datums.

This commit extends the SortSupport infrastructure to give operator
classes the option of providing abbreviated representations of Datums;
in the case of text, we abbreviate by taking the first few bytes
of the strxfrm() blob.  If comparing the abbreviated keys is insufficient
to resolve the comparison, we fall back on the normal comparator.
This can be much faster than the old way of doing sorting if the
first few bytes of the string are usually sufficient to resolve the
comparison.
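
The idea above can be illustrated with a minimal sketch. It is not the code
from this commit: it assumes plain byte-wise ordering (as in the "C" locale)
instead of a strxfrm() blob, and the names abbrev_key and cmp_with_abbrev are
hypothetical. The first 8 bytes of each string are packed into an integer,
most significant byte first, so that comparing two integers gives the same
answer as comparing those bytes; the full comparator runs only on a tie.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: pack up to the first 8 bytes of s into a 64-bit
 * integer, most significant byte first, padding short strings with zeros.
 * Assumes byte-wise ordering; the commit abbreviates a strxfrm() blob.
 */
static uint64_t
abbrev_key(const char *s)
{
    uint64_t    key = 0;
    size_t      len = strlen(s);

    for (size_t i = 0; i < sizeof(key); i++)
    {
        unsigned char c = (i < len) ? (unsigned char) s[i] : 0;

        key = (key << 8) | c;
    }
    return key;
}

/*
 * Compare two strings using precomputed abbreviated keys.  The cheap
 * integer comparison decides whenever the keys differ; otherwise we fall
 * back on the full comparator (strcmp here, standing in for the normal one).
 */
static int
cmp_with_abbrev(const char *a, uint64_t ka, const char *b, uint64_t kb)
{
    if (ka != kb)
        return (ka < kb) ? -1 : 1;
    return strcmp(a, b);
}
```

In a sort, the keys would be computed once per string and stored next to it,
so most comparisons touch only two integers and never dereference the strings.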

There is the potential for a performance regression if all of the
strings to be sorted are identical for the first 8+ characters and
differ only in later positions; therefore, the SortSupport machinery
now provides infrastructure to abort the use of abbreviation if
it appears that abbreviation is producing comparatively few distinct
keys.  HyperLogLog, a streaming cardinality estimator, is included in
this commit and used to make that determination for text.
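
The abort decision can be sketched as follows. This is an illustration under
stated assumptions, not the logic from this commit: it counts distinct
abbreviated keys exactly by sorting a sample, where the commit uses a
HyperLogLog estimate, and the function names and the min_ratio threshold are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator for 64-bit abbreviated keys. */
static int
cmp_u64(const void *pa, const void *pb)
{
    uint64_t    a = *(const uint64_t *) pa;
    uint64_t    b = *(const uint64_t *) pb;

    return (a > b) - (a < b);
}

/*
 * Exact distinct count of the sampled keys.  Sorts the array in place.
 * This stands in for a HyperLogLog estimate, which needs far less memory
 * and a single pass over the input.
 */
static size_t
count_distinct(uint64_t *keys, size_t n)
{
    size_t      distinct;

    if (n == 0)
        return 0;
    qsort(keys, n, sizeof(keys[0]), cmp_u64);
    distinct = 1;
    for (size_t i = 1; i < n; i++)
    {
        if (keys[i] != keys[i - 1])
            distinct++;
    }
    return distinct;
}

/*
 * Hypothetical rule: give up on abbreviation when fewer than min_ratio of
 * the sampled abbreviated keys are distinct, since most comparisons would
 * then tie and fall through to the full comparator anyway.
 */
static bool
should_abort_abbreviation(uint64_t *sample_keys, size_t n, double min_ratio)
{
    if (n == 0)
        return false;
    return (double) count_distinct(sample_keys, n) < min_ratio * (double) n;
}
```

An exact count keeps the sketch short and easy to check; a streaming estimator
is the better fit inside a sort because it avoids storing and sorting the
sample.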

Peter Geoghegan, reviewed by me.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/4ea51cdfe85ceef8afabceb03c446574daa0ac23

Modified Files
--------------
src/backend/access/nbtree/nbtsort.c    |    2 +
src/backend/commands/analyze.c         |    6 +
src/backend/executor/nodeAgg.c         |    4 +
src/backend/executor/nodeMergeAppend.c |    9 +
src/backend/executor/nodeMergejoin.c   |    8 +
src/backend/lib/Makefile               |    2 +-
src/backend/lib/hyperloglog.c          |  228 ++++++++++++++++++
src/backend/utils/adt/orderedsetaggs.c |    8 +-
src/backend/utils/adt/varlena.c        |  330 +++++++++++++++++++++++--
src/backend/utils/sort/tuplesort.c     |  413 ++++++++++++++++++++++++++++----
src/include/lib/hyperloglog.h          |   67 ++++++
src/include/pg_config_manual.h         |   14 +-
src/include/utils/sortsupport.h        |  134 ++++++++++-
13 files changed, 1149 insertions(+), 76 deletions(-)

