pgsql: Prevent O(N^2) unique index insertion edge case.

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема pgsql: Prevent O(N^2) unique index insertion edge case.
Дата
Msg-id E1hIzJC-0007Gi-KA@gemulon.postgresql.org
обсуждение исходный текст
Список pgsql-committers
Prevent O(N^2) unique index insertion edge case.

Commit dd299df8 made nbtree treat heap TID as a tiebreaker column,
establishing the principle that there is only one correct location (page
and page offset number) for every index tuple, no matter what.
Insertions of tuples into non-unique indexes proceed as if heap TID
(scan key's scantid) is just another user-attribute value, but
insertions into unique indexes are more delicate.  The TID value in
scantid must initially be omitted to ensure that the unique index
insertion visits every leaf page that duplicates could be on.  The
scantid is set once again after unique checking finishes successfully,
which can force _bt_findinsertloc() to step right one or more times, to
locate the leaf page that the new tuple must be inserted on.

Stepping right within _bt_findinsertloc() was assumed to occur no more
frequently than stepping right within _bt_check_unique(), but there was
one important case where that assumption was incorrect: inserting a
"duplicate" with NULL values.  Since _bt_check_unique() didn't do any
real work in this case, it wasn't appropriate for _bt_findinsertloc() to
behave as if it was finishing off a conventional unique insertion, where
any existing physical duplicate must be dead or recently dead.
_bt_findinsertloc() might have to grovel through a substantial portion
of all of the leaf pages in the index to insert a single tuple, even
when there were no dead tuples.

To fix, treat insertions of tuples with NULLs into a unique index as if
they were insertions into a non-unique index: never unset scantid before
calling _bt_search() to descend the tree, and bypass _bt_check_unique()
entirely.  _bt_check_unique() is no longer responsible for incoming
tuples with NULL values.

Discussion: https://postgr.es/m/CAH2-Wzm08nr+JPx4jMOa9CGqxWYDQ-_D4wtPBiKghXAUiUy-nQ@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/9b10926263d831fac5758f1493c929a49b55669b

Modified Files
--------------
src/backend/access/nbtree/nbtinsert.c | 117 ++++++++++++----------------------
src/backend/access/nbtree/nbtsearch.c |   1 +
src/backend/access/nbtree/nbtutils.c  |   4 ++
src/include/access/nbtree.h           |  10 +++
4 files changed, 56 insertions(+), 76 deletions(-)


В списке pgsql-committers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: pgsql: Avoid order-of-execution problems with ALTER TABLE ADD PRIMARYK
Следующее
От: Tom Lane
Дата:
Сообщение: pgsql: Remove useless comment.